As the number of Deepfake videos grows rapidly, automatic Deepfake detection has received a lot of attention in the digital forensics community. Deepfake videos are potentially harmful to society, ranging from non-consensual explicit content to forged media used by foreign adversaries in disinformation campaigns.
A light-weight, high-performance Deepfake detection method, called DefakeHop, is proposed in this work. Whereas state-of-the-art Deepfake detection methods are built upon deep neural networks, DefakeHop extracts features automatically from various parts of face images using the successive subspace learning (SSL) principle. DefakeHop consists of three main modules: 1) PixelHop++, 2) feature distillation, and 3) ensemble classification.

To derive a rich feature representation of faces, DefakeHop extracts features with PixelHop++ units from various parts of face images. The theory of PixelHop++ was developed by Kuo et al. using SSL. PixelHop++ has recently been used for feature learning from low-resolution face images; to the best of our knowledge, this is the first time it has been applied to patches extracted from high-resolution color face images.

Since the features extracted by PixelHop++ are still not concise enough for classification, we also propose an effective feature distillation module to further reduce the feature dimension and derive a more concise description of the face. This module applies spatial dimension reduction to remove spatial correlation within a face and a soft classifier to attach semantic meaning to each channel, so the feature dimension is significantly reduced while only the most important information is kept. With a small model size of 42,845 parameters, DefakeHop achieves state-of-the-art performance with an area under the ROC curve (AUC) of 100%, 94.95%, and 90.56% on the UADFV, Celeb-DF v1, and Celeb-DF v2 datasets, respectively.
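To make the three-module flow above more concrete, the sketch below shows, in Python, how per-channel feature distillation followed by ensemble classification could look. It is only a minimal sketch under stated assumptions, not the authors' implementation: the PixelHop++ responses are assumed to be precomputed, and the helper names (`distill_channel`, `defakehop_like`), the use of PCA for spatial dimension reduction, and the gradient-boosting soft and ensemble classifiers are illustrative choices.

```python
# Minimal sketch of the distillation + ensemble flow described above.
# NOT the authors' code: PixelHop++ responses are assumed precomputed,
# and PCA / GradientBoostingClassifier are stand-in choices.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier

def distill_channel(X_channel, y, n_spatial=10):
    """Distill one PixelHop++ channel.

    X_channel: (N, S) responses of one channel over S spatial positions of
    a facial region. Spatial PCA removes spatial correlation; a soft
    classifier turns the reduced vector into a single probability score
    that carries semantic (real vs. fake) meaning for this channel.
    n_spatial is an illustrative hyper-parameter, not the paper's setting.
    """
    pca = PCA(n_components=min(n_spatial, *X_channel.shape))
    reduced = pca.fit_transform(X_channel)
    soft_clf = GradientBoostingClassifier().fit(reduced, y)
    # Probability of the positive (fake) class, assuming binary labels.
    return soft_clf.predict_proba(reduced)[:, 1]        # shape (N,)

def defakehop_like(features, y):
    """features: (N, S, K) PixelHop++-like responses for one facial region
    with K channels. Per-channel soft scores are concatenated and fed to
    the final ensemble classifier.
    """
    N, S, K = features.shape
    scores = np.column_stack(
        [distill_channel(features[:, :, k], y) for k in range(K)]
    )                                                    # shape (N, K)
    return GradientBoostingClassifier().fit(scores, y)
```

In a full pipeline, such soft scores would be computed for each facial region and each sampled frame, concatenated, and aggregated into a video-level decision; the dimension-reduction and classifier choices above are placeholders rather than the method's actual components.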
Future work on DefakeHop will explore active learning, incremental learning, and transfer learning.
— by Hong-Shuo Chen