COMICS: End-to-End Bi-Grained Contrastive Learning for Multi-Face Forgery Detection

Cited by: 3

Authors:

Zhang, Cong [1]

Qi, Honggang [1]

Wang, Shuhui [2]

Li, Yuezun [3]

Lyu, Siwei [4]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China

[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China

[3] Ocean Univ China, Coll Comp Sci & Technol, Qingdao 266005, Peoples R China

[4] Univ Buffalo State Univ New York Buffalo, Dept Comp Sci & Engn, Amherst, NY 14068 USA

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024, Vol. 34, No. 10

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China

Keywords:

Face recognition; Forgery; Feature extraction; Proposals; Object detection; Faces; Generators; DeepFake; multi-face forgery detection; contrastive learning; fine-grained feature learning;

DOI:

10.1109/TCSVT.2024.3405563

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808 ; 0809 ;

Abstract:

DeepFakes have raised serious societal concerns, leading to a surge in detection-based forensics methods in recent years. Face forgery recognition is a standard detection method that usually follows a two-phase pipeline: it first extracts the face and then determines its authenticity by classification. While these methods perform well in ideal experimental environments, they face challenges when dealing with DeepFakes in the wild, which involve complex backgrounds and multiple faces of varying sizes. Moreover, most face forgery recognition methods can only process one face at a time. One straightforward way to address this issue is to process multiple faces simultaneously by integrating face extraction and forgery detection in an end-to-end fashion, adapting advanced object detection architectures. However, as these object detection architectures are designed to capture the discriminative features of different object categories rather than the subtle forgery traces among faces, direct adaptation suffers from limited representation ability. In this paper, we propose Contrastive Multi-FaceForensics (COMICS), an end-to-end framework for multi-face forgery detection. COMICS integrates face extraction and forgery detection in a seamless manner and adapts to advanced object detection architectures. The core of the proposed framework is a bi-grained contrastive learning approach that explores face forgery traces at both the coarse- and fine-grained levels. Specifically, coarse-grained contrastive learning captures the discriminative features among positive and negative proposal pairs at multiple layers produced by the proposal generator, while fine-grained contrastive learning captures the pixel-wise discrepancy between the forged and original areas of the same face and the pixel-wise content inconsistency among different faces. Extensive experiments on the OpenForensics and FFIW datasets demonstrate that our method outperforms its counterparts and shows great potential for integration into various architectures. Code is available at https://github.com/zhangconghhh/COMICS.
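
As an illustration only, the coarse-grained idea described in the abstract (pulling together proposal features that share an authenticity label and pushing real and forged ones apart) can be sketched as a generic supervised contrastive loss. This is a minimal sketch, not the authors' implementation: the function name `supervised_contrastive_loss`, the temperature value, and the toy 2-D embeddings are all assumptions made for illustration, and the repository linked in the abstract remains the authoritative reference.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, not from the paper).

    Proposals that share a label (e.g. 0 = real, 1 = forged) are treated as
    positives for each other; every other proposal acts as a negative.
    """
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without any positive contributes nothing
        # Temperature-scaled cosine similarity to every other proposal.
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy proposal embeddings (assumed values): two "real" and two "forged".
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_separated = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is small when the labels agree with the clusters in embedding space and large when they do not, which is the behaviour a contrastive objective of this kind rewards during training.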


Pages: 10223-10236

Page count: 14
