Causal reasoning in typical computer vision tasks

Cited by: 2
Authors
Zhang, Kexuan [1]
Sun, Qiyu [1]
Zhao, Chaoqiang [2, 3]
Tang, Yang [1]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Adv Control & Optimizat Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China
[3] Luoyang Inst Electro Opt Equipment Avic, Luoyang 471000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
causal reasoning; computer vision tasks; vision-language tasks; semantic segmentation; object detection;
DOI
10.1007/s11431-023-2502-9
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Deep learning has revolutionized the field of artificial intelligence. Based on the statistical correlations uncovered by deep learning-based methods, computer vision tasks, such as autonomous driving and robotics, are growing rapidly. Despite being the basis of deep learning, such correlation strongly depends on the distribution of the original data and is susceptible to uncontrolled factors. Without the guidance of prior knowledge, statistical correlations alone cannot correctly reflect the essential causal relations and may even introduce spurious correlations. As a result, researchers are now trying to enhance deep learning-based methods with causal theory. Causal theory can model the intrinsic causal structure unaffected by data bias and effectively avoids spurious correlations. This paper aims to comprehensively review the existing causal methods in typical vision and vision-language tasks such as semantic segmentation, object detection, and image captioning. The advantages of causality and the approaches for building causal paradigms will be summarized. Future roadmaps are also proposed, including facilitating the development of causal theory and its application in other complex scenarios and systems.
Pages: 105-120
Page count: 16