UVaT: Uncertainty Incorporated View-Aware Transformer for Robust Multi-View Classification

Cited by: 0
Authors
Li, Yapeng [1]
Luo, Yong [1]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Inst Artificial Intelligence, Hubei Key Lab Multimedia & Network Commun Engn, Sch Comp Sci,Natl Engn Res Ctr Multimedia Software, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Noise measurement; Uncertainty; Transformers; Training; Robustness; Noise; Data models; Multi-view classification; incomplete view; noisy view; transformer; uncertainty; INCOMPLETE MULTIVIEW; ILLUMINATION;
DOI
10.1109/TIP.2024.3451931
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing multi-view classification algorithms usually assume that every example is observed in all views and that the data in each view are clean. In real-world applications, however, some views are often missing or corrupted by noise (i.e., missing or noisy views). This can degrade performance significantly, and many algorithms have been proposed to address the incomplete-view or noisy-view problem. Most existing algorithms, however, handle the two problems separately and may therefore fail when missing and noisy views occur together. They are also usually inflexible, in that the significance of a view or feature cannot be identified adaptively. In addition, view-missing patterns may differ between the training and test phases, and this difference is often ignored. To remedy these drawbacks, we propose a novel multi-view classification framework that is robust to incomplete and noisy views simultaneously. This is achieved by integrating early fusion and late fusion in a single framework. Specifically, in the early fusion module, we propose a view-aware transformer that masks the missing views and adaptively explores the relationships between views and target tasks. Because view-missing patterns may change from the training phase to the test phase, we also design single-view classification and category-consistency constraints to reduce the model's dependence on view-missing patterns. In the late fusion module, we quantify the uncertainty of each view in an ensemble manner to estimate that view's noise level. The uncertainty and prediction logits of the different views are then integrated to make the model robust to noisy views. The framework is trained end to end. Experimental results on diverse datasets demonstrate the robustness and effectiveness of our model for both incomplete and noisy views. Code is available at https://github.com/li-yapeng/UVaT.
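The late-fusion idea in the abstract (estimate a per-view uncertainty from an ensemble, then combine it with the per-view logits while skipping missing views) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of predictive entropy as the uncertainty measure, and the `exp(-uncertainty)` weighting are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_uncertainty(member_logits):
    """Entropy of the mean ensemble prediction for one view.

    ASSUMPTION: predictive entropy stands in for the paper's
    unspecified ensemble-based uncertainty measure.
    """
    probs = [softmax(logits) for logits in member_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def fuse_views(view_member_logits, present):
    """Uncertainty-weighted average of per-view logits.

    view_member_logits[v] holds the logits of each ensemble member
    for view v (ignored when present[v] is False).
    ASSUMPTION: weights are exp(-uncertainty), normalized to sum to 1.
    """
    weights, view_logits = [], []
    for members, is_present in zip(view_member_logits, present):
        if not is_present:
            continue  # mask out missing views entirely
        n_classes = len(members[0])
        weights.append(math.exp(-ensemble_uncertainty(members)))
        view_logits.append(
            [sum(m[c] for m in members) / len(members) for c in range(n_classes)]
        )
    if not weights:
        raise ValueError("at least one view must be present")
    total = sum(weights)
    n_classes = len(view_logits[0])
    return [
        sum(w * l[c] for w, l in zip(weights, view_logits)) / total
        for c in range(n_classes)
    ]


# Hypothetical example: view 0 is clean, view 1 is noisy (its ensemble
# members disagree), and view 2 is missing.
views = [
    [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]],
    [[0.0, 3.0, 0.0], [0.0, 0.0, 3.0]],
    None,
]
fused = fuse_views(views, present=[True, True, False])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this sketch, disagreement among ensemble members raises the entropy of a view, which lowers its weight, so the clean view dominates the fused logits.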
Pages: 5129-5143
Page count: 15