Dynamic Multimodal Information Bottleneck for Multimodality Classification

Cited by: 3
Authors
Fang, Yingying [1 ]
Wu, Shuang [4 ]
Zhang, Sheng [1 ]
Huang, Chaoyan [5 ]
Zeng, Tieyong [5 ]
Xing, Xiaodan [2 ,3 ]
Walsh, Simon [1 ]
Yang, Guang [1 ,2 ,3 ]
Affiliations
[1] Imperial Coll London, Natl Heart & Lung Inst, London SW7 2AZ, England
[2] Imperial Coll London, Bioengn Dept, London W12 7SL, England
[3] Imperial Coll London, Imperial X, London W12 7SL, England
[4] Fusionopolis, Black Sesame Technol, Singapore, Singapore
[5] Chinese Univ Hong Kong, Shatin, Hong Kong, Peoples R China
Source
2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024 | 2024
Keywords
RISK;
DOI
10.1109/WACV57701.2024.00752
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Effectively leveraging multimodal data such as various images, laboratory tests and clinical information is becoming increasingly attractive in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multimodal techniques focus on enhancing performance by exploiting the distinct or shared features of the various modalities and fusing features across modalities. These approaches are generally not optimal for clinical settings, which pose the additional challenge of limited training data and are often rife with redundant or noisy modality channels, leading to subpar performance. To address this gap, we study the robustness of existing methods to data redundancy and noise, and propose a generalized dynamic multimodal information bottleneck framework for attaining a robust fused feature representation. Specifically, our information bottleneck module filters out task-irrelevant information and noise in the fused feature, and we further introduce a sufficiency loss to prevent the dropping of task-relevant information, thus explicitly preserving the sufficiency of prediction information in the distilled feature. We validate our model on an in-house and a public COVID-19 dataset for mortality prediction, as well as on two public biomedical datasets for diagnostic tasks. Extensive experiments show that our method surpasses the state of the art and is significantly more robust, being the only method to maintain performance when large-scale noisy channels are present. Our code is publicly available at https://github.com/ayanglab/DMIB.
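The abstract describes the general information bottleneck recipe: a compression term that squeezes task-irrelevant information out of the fused representation, balanced against a prediction (sufficiency) term that keeps the representation informative about the label. As a rough illustration of a variational IB-style objective for a Gaussian latent, not the authors' DMIB implementation, the combined loss might look like the following sketch (all function names here are hypothetical):

```python
import math

def kl_gaussian_std_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.

    This is the compression term: it penalizes latent codes that carry
    more information than a standard-normal prior would allow.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class: the sufficiency term."""
    return -math.log(probs[label])

def ib_style_loss(probs, label, mu, log_var, beta=1e-3):
    """Prediction loss plus a beta-weighted compression penalty.

    Small beta keeps the latent predictive; larger beta compresses harder.
    """
    return cross_entropy(probs, label) + beta * kl_gaussian_std_normal(mu, log_var)
```

A latent matching the prior exactly (zero mean, unit variance) incurs zero compression cost, so the loss reduces to the cross-entropy alone; the paper's sufficiency loss plays the complementary role of keeping that prediction term from degrading as compression increases.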
Pages: 7681-7691 (11 pages)
References: 50 in total