Incomplete Multimodal Learning for Visual Acuity Prediction After Cataract Surgery Using Masked Self-Attention

Cited by: 3
Authors
Zhou, Qian [1 ]
Zou, Hua [1 ]
Jiang, Haifeng [2 ]
Wang, Yong [2 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Univ, Aier Eye Hosp, Wuhan, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VII | 2023, Vol. 14226
Keywords
Incomplete Multimodal Learning; Visual Acuity; Prediction; Self-Attention;
DOI
10.1007/978-3-031-43990-2_69
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cataract surgery is the primary treatment for cataracts, and millions of such surgeries are estimated to be performed worldwide each year. Predicting the Best Corrected Visual Acuity (BCVA) of cataract patients before surgery is crucial to avoid medical disputes, yet accurate prediction remains a challenge in clinical practice. Traditional methods based on patient characteristics and surgical parameters have limited accuracy and often underestimate postoperative visual acuity. In this paper, we propose a novel framework for predicting visual acuity after cataract surgery using masked self-attention. Unlike existing methods, which rely on monomodal data, our method takes preoperative images and patient demographic data as input to leverage multimodal information. Furthermore, we extend our method to a more complex and challenging clinical scenario: incomplete multimodal data. First, we apply efficient Transformers to extract modality-specific features. Then, an attentional fusion network fuses the multimodal information. To address the modality-missing problem, an attention-mask mechanism is proposed to improve robustness. We evaluate our method on a collected dataset of 1960 patients who underwent cataract surgery and compare its performance with other state-of-the-art approaches. The results show that our method outperforms the others, achieving a mean absolute error of 0.122 logMAR, with 94.3% of prediction errors falling within ±0.10 logMAR. In addition, extensive experiments are conducted to investigate the contribution of each component to visual acuity prediction. Code will be available at https://github.com/liyiersan/MSA.
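The attention-mask idea summarized in the abstract can be sketched as follows. This is a minimal, single-head NumPy illustration of the general mechanism (masking out missing modalities so that attention weights are redistributed over the modalities that are present), not the authors' implementation; the function and variable names are hypothetical.

```python
import numpy as np

def masked_self_attention(tokens, present):
    """Single-head self-attention over per-modality feature tokens.

    tokens:  (n_modalities, d) array, one feature vector per modality.
    present: (n_modalities,) boolean array; False marks a missing modality.

    Missing modalities get -inf attention scores, so the softmax assigns
    them zero weight and fusion uses only the available modalities.
    """
    n, d = tokens.shape
    scores = tokens @ tokens.T / np.sqrt(d)       # (n, n) scaled dot-product
    scores[:, ~present] = -np.inf                 # mask keys of missing modalities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over present keys
    fused = weights @ tokens                      # attention-weighted fusion
    return fused, weights

# Example: imaging features present, demographic features missing.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 8))
present = np.array([True, False])
fused, weights = masked_self_attention(tokens, present)
```

In this toy setting the missing modality receives zero attention weight everywhere, so the fused representation degrades gracefully to the available modality instead of mixing in uninformative features.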
Pages: 735 - 744 (10 pages)
Related papers (50 total)
  • [41] Artificial intelligence based classification and prediction of medical imaging using a novel framework of inverted and self-attention deep neural network architecture
    Aftab, Junaid
    Khan, Muhammad Attique
    Arshad, Sobia
    Rehman, Shams ur
    Alhammadi, Dina Abdulaziz
    Nam, Yunyoung
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [42] Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
    Gao, Peng
    Zhang, Xin-Yue
    Yang, Xiao-Li
    Ni, Jian-Cheng
    Wang, Fei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 161 - 164
  • [43] Missing well-log reconstruction using a sequence self-attention deep-learning framework
    Lin, Lei
    Wei, Hao
    Wu, Tiantian
    Zhang, Pengyun
    Zhong, Zhi
    Li, Chenglong
    GEOPHYSICS, 2023, 88 (06) : D391 - D410
  • [44] Cost-Aware Dynamic Cloud Workflow Scheduling Using Self-attention and Evolutionary Reinforcement Learning
    Shen, Ya
    Chen, Gang
    Ma, Hui
    Zhang, Mengjie
    SERVICE-ORIENTED COMPUTING, ICSOC 2024, PT II, 2025, 15405 : 3 - 18
  • [45] Class-GE2E: Speaker Verification Using Self-Attention and Transfer Learning with Loss Combination
    Bae, Ara
    Kim, Wooil
    ELECTRONICS, 2022, 11 (06)
  • [46] Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
    Alsaad, Rawan
    Malluhi, Qutaibah
    Abd-alrazaq, Alaa
    Boughorbel, Sabri
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 149
  • [47] SRR-DDI: A drug-drug interaction prediction model with substructure refined representation learning based on self-attention mechanism
    Niu, Dongjiang
    Xu, Lei
    Pan, Shourun
    Xia, Leiming
    Li, Zhen
    KNOWLEDGE-BASED SYSTEMS, 2024, 285
  • [48] SARB-DF: A Continual Learning Aided Framework for Deepfake Video Detection Using Self-Attention Residual Block
    Prathibha, P. G.
    Tamizharasan, P. S.
    Panthakkan, Alavikunhu
    Mansoor, Wathiq
    Al Ahmad, Hussain
    IEEE ACCESS, 2024, 12 : 189088 - 189101
  • [49] NFSA-DTI: A Novel Drug-Target Interaction Prediction Model Using Neural Fingerprint and Self-Attention Mechanism
    Liu, Feiyang
    Xu, Huang
    Cui, Peng
    Li, Shuo
    Wang, Hongbo
    Wu, Ziye
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (21)
  • [50] Multimodal Fusion of EEG and EMG Signals Using Self-Attention Multi-Temporal Convolutional Neural Networks for Enhanced Hand Gesture Recognition in Rehabilitation
    Zafar, Muhammad Hamza
    Langas, Even Falkenberg
    Nyberg, Svein Olav Glesaaen
    Sanfilippo, Filippo
    2024 IEEE INTERNATIONAL CONFERENCE ON OMNI-LAYER INTELLIGENT SYSTEMS, COINS 2024, 2024, : 245 - 249