Incomplete Multimodal Learning for Visual Acuity Prediction After Cataract Surgery Using Masked Self-Attention

Cited by: 3
Authors
Zhou, Qian [1 ]
Zou, Hua [1 ]
Jiang, Haifeng [2 ]
Wang, Yong [2 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Univ, Aier Eye Hosp, Wuhan, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VII | 2023, Vol. 14226
Keywords
Incomplete Multimodal Learning; Visual Acuity; Prediction; Self-Attention;
DOI
10.1007/978-3-031-43990-2_69
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cataract surgery is the primary treatment for cataracts, and millions of such surgeries are estimated to be performed worldwide each year. Predicting the Best Corrected Visual Acuity (BCVA) of cataract patients before surgery is crucial to avoid medical disputes, yet accurate prediction remains a challenge in clinical practice. Traditional methods based on patient characteristics and surgical parameters have limited accuracy and often underestimate postoperative visual acuity. In this paper, we propose a novel framework for predicting visual acuity after cataract surgery using masked self-attention. Unlike existing methods, which rely on monomodal data, our method takes preoperative images and patient demographic data as input to leverage multimodal information. Furthermore, we extend our method to a more complex and challenging clinical scenario: incomplete multimodal data. First, we apply efficient Transformers to extract modality-specific features. Then, an attentional fusion network fuses the multimodal information. To address the modality-missing problem, an attention-mask mechanism is proposed to improve robustness. We evaluate our method on a collected dataset of 1960 patients who underwent cataract surgery and compare its performance with other state-of-the-art approaches. The results show that our method outperforms the others, achieving a mean absolute error of 0.122 logMAR, with 94.3% of prediction errors falling within ±0.10 logMAR. In addition, extensive experiments are conducted to investigate the contribution of each component to visual acuity prediction. Code will be available at https://github.com/liyiersan/MSA.
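The attention-mask idea summarized in the abstract can be sketched as follows. This is a minimal, single-head NumPy illustration of the general mechanism (masking out missing modalities so that attention weights are redistributed over the modalities that are present), not the authors' implementation; the function and variable names are hypothetical.

```python
import numpy as np

def masked_self_attention(tokens, present):
    """Single-head self-attention over per-modality feature tokens.

    tokens:  (n_modalities, d) array, one feature vector per modality.
    present: (n_modalities,) boolean array; False marks a missing modality.

    Missing modalities get -inf attention scores, so the softmax assigns
    them zero weight and fusion uses only the available modalities.
    """
    n, d = tokens.shape
    scores = tokens @ tokens.T / np.sqrt(d)       # (n, n) scaled dot-product
    scores[:, ~present] = -np.inf                 # mask keys of missing modalities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over present keys
    fused = weights @ tokens                      # attention-weighted fusion
    return fused, weights

# Example: imaging features present, demographic features missing.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 8))
present = np.array([True, False])
fused, weights = masked_self_attention(tokens, present)
```

In this toy setting the missing modality receives zero attention weight everywhere, so the fused representation degrades gracefully to the available modality instead of mixing in uninformative features.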
Pages: 735 - 744 (10 pages)
Related papers (50 total)
  • [41] Artificial intelligence based classification and prediction of medical imaging using a novel framework of inverted and self-attention deep neural network architecture
    Aftab, Junaid
    Khan, Muhammad Attique
    Arshad, Sobia
    Rehman, Shams ur
    Alhammadi, Dina Abdulaziz
    Nam, Yunyoung
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [42] Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
    Gao, Peng
    Zhang, Xin-Yue
    Yang, Xiao-Li
    Ni, Jian-Cheng
    Wang, Fei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 161 - 164
  • [43] Missing well-log reconstruction using a sequence self-attention deep-learning framework
    Lin, Lei
    Wei, Hao
    Wu, Tiantian
    Zhang, Pengyun
    Zhong, Zhi
    Li, Chenglong
    GEOPHYSICS, 2023, 88 (06) : D391 - D410
  • [44] Cost-Aware Dynamic Cloud Workflow Scheduling Using Self-attention and Evolutionary Reinforcement Learning
    Shen, Ya
    Chen, Gang
    Ma, Hui
    Zhang, Mengjie
    SERVICE-ORIENTED COMPUTING, ICSOC 2024, PT II, 2025, 15405 : 3 - 18
  • [45] Class-GE2E: Speaker Verification Using Self-Attention and Transfer Learning with Loss Combination
    Bae, Ara
    Kim, Wooil
    ELECTRONICS, 2022, 11 (06)
  • [46] Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
    Alsaad, Rawan
    Malluhi, Qutaibah
    Abd-alrazaq, Alaa
    Boughorbel, Sabri
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 149
  • [47] SRR-DDI: A drug-drug interaction prediction model with substructure refined representation learning based on self-attention mechanism
    Niu, Dongjiang
    Xu, Lei
    Pan, Shourun
    Xia, Leiming
    Li, Zhen
    KNOWLEDGE-BASED SYSTEMS, 2024, 285
  • [48] SARB-DF: A Continual Learning Aided Framework for Deepfake Video Detection Using Self-Attention Residual Block
    Prathibha, P. G.
    Tamizharasan, P. S.
    Panthakkan, Alavikunhu
    Mansoor, Wathiq
    Al Ahmad, Hussain
    IEEE ACCESS, 2024, 12 : 189088 - 189101
  • [49] NFSA-DTI: A Novel Drug-Target Interaction Prediction Model Using Neural Fingerprint and Self-Attention Mechanism
    Liu, Feiyang
    Xu, Huang
    Cui, Peng
    Li, Shuo
    Wang, Hongbo
    Wu, Ziye
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (21)
  • [50] Multimodal Fusion of EEG and EMG Signals Using Self-Attention Multi-Temporal Convolutional Neural Networks for Enhanced Hand Gesture Recognition in Rehabilitation
    Zafar, Muhammad Hamza
    Langas, Even Falkenberg
    Nyberg, Svein Olav Glesaaen
    Sanfilippo, Filippo
    2024 IEEE INTERNATIONAL CONFERENCE ON OMNI-LAYER INTELLIGENT SYSTEMS, COINS 2024, 2024, : 245 - 249