Disentangled global and local features of multi-source data variational autoencoder: An interpretable model for diagnosing IgAN via multi-source Raman spectral fusion techniques

Times cited: 0
Authors
Shuai, Wei [1]
Tian, Xuecong [2]
Zuo, Enguang [2]
Zhang, Xueqin [3]
Lu, Chen [4]
Gu, Jin [5, 6]
Chen, Chen [2]
Lv, Xiaoyi [1]
Chen, Cheng [1]
Affiliations
[1] Xinjiang Univ, Coll Software, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[3] Peoples Hosp Xinjiang Uygur Autonomous Reg, Dept Nephrol, Urumqi 830001, Xinjiang, Peoples R China
[4] Xinjiang Med Univ, Affiliated Hosp 1, Dept Nephrol, Urumqi 830011, Xinjiang, Peoples R China
[5] Tsinghua Univ, Inst Precis Med, BNRIST Bioinformat Div, MOE,Key Lab Bioinformat, Beijing 100084, Peoples R China
[6] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Keywords
IgAN; Raman spectroscopy; Multi-source data fusion; Encoder decoupling; SHAP; SPECTROSCOPY; CLASSIFICATION; URINE
DOI
10.1016/j.artmed.2024.103053
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A single Raman spectrum captures only limited molecular information, and effectively fusing Raman spectra from the serum and urine source domains yields richer features. However, most current Raman-based studies of immunoglobulin A nephropathy (IgAN) rely on small samples with a low signal-to-noise ratio, so directly adopting a multi-source data fusion strategy may even reduce diagnostic accuracy. To address this, this paper proposes a data enhancement and spectral optimization method based on variational autoencoders that produces reconstructed Raman spectra with a doubled sample size and an improved signal-to-noise ratio. For diagnosing IgAN from multi-source Raman spectra, the paper builds a variational autoencoder with decoupled global and local features of multi-source data (DMSGL-VAE). First, statistical features are extracted from the segmented spectra, and the latent variables produced by the variational encoder are separated by a decoupling module. The resulting global and local representations capture, respectively, the information shared across the serum and urine source domains and the information unique to each. Then, a cross-source reconstruction loss and a decoupling loss constrain the decoupling, whose effectiveness is demonstrated both quantitatively and qualitatively. Finally, the features of the different source domains are integrated to diagnose IgAN, and the SHapley Additive exPlanations (SHAP) algorithm is used to identify the features that matter most to the results. In the experiments, the DMSGL-VAE model reached an AUC of 0.9958 for diagnosing IgAN on the test set. The SHAP analysis further indicated that proteins, hydroxybutyrate, and guanine are likely to be common biological fingerprint substances for diagnosing IgAN by serum and urine Raman spectroscopy.
In summary, the DMSGL-VAE model, built on Raman spectroscopy, can screen for IgAN rapidly, non-invasively, and accurately in terms of classification performance. The interpretable analysis may also help doctors understand IgAN better and make diagnostic decisions more efficiently in the future.
Pages: 18
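The abstract's feature pipeline (statistics over spectral segments, then a latent code split into a shared global part and a source-specific local part) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's implementation: the abstract does not say which statistics are used, how many segments there are, how the latent vector is partitioned, or how the two sources are fused. The function names `segment_statistics`, `split_latent`, and `fuse_sources` are hypothetical, and averaging the two global parts is only one plausible fusion rule.

```python
from statistics import fmean, pstdev


def segment_statistics(spectrum, n_segments):
    """Split a 1-D spectrum into contiguous segments and summarize each one.

    Assumption: mean, population std, min, and max are used as the per-segment
    statistics; the abstract does not name the statistics.
    """
    if n_segments <= 0 or n_segments > len(spectrum):
        raise ValueError("n_segments must be between 1 and len(spectrum)")
    size, extra = divmod(len(spectrum), n_segments)
    features, start = [], 0
    for i in range(n_segments):
        # The first `extra` segments take one additional point so no point is dropped.
        end = start + size + (1 if i < extra else 0)
        segment = spectrum[start:end]
        features.extend([fmean(segment), pstdev(segment), min(segment), max(segment)])
        start = end
    return features


def split_latent(z, n_global):
    """Partition a latent vector by position into (global, local) parts.

    Assumption: a fixed positional split stands in for the learned decoupling module.
    """
    if not 0 <= n_global <= len(z):
        raise ValueError("n_global must be between 0 and len(z)")
    return z[:n_global], z[n_global:]


def fuse_sources(z_serum, z_urine, n_global):
    """Combine two latent vectors: average the shared parts, keep both local parts.

    Assumption: averaging is a hypothetical fusion rule chosen for simplicity.
    """
    g_serum, l_serum = split_latent(z_serum, n_global)
    g_urine, l_urine = split_latent(z_urine, n_global)
    shared = [(a + b) / 2 for a, b in zip(g_serum, g_urine)]
    return shared + l_serum + l_urine


if __name__ == "__main__":
    toy_spectrum = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(segment_statistics(toy_spectrum, 2))
    print(fuse_sources([1.0, 2.0, 3.0], [3.0, 4.0, 5.0], 2))
```

In a trained model the split would be enforced by the cross-source reconstruction and decoupling losses rather than by position alone; the sketch only shows the shape of the data flow.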