Disentangled global and local features of multi-source data variational autoencoder: An interpretable model for diagnosing IgAN via multi-source Raman spectral fusion techniques

Authors
Shuai, Wei [1]
Tian, Xuecong [2]
Zuo, Enguang [2]
Zhang, Xueqin [3]
Lu, Chen [4]
Gu, Jin [5, 6]
Chen, Chen [2]
Lv, Xiaoyi [1]
Chen, Cheng [1]
Affiliations
[1] Xinjiang Univ, Coll Software, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[3] Peoples Hosp Xinjiang Uygur Autonomous Reg, Dept Nephrol, Urumqi 830001, Xinjiang, Peoples R China
[4] Xinjiang Med Univ, Affiliated Hosp 1, Dept Nephrol, Urumqi 830011, Xinjiang, Peoples R China
[5] Tsinghua Univ, Inst Precis Med, BNRIST Bioinformat Div, MOE,Key Lab Bioinformat, Beijing 100084, Peoples R China
[6] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Keywords
IgAN; Raman spectroscopy; Multi-source data fusion; Encoder decoupling; SHAP; SPECTROSCOPY; CLASSIFICATION; URINE
DOI
10.1016/j.artmed.2024.103053
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A single Raman spectrum captures only limited molecular information, and effectively fusing the Raman spectra of the serum and urine source domains yields richer features. However, most current Raman-based studies of immunoglobulin A nephropathy (IgAN) rely on small sample sizes and spectra with a low signal-to-noise ratio, so directly applying a multi-source data fusion strategy may even reduce diagnostic accuracy. To address this, this paper proposes a data augmentation and spectral optimization method based on variational autoencoders that produces reconstructed Raman spectra with a doubled sample size and an improved signal-to-noise ratio. For diagnosing IgAN from multi-source Raman spectra, the paper builds a variational autoencoder with disentangled global and local features of multi-source data (DMSGL-VAE). First, statistical features are extracted from the segmented spectra, and the latent variables produced by the variational encoder are separated by a decoupling module. The resulting global and local representations capture, respectively, the information shared by the serum and urine source domains and the information unique to each. Then, a cross-source reconstruction loss and a decoupling loss constrain the decoupling, whose effectiveness is demonstrated both quantitatively and qualitatively. Finally, the features of the different source domains are integrated to diagnose IgAN, and the SHapley Additive exPlanations (SHAP) algorithm is used to identify the features that matter most to the results. In the experiments, the DMSGL-VAE model reached an AUC of 0.9958 for diagnosing IgAN on the test set. The SHAP analysis further suggests that proteins, hydroxybutyrate, and guanine are likely common biological fingerprint substances for diagnosing IgAN by serum and urine Raman spectroscopy.
In summary, the Raman-spectroscopy-based DMSGL-VAE model achieves rapid, non-invasive, and accurate screening of IgAN in terms of classification performance, and its interpretable analysis may help doctors better understand IgAN and diagnose it more efficiently in the future.
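The abstract names two steps that a small sketch can make concrete: extracting statistical features from a segmented spectrum, and splitting a latent vector into a global (shared) and a local (source-specific) part. The Python below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say which statistics are used, how many segments there are, or how the decoupling module and cross-source reconstruction loss are defined. The function names, the choice of mean/std/min/max, the fixed split index `n_global`, and the idea of swapping global parts between sources before decoding are all hypothetical.

```python
import numpy as np


def segment_stats(spectrum, n_segments):
    """Split a 1-D spectrum into contiguous segments and return, per segment,
    the mean, standard deviation, minimum, and maximum (flattened).
    The choice of these four statistics is an assumption."""
    segments = np.array_split(np.asarray(spectrum, dtype=float), n_segments)
    return np.array([[s.mean(), s.std(), s.min(), s.max()] for s in segments]).ravel()


def split_latent(z, n_global):
    """Split a latent vector into a (global, local) pair at a fixed index.
    A fixed split is a stand-in for the paper's decoupling module."""
    z = np.asarray(z, dtype=float)
    return z[:n_global], z[n_global:]


def cross_source_inputs(z_serum, z_urine, n_global):
    """Build hypothetical decoder inputs for cross-source reconstruction:
    each source keeps its own local part but takes the other's global part."""
    g_serum, l_serum = split_latent(z_serum, n_global)
    g_urine, l_urine = split_latent(z_urine, n_global)
    serum_input = np.concatenate([g_urine, l_serum])
    urine_input = np.concatenate([g_serum, l_urine])
    return serum_input, urine_input
```

If the global parts truly carry only shared information, a decoder fed the swapped inputs should still reconstruct each source's spectrum well. That is the intuition such a loss would exploit, though the paper's exact formulation may differ.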
Pages: 18