SpectraTr: A novel deep learning model for qualitative analysis of drug spectroscopy based on transformer structure

Cited by: 22
Authors
Fu, Pengyou [1, 2]
Wen, Yue [2]
Zhang, Yuke [3]
Li, Lingqiao [1]
Feng, Yanchun [4]
Yin, Lihui [4]
Yang, Huihua [1, 2]
Affiliations
[1] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, 1 Jinji Rd, Guilin 541004, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, 10 Xitucheng Rd, Beijing 100876, Peoples R China
[3] Beijing Univ Posts & Telecommun, Sch Int, 10 Xitucheng Rd, Beijing 100876, Peoples R China
[4] Natl Inst Food & Drug Control, 10 Tiantanxili Rd, Beijing 100050, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Near-infrared spectroscopy analysis; drug supervision; transformer structure; deep learning; chemometrics; NETWORK; AUTOENCODER;
DOI
10.1142/S1793545822500213
Chinese Library Classification
O43 [Optics]
Subject classification codes
070207; 0803
Abstract
Drug supervision methods based on near-infrared spectroscopy analysis depend heavily on the chemometrics model that characterizes the relationship between spectral data and drug categories. Preliminary applications of convolutional neural networks (CNNs) to spectral analysis demonstrate excellent end-to-end prediction ability, but CNNs are sensitive to the hyperparameters of the network. The transformer is a deep learning model based on the self-attention mechanism that is comparable to CNNs in predictive performance and has an easy-to-design model structure. Hence, a novel calibration model named SpectraTr, based on the transformer structure, is proposed and used for the qualitative analysis of drug spectra. Experimental results on a seven-class drug data set and an 18-class drug data set show that the proposed SpectraTr model can automatically extract features from a large number of spectra, does not depend on pre-processing algorithms, and is insensitive to model hyperparameters. When the ratio of the training set to the test set is 8:2, the prediction accuracy of the SpectraTr model reaches 100% and 99.52%, respectively, which outperforms PLS_DA, SVM, SAE, and CNN. The model is also tested on a public drug data set, where it achieves a classification accuracy of 96.97% without any pre-processing algorithm, which is 34.85%, 28.28%, 5.05%, and 2.73% higher than PLS_DA, SVM, SAE, and CNN, respectively. The research shows that the SpectraTr model performs exceptionally well in spectral analysis and is expected to become a novel deep calibration model, following autoencoder networks (AEs) and CNNs.
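As a rough illustration of the self-attention mechanism the abstract refers to, the following minimal sketch applies scaled dot-product self-attention to a toy spectrum. It is not the SpectraTr architecture: the patch length, the identity query/key/value projections, the helper names `split_into_patches` and `self_attention`, and the absorbance values are all assumptions made for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def split_into_patches(spectrum, patch_len):
    """Cut a 1-D spectrum into non-overlapping patches (tokens).

    Trailing points that do not fill a whole patch are dropped
    (an assumption of this sketch, not something the abstract states).
    """
    n = len(spectrum) // patch_len
    return [spectrum[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Queries, keys, and values are the tokens themselves (identity
    projections); a trained model would use learned projection matrices.
    Returns the attended tokens and the attention weight matrix.
    """
    d = len(tokens[0])
    attended, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output token is a weighted average of all value tokens.
        attended.append([sum(wj * v[i] for wj, v in zip(w, tokens)) for i in range(d)])
    return attended, weights


# Toy absorbance values, invented for illustration (not real spectral data).
spectrum = [0.1, 0.2, 0.4, 0.8, 0.7, 0.3, 0.2, 0.1, 0.05]
tokens = split_into_patches(spectrum, patch_len=2)
attended, weights = self_attention(tokens)
```

Each row of `weights` sums to 1, so every output token is a convex combination of the input patches. This lets any region of the spectrum draw on any other region in a single step.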
Pages: 11