Principal Component Analysis Coupled with Artificial Neural Networks: A Combined Technique Classifying Small Molecular Structures Using a Concatenated Spectral Database

Cited by: 44
Authors
Gosav, Steluta [1, 2]
Praisler, Mirela [1]
Birsa, Mihail Lucian [2]
Affiliations
[1] Dunarea de Jos Univ Galati, Dept Chem Phys & Environm, Galati 800008, Romania
[2] Alexandru Ioan Cuza Univ, Dept Chem, Iasi 700506, Romania
Keywords
GC-FTIR; GC-MS; amphetamines; PCA; ANN; PATTERN-RECOGNITION; MASS-SPECTROMETRY; IDENTIFICATION
DOI
10.3390/ijms12106668
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
In this paper we present several expert systems that predict the class identity of the modeled compounds from a preprocessed spectral database. The expert systems were built using Artificial Neural Networks (ANN) and are designed to predict whether an unknown compound has the toxicological activity of amphetamines (stimulant and hallucinogen) or is a nonamphetamine. In attempts to circumvent the laws controlling drugs of abuse, new chemical structures are very frequently introduced on the black market. They are obtained by slightly modifying controlled molecular structures, adding or changing substituents at various positions on the banned molecules. As a result, no substance similar to those forming a prohibited class may be used nowadays, even if it has not been specifically listed. Therefore, reliable, fast, and accessible systems capable of modeling and then identifying similarities at the molecular level are highly needed for epidemiological, clinical, and forensic purposes. To obtain the expert systems, we preprocessed a concatenated spectral database containing the GC-FTIR (gas chromatography-Fourier transform infrared spectrometry) and GC-MS (gas chromatography-mass spectrometry) spectra of 103 forensic compounds. The database was used as input for a Principal Component Analysis (PCA). The scores of the forensic compounds on the main principal components (PCs) were then used as inputs for the ANN systems. We built eight PC-ANN systems (principal component analysis coupled with artificial neural network), each with a different number of input variables: 15, 16, 17, 18, 19, 20, 21, and 22 PCs. The best expert system was found to be the ANN built with 18 PCs, which account for an explained variance of 77%. This expert system has the best sensitivity (a rate of classification C = 100% and a rate of true positives TP = 100%), as well as a good selectivity (a rate of true negatives TN = 92.77%). A comparative analysis of the validation results of all expert systems is presented, and the input variables with the highest discrimination power are discussed.
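The PC-ANN workflow described in the abstract (PCA on the spectral matrix, PC scores as network inputs, then true-positive and true-negative rates on validation samples) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the spectra are random synthetic data, and the number of spectral variables, the network size, the learning rate, the validation split, and the function names `pca_scores`, `train_mlp`, and `rates` are all assumptions made for the example. Only the 103 compounds and the 18 PCs come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the concatenated spectral database (assumption):
# 103 compounds x 300 spectral variables, label 1 = amphetamine-like.
n_samples, n_features, n_pcs = 103, 300, 18
labels = (np.arange(n_samples) % 5 < 2).astype(float)
class_shift = rng.normal(size=n_features)
spectra = rng.normal(size=(n_samples, n_features)) + 1.5 * np.outer(labels, class_shift)

def pca_scores(data, n_components):
    """Return PC scores and the fraction of variance the kept PCs explain."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return centered @ vt[:n_components].T, float(explained[:n_components].sum())

def train_mlp(x, y, hidden=8, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer network by batch gradient descent."""
    r = np.random.default_rng(seed)
    w1 = r.normal(scale=0.1, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = r.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
        grad_out = (p - y) / len(y)  # gradient of mean cross-entropy w.r.t. the logit
        grad_h = np.outer(grad_out, w2) * (1.0 - h ** 2)
        w2 -= lr * (h.T @ grad_out)
        b2 -= lr * grad_out.sum()
        w1 -= lr * (x.T @ grad_h)
        b1 -= lr * grad_h.sum(axis=0)
    return lambda z: 1.0 / (1.0 + np.exp(-(np.tanh(z @ w1 + b1) @ w2 + b2)))

def rates(y_true, y_pred):
    """Return the true-positive rate and the true-negative rate."""
    tp = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)
    tn = np.sum((y_pred == 0) & (y_true == 0)) / np.sum(y_true == 0)
    return float(tp), float(tn)

# PCA on the whole matrix, then scaled PC scores as network inputs.
scores, explained = pca_scores(spectra, n_pcs)
inputs = scores / scores.std(axis=0)

# Hold out every fourth compound for validation (arbitrary split).
val = np.arange(n_samples) % 4 == 0
predict = train_mlp(inputs[~val], labels[~val])
predicted = (predict(inputs[val]) >= 0.5).astype(float)
tp_rate, tn_rate = rates(labels[val], predicted)

print(f"explained variance of {n_pcs} PCs: {explained:.2%}")
print(f"TP rate: {tp_rate:.2%}, TN rate: {tn_rate:.2%}")
```

Changing `n_pcs` over the range 15 to 22 and comparing the resulting rates mirrors, in spirit, the comparison of the eight PC-ANN systems; the numbers produced by this synthetic example have no relation to the results reported in the abstract.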
Pages: 6668-6684
Page count: 17