Enhanced structural encoding algorithm for database retrievals of carbon-13 nuclear magnetic resonance chemical shifts

Cited by: 7
Authors
Schweitzer, RC [1]
Small, GW [1]
Affiliations
[1] OHIO UNIV, CTR INTELLIGENT CHEM INSTRUMENTAT, DEPT CHEM, ATHENS, OH 45701 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1996 / Vol. 36 / No. 02
DOI
10.1021/ci950142q
Chinese Library Classification
O6 [Chemistry]
Subject classification code
0703
Abstract
This paper discusses an enhanced version of an algorithm that encodes the chemical environment of carbon atoms in a manner that correlates with carbon-13 nuclear magnetic resonance (C-13 NMR) chemical shifts. The encoding algorithm uses a vector-based approach in which the first dimension of the vector represents the carbon atom itself, the second dimension represents the collective influence on its chemical shift of the atoms one bond away, and each successive dimension represents the influence of the atoms one bond further away. This encoding algorithm is a key component of a C-13 NMR spectrum simulation procedure in which each carbon in a large database of known structures and spectra is represented as a vector. Database search methods based on vector comparisons are used to find the closest matching chemical environments, and their associated chemical shifts, for each carbon in a structure input by a user. Enhancements to the original algorithm include an expansion of the number of atom classes treated, the addition of a scheme that treats aromatic systems as a special case, and the use of an expanded vector format to regain some of the information lost by collapsing the molecular structure to a vector representation. To test the algorithm, a database of structures and spectra was split into training and test sets of 16959 and 4240 structures, respectively. Experiments to optimize several parameters of the encoding algorithm were followed by a comparison of the retrieved (i.e., predicted) and actual chemical shifts for the structures in the test set. For the optimal parameter settings found, the median of the mean absolute deviations in chemical shift over the test-set structures was 1.30 ppm, obtained with an expanded vector representation based on 15 dimensions.
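The shell-by-shell encoding and vector-based retrieval described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-atom weights in ATOM_WEIGHT, the use of Euclidean distance, the function names, and the shift values in the toy library are all assumptions made for illustration, and the aromatic special case and expanded vector format are omitted.

```python
import math
from collections import deque
from statistics import mean, median

# Hypothetical per-element weights; the paper's actual atom classes and
# their numerical treatment are not given in the abstract.
ATOM_WEIGHT = {"C": 1.0, "O": 3.0, "N": 2.0}


def encode_carbon(atoms, bonds, center, n_dims=3):
    """Encode one carbon as a vector: dimension 0 describes the carbon itself,
    dimension k sums the weights of all atoms exactly k bonds away."""
    vec = [0.0] * n_dims
    dist = {center: 0}
    queue = deque([center])
    while queue:  # breadth-first walk outward from the carbon
        a = queue.popleft()
        d = dist[a]
        vec[d] += ATOM_WEIGHT[atoms[a]]
        if d + 1 < n_dims:  # stop expanding past the last encoded shell
            for b in bonds[a]:
                if b not in dist:
                    dist[b] = d + 1
                    queue.append(b)
    return vec


def nearest_shift(query_vec, library):
    """Return the shift stored with the library vector closest to query_vec
    (Euclidean distance is an assumed comparison measure)."""
    best_vec, best_shift = min(library, key=lambda e: math.dist(query_vec, e[0]))
    return best_shift


def median_of_mad(predicted, actual):
    """Median over structures of the per-structure mean absolute deviation."""
    per_structure = [
        mean(abs(p - a) for p, a in zip(pred, act))
        for pred, act in zip(predicted, actual)
    ]
    return median(per_structure)


# Toy library built from ethanol's heavy-atom skeleton (C-C-O).
# The shift values are rough placeholders, not data from the paper.
atoms = ["C", "C", "O"]
bonds = {0: [1], 1: [0, 2], 2: [1]}
library = [
    (encode_carbon(atoms, bonds, 0), 18.0),  # CH3 carbon
    (encode_carbon(atoms, bonds, 1), 58.0),  # CH2-O carbon
]

# Query: the carbon of a C-O fragment, encoded the same way.
query = encode_carbon(["C", "O"], {0: [1], 1: [0]}, 0)
print(nearest_shift(query, library))  # 58.0
```

In this toy case the query carbon has an oxygen one bond away, so its vector lies closest to the library entry for the oxygen-bearing carbon and that entry's shift is retrieved. The median_of_mad helper mirrors the summary statistic quoted in the abstract, computed per structure and then taken as a median over structures.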
Pages: 310-322
Page count: 13