1D 13C-NMR Data as Molecular Descriptors in Spectra - Structure Relationship Analysis of Oligosaccharides

Cited by: 3
Author
Pereira, Florbela [1, 2]
Affiliations
[1] Univ Nova Lisboa, CQFB, P-2829516 Caparica, Portugal
[2] Univ Nova Lisboa, REQUIMTE, Dept Quim, Fac Ciencias & Tecnol, P-2829516 Caparica, Portugal
Source
MOLECULES | 2012 / Vol. 17 / Issue 04
Keywords
machine learning techniques; Random Forest; classification tree; CPGNN; C-13-NMR; oligosaccharides; disaccharides; trisaccharides; NMR-SPECTROSCOPY; NEURAL-NETWORK; RANDOM FOREST; CASPER; H-1; POLYSACCHARIDES; IDENTIFICATION; ASSIGNMENTS; SKELETONS; H-1-NMR;
DOI
10.3390/molecules17043818
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Spectra-structure relationships were investigated for estimating the anomeric configuration, residues and type of linkages of linear and branched trisaccharides using C-13-NMR chemical shifts. The study used 119 pyranosyl trisaccharides that are trimers of the alpha or beta anomers of D-glucose, D-galactose, D-mannose, L-fucose or L-rhamnose residues bonded through alpha or beta glycosidic linkages of types 1 -> 2, 1 -> 3, 1 -> 4, or 1 -> 6, as well as methoxylated and/or N-acetylated amino trisaccharides. Machine learning experiments were performed for: (1) classification of the anomeric configuration of the first unit, second unit and reducing end; (2) classification of the type of first and second linkages; (3) classification of the three residues: reducing end, middle and first residue; and (4) classification of the chain type. Our previous model for predicting the structure of disaccharides was incorporated into this new model, which improved the predictive power. The best results were achieved using Random Forests with 204 di- and trisaccharides for the training set: the model correctly classified 83%, 90%, 88%, 85%, 85%, 75%, 79%, 68% and 94% of the test set (69 compounds) for the nine tasks, respectively, on the basis of unassigned chemical shifts.
Pages: 3818 - 3833
Number of pages: 16