Genetic algorithms and self-organizing maps: a powerful combination for modeling complex QSAR and QSPR problems

Cited by: 16
Authors
Bayram, E
Santago, P
Harris, R
Xiao, YD
Clauset, AJ
Schmitt, JD
Affiliations
[1] Targacept Inc, Mol Design Grp, Winston Salem, NC 27101 USA
[2] Wake Forest Univ, Dept Biomed Engn, Winston Salem, NC 27157 USA
[3] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
Funding
US National Science Foundation
Keywords
genetic algorithm; neural networks; QSAR; QSPR; supervised self-organizing maps; variable selection
DOI
10.1007/s10822-004-5321-2
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Modeling non-linear descriptor-target activity/property relationships with many dependent descriptors has been a long-standing challenge in the design of biologically active molecules. In an effort to address this problem, we couple the supervised self-organizing map with the genetic algorithm. Although self-organizing maps are non-linear and topology-preserving techniques that hold great potential for modeling and decoding relationships, the large number of descriptors in typical quantitative structure-activity relationship or quantitative structure-property relationship analysis may lead to spurious correlation(s) and/or difficulty in the interpretation of resulting models. To reduce the number of descriptors to a manageable size, we chose the genetic algorithm for descriptor selection because of its flexibility and efficiency in solving complex problems. Feasibility studies were conducted using six different datasets of moderate-to-large size and moderate-to-great diversity, each with a different biological endpoint. Since favorable training set statistics do not necessarily indicate a highly predictive model, the quality of all models was confirmed by withholding a portion of each dataset for external validation. We also address the variability introduced into modeling through dataset partitioning and through the stochastic nature of the combined genetic algorithm/supervised self-organizing map method, using the z-score and other tests. Experiments show that the combined method provides accuracy comparable to that of the supervised self-organizing map alone while using significantly fewer descriptors in the models generated. We observed consistently better results than partial least squares models. We conclude that the combination of genetic algorithms with the supervised self-organizing map shows great potential as a quantitative structure-activity/property relationship modeling tool.
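The descriptor-selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic details, so the 1-nearest-neighbour predictor below is an assumed stand-in for the supervised self-organizing map, and the synthetic data, the function names, the per-descriptor penalty, and all genetic algorithm settings are hypothetical. The sketch only shows the general pattern: a genetic algorithm searches over descriptor subsets using an internal validation split, and a withheld portion of the data is used once at the end for external validation.

```python
import random

def make_data(n_samples=60, n_descriptors=12, seed=0):
    """Synthetic data (hypothetical): only descriptors 0 and 1 drive the target."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_descriptors)] for _ in range(n_samples)]
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.05) for row in X]
    return X, y

def knn_error(mask, X_train, y_train, X_eval, y_eval):
    """Mean squared error of a 1-nearest-neighbour predictor.

    Stand-in for the supervised self-organizing map; uses only the
    descriptors whose mask entry is True.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")  # a model with no descriptors is rejected
    total = 0.0
    for xe, ye in zip(X_eval, y_eval):
        nearest = min(
            range(len(X_train)),
            key=lambda i: sum((X_train[i][j] - xe[j]) ** 2 for j in cols),
        )
        total += (y_train[nearest] - ye) ** 2
    return total / len(y_eval)

def fitness(mask, data, penalty=0.01):
    """Lower is better: validation error plus an assumed cost per descriptor."""
    return knn_error(mask, *data) + penalty * sum(mask)

def select_descriptors(data, n_descriptors, pop_size=20, generations=30,
                       mutation_rate=0.1, seed=1):
    """Simple genetic algorithm over boolean descriptor masks."""
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_descriptors)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, data))
        parents = population[: pop_size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_descriptors)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return min(population, key=lambda m: fitness(m, data))

X, y = make_data()
train, val, test = slice(0, 30), slice(30, 45), slice(45, 60)
data = (X[train], y[train], X[val], y[val])  # the search sees only train and val

best = select_descriptors(data, n_descriptors=12)
selected = [j for j, keep in enumerate(best) if keep]

# External validation: the withheld test portion was never used during selection.
external_mse = knn_error(best, X[train], y[train], X[test], y[test])
print("selected descriptors:", selected)
print("external validation MSE:", external_mse)
```

Because the search is stochastic, rerunning with different `seed` values and different partitions gives a distribution of outcomes, which is the kind of variability the abstract says was examined with the z-score and other tests.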
Pages: 483-493
Page count: 11