Artificial Neural Network-Based Analysis of High-Throughput Screening Data for Improved Prediction of Active Compounds

Cited by: 6
Authors
Chakrabarti, Swapan [1]
Svojanovsky, Stan R. [2]
Slavik, Romana [3]
Georg, Gunda I. [4]
Wilson, George S. [5]
Smith, Peter G. [2, 6, 7]
Affiliations
[1] Univ Kansas, Dept Elect Engn & Comp Sci, Lawrence, KS 66045 USA
[2] Univ Kansas, Med Ctr, Kansas City, KS 66103 USA
[3] Adv Response Management Inc, Kansas City, KS USA
[4] Univ Minnesota, Coll Pharm, Dept Med Chem, Minneapolis, MN 55455 USA
[5] Univ Kansas, Assoc Vice Provost Res & Grad Studies, Lawrence, KS 66045 USA
[6] Univ Kansas, Med Ctr, Dept Mol & Integrat Physiol, Kansas City, KS 66103 USA
[7] Univ Kansas, Med Ctr, RL Smith Intellectual & Dev Disabil Res Ctr, Kansas City, KS 66103 USA
Keywords
pattern classification; neural networks; generalization property
DOI
10.1177/1087057109351312
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Artificial neural networks (ANNs) were trained on high-throughput screening (HTS) data to recover active compounds from a large data set, and classification performance improved when the predictions of multiple ANNs were combined. The HTS data, acquired from a methionine aminopeptidases inhibition study, consisted of a library of 43,347 compounds with a ratio of active to nonactive compounds, R_{A/N}, of 0.0321. Back-propagation ANNs were trained and validated using principal components derived from the physicochemical features of the compounds. With carefully selected training parameters, a single ANN recovered one-third of all active compounds from the validation set with a 3-fold gain in R_{A/N}. Further gains in R_{A/N} were obtained by combining the predictions of several ANNs. The generalization property of back-propagation ANNs was exploited by training those ANNs on the same training samples, each initialized with a different set of random weights. As a result, only 10% of all available compounds were needed for training and validation, and the rest of the data set was screened with more than a 10-fold gain over the original R_{A/N} value. Thus, ANNs trained with limited HTS data might become useful in recovering active compounds from large data sets. (Journal of Biomolecular Screening 2009:1236-1244)
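The abstract's two central quantities, the active-to-nonactive ratio and the gain obtained by combining several networks, can be illustrated with a small sketch. The abstract does not state how the individual ANN predictions were combined, so the vote-counting rule below is an assumption for illustration, not the authors' procedure. The function names (`active_ratio`, `consensus_select`, `gain`) and the toy labels and votes are hypothetical.

```python
from typing import List, Sequence


def active_ratio(labels: Sequence[int]) -> float:
    """Ratio of actives (label 1) to nonactives (label 0) in a set of compounds."""
    actives = sum(labels)
    nonactives = len(labels) - actives
    if nonactives == 0:
        raise ValueError("ratio is undefined when there are no nonactive compounds")
    return actives / nonactives


def consensus_select(votes: Sequence[Sequence[int]], min_votes: int) -> List[int]:
    """Indices of compounds predicted active by at least `min_votes` models.

    votes[m][i] is 1 if model m predicts compound i to be active, else 0.
    Assumption: a simple vote count stands in for the combination rule.
    """
    n_compounds = len(votes[0])
    totals = [sum(model[i] for model in votes) for i in range(n_compounds)]
    return [i for i, total in enumerate(totals) if total >= min_votes]


def gain(labels: Sequence[int], selected: Sequence[int]) -> float:
    """Ratio in the selected subset divided by the ratio in the full set."""
    subset = [labels[i] for i in selected]
    return active_ratio(subset) / active_ratio(labels)


# Toy data (hypothetical): 10 compounds, 2 of them active.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# Hypothetical binary predictions from three models trained on the same
# samples but started from different random weights.
votes = [
    [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 0],
]

single_model = [i for i, v in enumerate(votes[0]) if v == 1]
selected = consensus_select(votes, min_votes=2)

print("gain, first model alone:", gain(labels, single_model))
print("gain, 2-of-3 consensus: ", gain(labels, selected))
```

In this toy case the consensus subset is smaller and purer than any single model's picks, which is the kind of effect the abstract reports at much larger scale. The threshold `min_votes` trades recovered actives against purity and would need tuning on validation data.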
Pages: 1236-1244
Page count: 9