Inductive Transfer of Knowledge: Application of Multi-Task Learning and Feature Net Approaches to Model Tissue-Air Partition Coefficients

Cited by: 66
Authors
Varnek, Alexandre [1]
Gaudin, Cedric [1]
Marcou, Gilles [1]
Baskin, Igor [2]
Pandey, Anil Kumar [3]
Tetko, Igor V. [3, 4]
Affiliations
[1] Univ Strasbourg, CNRS, UMR 7177, Lab Infochim, F-67000 Strasbourg, France
[2] Moscow MV Lomonosov State Univ, Dept Chem, Moscow 119991, Russia
[3] Inst Bioinformat & Syst Biol, Helmholtz Zentrum Munchen, German Res Ctr Environm Hlth GmbH, D-85764 Neuherberg, Germany
[4] Natl Ukrainian Acad Sci, Inst Bioorgan & Petrochem, UA-02660 Kiev, Ukraine
Keywords
STRUCTURE-PROPERTY; NEURAL NETWORKS; COMBINATORIAL LIBRARY; DATA SETS; PREDICTION; BIAS; LIPOPHILICITY; ALOGPS-2.1; GENERATION; ALGORITHM;
DOI
10.1021/ci8002914
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
Two inductive knowledge transfer approaches - multitask learning (MTL) and Feature Net (FN) - have been used to build predictive neural network (ASNN) and PLS models for 11 types of tissue-air partition coefficients (TAPC). Unlike conventional single-task learning (STL) modeling, which focuses on a single target property without any relation to other properties, the inductive transfer framework treats the individual models as nodes in a network of interrelated models built in parallel (MTL) or sequentially (FN). It has been demonstrated that MTL and FN techniques are extremely useful in structure-property modeling on small and structurally diverse data sets, when conventional STL modeling is unable to produce any predictive model. Predictive individual STL models were obtained for 4 out of 11 TAPC, whereas application of inductive knowledge transfer techniques resulted in models for 9 TAPC. Differences in the prediction performance of the models as a function of the machine-learning method and of the number of properties simultaneously involved in the learning are discussed.
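The contrast the abstract draws between STL, MTL (models built in parallel), and FN (models built sequentially) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the small NumPy network stands in for the ASNN and PLS models of the paper, and names such as `train_net` and `X_fn` are hypothetical. It shows only how the three setups are wired together, not the authors' implementation.

```python
import numpy as np

def train_net(X, Y, hidden=4, epochs=500, lr=0.05, seed=0):
    """Train a one-hidden-layer tanh network with full-batch gradient
    descent on mean squared error. Returns (predict_fn, per-epoch losses).
    Hypothetical helper: a stand-in for any regression learner."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden layer shared by all outputs
        err = H @ W2 + b2 - Y
        losses.append(float(np.mean(err ** 2)))
        g_pred = 2.0 * err / err.size       # gradient of the MSE w.r.t. predictions
        g_W2 = H.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_H = (g_pred @ W2.T) * (1.0 - H ** 2)
        g_W1 = X.T @ g_H
        g_b1 = g_H.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(X_new):
        return np.tanh(X_new @ W1 + b1) @ W2 + b2

    return predict, losses

# Synthetic stand-in data (NOT the TAPC data): 40 "compounds", 5 descriptors,
# 3 related properties that depend on the same two latent factors.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
latent = np.tanh(X @ rng.normal(size=(5, 2)))
Y = latent @ rng.normal(size=(2, 3)) + 0.05 * rng.normal(size=(40, 3))

# STL: the target property (column 0) is modeled alone.
stl_predict, stl_losses = train_net(X, Y[:, :1])

# MTL: all properties are learned in parallel through one shared hidden layer.
mtl_predict, mtl_losses = train_net(X, Y)

# FN: auxiliary properties are modeled first; their predictions are then
# appended to the descriptors used to model the target property.
aux_predict, aux_losses = train_net(X, Y[:, 1:])
X_fn = np.hstack([X, aux_predict(X)])
fn_predict, fn_losses = train_net(X_fn, Y[:, :1])

print("final training MSE  STL:", stl_losses[-1],
      " MTL:", mtl_losses[-1], " FN:", fn_losses[-1])
```

The training losses printed here are not comparable to the paper's results; a real comparison would need held-out compounds and the validation protocol described in the article.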
Pages: 133-144
Number of pages: 12