DeepTox: Toxicity Prediction using Deep Learning

Cited by: 608
Authors
Mayr, Andreas [1,2]
Klambauer, Gunter [1]
Unterthiner, Thomas [1,2]
Hochreiter, Sepp [1]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Bioinformat, Linz, Austria
[2] Johannes Kepler Univ Linz, RISC Software GmbH, Hagenberg, Austria
Keywords
Deep Learning; deep networks; Tox21; machine learning; tox prediction; toxicophores; challenge winner; neural networks
DOI
10.3389/fenvs.2015.00080
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification codes
08; 0830
Abstract
The Tox21 Data Challenge has been the largest effort of the scientific community to compare computational methods for toxicity prediction. This challenge comprised 12,000 environmental chemicals and drugs which were measured for 12 different toxic effects by specifically designed assays. We participated in this challenge to assess the performance of Deep Learning in computational toxicity prediction. Deep Learning has already revolutionized image processing, speech recognition, and language understanding but has not yet been applied to computational toxicity. Deep Learning is founded on novel algorithms and architectures for artificial neural networks together with the recent availability of very fast computers and massive datasets. It discovers multiple levels of distributed representations of the input, with higher levels representing more abstract concepts. We hypothesized that the construction of a hierarchy of chemical features gives Deep Learning the edge over other toxicity prediction methods. Furthermore, Deep Learning naturally enables multi-task learning, that is, learning of all toxic effects in one neural network and thereby learning of highly informative chemical features. In order to utilize Deep Learning for toxicity prediction, we have developed the DeepTox pipeline. First, DeepTox normalizes the chemical representations of the compounds. Then it computes a large number of chemical descriptors that are used as input to machine learning methods. In its next step, DeepTox trains models, evaluates them, and combines the best of them into ensembles. Finally, DeepTox predicts the toxicity of new compounds. In the Tox21 Data Challenge, DeepTox had the highest performance of all computational methods, winning the grand challenge, the nuclear receptor panel, the stress response panel, and six single assays (team "Bioinf@JKU"). We found that Deep Learning excelled in toxicity prediction and outperformed many other computational approaches like naive Bayes, support vector machines, and random forests.
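The multi-task and ensemble ideas described in the abstract can be illustrated with a minimal sketch. This is not the DeepTox implementation: the layer sizes, the ReLU hidden layer, the random untrained weights, and the names `init_model`, `predict`, and `ensemble_predict` are all assumptions chosen for illustration. The only detail taken from the abstract is that one network produces outputs for all 12 toxic effects and that several models are combined into an ensemble.

```python
import math
import random

# Hypothetical sizes; only N_TASKS = 12 comes from the abstract (12 toxic effects).
N_FEATURES, N_HIDDEN, N_TASKS = 16, 8, 12


def init_model(seed):
    """Create one small network with random (untrained) weights."""
    r = random.Random(seed)
    return {
        "W1": [[r.gauss(0, 0.1) for _ in range(N_HIDDEN)] for _ in range(N_FEATURES)],
        "W2": [[r.gauss(0, 0.1) for _ in range(N_TASKS)] for _ in range(N_HIDDEN)],
    }


def predict(model, x):
    """Multi-task forward pass: one shared hidden layer feeds all 12 outputs."""
    # Shared hidden features (ReLU), reused by every task.
    hidden = [
        max(0.0, sum(x[i] * model["W1"][i][j] for i in range(N_FEATURES)))
        for j in range(N_HIDDEN)
    ]
    # One logit per toxic effect, squashed to a probability with a sigmoid.
    logits = [
        sum(hidden[j] * model["W2"][j][k] for j in range(N_HIDDEN))
        for k in range(N_TASKS)
    ]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def ensemble_predict(models, x):
    """Combine models by averaging their per-task probabilities."""
    per_model = [predict(m, x) for m in models]
    return [sum(p[k] for p in per_model) / len(per_model) for k in range(N_TASKS)]


# A stand-in descriptor vector for one compound (random values, not real chemistry).
descriptor_rng = random.Random(0)
x = [descriptor_rng.gauss(0, 1) for _ in range(N_FEATURES)]

models = [init_model(seed) for seed in (1, 2, 3)]
probs = ensemble_predict(models, x)
print(len(probs))  # one probability per toxic effect
```

Averaging is only one way to combine models; the abstract does not state which combination rule DeepTox uses, so the averaging step here is an assumption as well.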
Pages: 15