Conformal prediction to define applicability domain - A case study on predicting ER and AR binding

Cited by: 24
Authors
Norinder, U. [1 ,2 ]
Rybacka, A. [3 ]
Andersson, P. L. [3 ]
Affiliations
[1] Swedish Toxicol Sci Res Ctr, Sodertalje, Sweden
[2] Stockholm Univ, Dept Comp & Syst Sci, Kista, Sweden
[3] Umea Univ, Dept Chem, Umea, Sweden
Funding
Swedish Research Council
Keywords
Conformal prediction; oestrogen receptor; androgen receptor; random forest; signature descriptors; ENDOCRINE-DISRUPTING CHEMICALS; ENVIRONMENTAL CHEMICALS; DIVERSE SET; QSAR MODELS; ESTROGEN; MACHINE; CLASSIFICATION; IDENTIFICATION; TRANSPARENT; SELECTION;
DOI
10.1080/1062936X.2016.1172665
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
A fundamental element when deriving a robust and predictive in silico model is not only the statistical quality of the model in question but, equally important, the estimate of its predictive boundaries. This work presents a new method, conformal prediction, for applicability domain estimation in the field of endocrine disruptors. The method is applied to binders and non-binders related to the oestrogen and androgen receptors. Ensembles of decision trees are used as the statistical method, and three different sets (Dragon, RDKit and signature fingerprints) are investigated as chemical descriptors. The conformal prediction method results in valid models where there is an excellent balance in quality between the internally validated training set and the corresponding external test set, both in terms of validity and with respect to sensitivity and specificity. With this method the level of confidence can be readily altered by the user and the consequences thereof immediately inspected. Furthermore, the predictive boundaries for the derived models are rigorously defined by using the conformal prediction framework, so no ambiguity exists as to the level of similarity needed for new compounds to be in or out of the predictive boundaries of the derived models where reliable predictions can be expected.
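The abstract does not spell out the procedure, so the following is only a minimal sketch of how a class-conditional (Mondrian) inductive conformal classifier is commonly set up, not the authors' implementation. It assumes that some underlying model (for instance an ensemble of decision trees) has already produced a probability for each class. The calibration values, the compound scores, and the function names `p_value` and `prediction_set` are all hypothetical and chosen for illustration.

```python
from bisect import bisect_right


def p_value(calibration_scores, test_score):
    """Conformal p-value: share of scores at or below the test score.

    The test compound itself is counted once in both numerator and
    denominator, which is the usual "+1" correction.
    """
    ordered = sorted(calibration_scores)
    at_or_below = bisect_right(ordered, test_score)
    return (at_or_below + 1) / (len(ordered) + 1)


def prediction_set(calibration, test_scores, significance):
    """Return every class whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores.items()
        if p_value(calibration[label], score) > significance
    }


# Hypothetical calibration data: for each class, the probability the
# underlying model assigned to the TRUE class of each calibration compound.
calibration = {
    "binder": [0.55, 0.62, 0.70, 0.78, 0.81, 0.85, 0.90, 0.93, 0.95],
    "non-binder": [0.60, 0.66, 0.72, 0.75, 0.80, 0.88, 0.91, 0.94, 0.97],
}

# Hypothetical new compound: model probability for each candidate class.
new_compound = {"binder": 0.58, "non-binder": 0.42}

# Changing the significance level changes the prediction set directly.
for significance in (0.05, 0.15, 0.25):
    labels = sorted(prediction_set(calibration, new_compound, significance))
    print(significance, labels)
```

In this sketch the user-chosen significance level (one minus the confidence level) is the only tuning knob. A low significance gives a set with both labels, a moderate one gives a single label, and a high one gives an empty set, which is one way to read a compound as falling outside the region where the model can commit to a prediction.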
Pages: 303-316
Page count: 14