Check Your Confidence: Size Really Does Matter

Cited by: 12
Author
Carlson, Heather A. [1]
Affiliation
[1] Univ Michigan, Coll Pharm, Dept Med Chem, Ann Arbor, MI 48109 USA
Keywords
CSAR BENCHMARK EXERCISE;
DOI
10.1021/ci4004249
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Heather A. Carlson describes how many factors influence linear regression and highlights the limits they impose. A simple least-squares linear regression starts with a fit line that must pass through the point defined by the average calculated and average experimental values. The fit is then determined by finding the slope that minimizes the squared distances in the y direction between the data points and the fit line. A tighter correlation means better agreement between the data points and the fit line; the residuals are therefore smaller and more tightly distributed around zero. A tighter distribution means a smaller standard deviation of the residuals and a higher R. If uncertainty differs between data points, experimental error can be incorporated as error bars in the y direction, which simply calls for a weighted linear regression.
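The regression described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function name `fit_line`, the choice of calculated values on x and experimental values on y, the sample numbers, and the use of 1/sigma^2 as the weight for a point with y uncertainty sigma are all assumptions made for the example.

```python
import numpy as np

def fit_line(x, y, sigma=None):
    """Least-squares line y = slope * x + intercept.

    x, y  : paired values (assumed here: calculated on x, experimental on y).
    sigma : optional per-point uncertainty in y; if given, each point is
            weighted by 1 / sigma**2 (a common convention, assumed here).
    Returns slope, intercept, residuals in y, and the (weighted) Pearson R.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if sigma is None:
        w = np.ones_like(x)  # unweighted: every point counts equally
    else:
        w = 1.0 / np.asarray(sigma, dtype=float) ** 2

    # The fit line must pass through the (weighted) mean of x and y.
    x_mean = np.average(x, weights=w)
    y_mean = np.average(y, weights=w)
    dx, dy = x - x_mean, y - y_mean

    # Slope that minimizes the weighted squared distances in the y direction.
    slope = np.sum(w * dx * dy) / np.sum(w * dx ** 2)
    intercept = y_mean - slope * x_mean

    residuals = y - (slope * x + intercept)
    r = np.sum(w * dx * dy) / np.sqrt(np.sum(w * dx ** 2) * np.sum(w * dy ** 2))
    return slope, intercept, residuals, r

# Hypothetical data, for illustration only.
calc = [1.0, 2.0, 3.0, 4.0, 5.0]
expt = [1.1, 1.9, 3.2, 3.8, 5.0]
slope, intercept, residuals, r = fit_line(calc, expt)
print(slope, intercept, r, residuals.std(ddof=2))

# Same data with hypothetical error bars in y: a weighted regression.
print(fit_line(calc, expt, sigma=[0.1, 0.2, 0.1, 0.3, 0.2])[:2])
```

In the unweighted case the residuals sum to zero, so their spread around zero (the standard deviation printed above) is the quantity that shrinks as the correlation tightens and R rises.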
Pages: 1837-1841
Page count: 5