A Value-Based Approach for Training of Classifiers with High-Throughput Small Molecule Screening Data

Cited by: 1
Authors
Khuri, Natalia [1]
Parsons, Sarah [1]
Affiliations
[1] Wake Forest Univ, Winston Salem, NC 27101 USA
Source
12TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS (ACM-BCB 2021) | 2021
Keywords
DRUG; INHIBITORS; CLASSIFICATION; DISCOVERY
DOI
10.1145/3459930.3469514
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In many practical applications of machine learning, models are built using experimental data that are noisy, biased, and of low quality. Binary classifiers trained with such data perform poorly in independent and prospective tests. This work builds upon techniques for estimating the value of training data and evaluates a batch-based data valuation approach. Comparative experiments on seven challenging benchmarks demonstrate that value-based training of classifiers can improve classification performance in independent tests by 10% to 25%. Additionally, between 97% and 100% of class labels can be detected among low-valued training samples. Finally, the results show that simpler and faster learning methods, such as generalized linear models, perform as well as complex gradient boosting trees when the training data comprise only the high-valued samples extracted from high-throughput small molecule screens.
Pages: 10
Related papers
50 items in total
  • [21] A new scalable approach for missing value imputation in high-throughput microarray data on apache spark
    Gupta, Madhuri
    Gupta, Bharat
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2020, 23 (01) : 79 - 100
  • [22] Cell-Based Assays for High-Throughput Screening
    An, W. Frank
    Tolliday, Nicola
    MOLECULAR BIOTECHNOLOGY, 2010, 45 (02) : 180 - 186
  • [23] Inhibitors of Leishmania GDP-Mannose Pyrophosphorylase Identified by High-Throughput Screening of Small-Molecule Chemical Library
    Lackovic, Kurt
    Parisot, John P.
    Sleebs, Nerida
    Baell, Jonathan B.
    Debien, Laurent
    Watson, Keith G.
    Curtis, Joan M.
    Handman, Emanuela
    Street, Ian P.
    Kedzierski, Lukasz
    ANTIMICROBIAL AGENTS AND CHEMOTHERAPY, 2010, 54 (05) : 1712 - 1719
  • [24] Identification of small-molecule EGFR allosteric inhibitors by high-throughput docking
    Caporuscio, Fabiana
    Tinivella, Annachiara
    Restelli, Valentina
    Semrau, Marta S.
    Pinzi, Luca
    Storici, Paola
    Broggini, Massimo
    Rastelli, Giulio
    FUTURE MEDICINAL CHEMISTRY, 2018, 10 (13) : 1545 - 1553
  • [25] High-Throughput Screening Approach for Nanoporous Materials Genome Using Topological Data Analysis: Application to Zeolites
    Lee, Yongjin
    Barthel, Senja D.
    Dlotko, Pawel
    Moosavi, Seyed Mohamad
    Hess, Kathryn
    Smit, Berend
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2018, 14 (08) : 4427 - 4437
  • [26] Learning from the data: Mining of large high-throughput screening databases
    Yan, S. Frank
    King, Frederick J.
    He, Yun
    Caldwell, Jeremy S.
    Zhou, Yingyao
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2006, 46 (06) : 2381 - 2395
  • [27] Quantitative high-throughput screening data analysis: challenges and recent advances
    Shockley, Keith R.
    DRUG DISCOVERY TODAY, 2015, 20 (03) : 296 - 300
  • [28] A Novel Automated Framework for QSAR Modeling of Highly Imbalanced Leishmania High-Throughput Screening Data
    Casanova-Alvarez, Omar
    Morales-Helguera, Aliuska
    Angel Cabrera-Perez, Miguel
    Molina-Ruiz, Reinaldo
    Molina, Christophe
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2021, 61 (07) : 3213 - 3231
  • [29] New fluorescence-based high-throughput screening assay for small molecule inhibitors of tyrosyl-DNA phosphodiesterase 2 (TDP2)
    Ribeiro, Carlos J. A.
    Kankanala, Jayakanth
    Shi, Ke
    Kurahashi, Kayo
    Kiselev, Evgeny
    Ravji, Azhar
    Pommier, Yves
    Aihara, Hideki
    Wang, Zhengqiang
    EUROPEAN JOURNAL OF PHARMACEUTICAL SCIENCES, 2018, 118 : 67 - 79
  • [30] Added predictive value of high-throughput molecular data to clinical data and its validation
    Boulesteix, Anne-Laure
    Sauerbrei, Willi
    BRIEFINGS IN BIOINFORMATICS, 2011, 12 (03) : 215 - 229