Feasibility of Active Machine Learning for Multiclass Compound Classification

Cited by: 30
Authors
Lang, Tobias [1 ,2 ]
Flachsenberg, Florian [1 ]
von Luxburg, Ulrike [3 ]
Rarey, Matthias [1 ]
Affiliations
[1] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
[2] Univ Hamburg, Dept Comp Sci, Schluterstr 70, D-20146 Hamburg, Germany
[3] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Germany
Keywords
DISCOVERY; TOOL;
DOI
10.1021/acs.jcim.5b00332
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701 ;
Abstract
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
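The iterative idea described in the abstract can be illustrated with a generic pool-based active learning loop. The following is a minimal sketch only and is not the method proposed in the paper: it assumes a nearest-centroid classifier and a margin-based uncertainty criterion, neither of which is stated in the abstract, and all names (`centroids`, `margin`, `active_learn`, `make_toy_data`) are hypothetical. The `oracle` callback stands in for the human expert or experiment that supplies class labels.

```python
import random

def centroids(labeled):
    """Mean feature vector per class from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda y: sq_dist(model[y], x))

def margin(model, x):
    """Gap between the two nearest centroids; a small gap means uncertain."""
    d = sorted(sq_dist(c, x) for c in model.values())
    return d[1] - d[0] if len(d) > 1 else 0.0

def active_learn(pool, oracle, seed_idx, budget):
    """Pool-based loop: repeatedly query the label of the most uncertain item.

    Assumption: uncertainty is measured by the centroid margin above.
    """
    labeled = [(pool[i], oracle(i)) for i in seed_idx]
    unlabeled = set(range(len(pool))) - set(seed_idx)
    for _ in range(budget):
        if not unlabeled:
            break
        model = centroids(labeled)
        pick = min(sorted(unlabeled), key=lambda i: margin(model, pool[i]))
        unlabeled.remove(pick)
        labeled.append((pool[pick], oracle(pick)))  # ask the expert
    return centroids(labeled), labeled

def make_toy_data(n_per_class=30, seed=0):
    """Synthetic 2D stand-in for compound descriptors (illustration only)."""
    rng = random.Random(seed)
    centers = {"A": (0.0, 0.0), "B": (5.0, 0.0), "C": (0.0, 5.0)}
    pool, labels = [], []
    for name, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            pool.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            labels.append(name)
    return pool, labels

if __name__ == "__main__":
    pool, labels = make_toy_data()
    model, labeled = active_learn(pool, lambda i: labels[i], [0, 30, 60], budget=10)
    hits = sum(predict(model, x) == y for x, y in zip(pool, labels))
    print(f"labels requested: {len(labeled)}, pool accuracy: {hits / len(pool):.2f}")
```

In this sketch the loop starts from one labeled example per class and requests ten further labels, so only 13 of the 90 toy items are ever labeled. How much labeling effort is saved in practice depends on the classifier, the selection criterion, and the data set.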
Pages: 12-20
Number of pages: 9