Feasibility of Active Machine Learning for Multiclass Compound Classification

Cited by: 30
Authors
Lang, Tobias [1, 2]
Flachsenberg, Florian [1]
von Luxburg, Ulrike [3 ]
Rarey, Matthias [1 ]
Affiliations
[1] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
[2] Univ Hamburg, Dept Comp Sci, Schluterstr 70, D-20146 Hamburg, Germany
[3] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Germany
Keywords
DISCOVERY; TOOL;
DOI
10.1021/acs.jcim.5b00332
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
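The iterative procedure described in the abstract can be illustrated with a minimal pool-based active learning loop. This is only a sketch under stated assumptions, not the authors' method: the abstract names neither the classifier nor the selection criterion, so a nearest-centroid classifier and smallest-margin (uncertainty) sampling are used here as stand-ins. All function names, the synthetic 2D "compounds", and the `oracle` callback (which stands in for the human expert or experiment) are hypothetical.

```python
import math
import random


def centroid_model(labeled):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in labeled:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def class_distances(model, x):
    """Return (distance, class) pairs sorted from nearest to farthest centroid."""
    return sorted((math.dist(c, x), y) for y, c in model.items())


def predict(model, x):
    """Predict the class of the nearest centroid."""
    return class_distances(model, x)[0][1]


def margin(model, x):
    """Gap between the two nearest centroids; a small gap means high uncertainty."""
    d = class_distances(model, x)
    return d[1][0] - d[0][0] if len(d) > 1 else 0.0


def active_learning(pool, oracle, seed_indices, budget):
    """Query the oracle for the most uncertain item until `budget` labels exist."""
    labeled = {i: oracle(i) for i in seed_indices}
    while len(labeled) < min(budget, len(pool)):
        model = centroid_model([(pool[i], y) for i, y in labeled.items()])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        query = min(unlabeled, key=lambda i: margin(model, pool[i]))
        labeled[query] = oracle(query)  # assumed expert feedback step
    model = centroid_model([(pool[i], y) for i, y in labeled.items()])
    return model, labeled


# Synthetic stand-in data (an assumption): three well-separated 2D clusters.
rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
pool, hidden_labels = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(30):
        pool.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        hidden_labels.append(label)

# Start with one labeled item per class, then request labels up to a budget of 12.
model, labeled = active_learning(
    pool, oracle=lambda i: hidden_labels[i], seed_indices=[0, 30, 60], budget=12
)
correct = sum(predict(model, x) == y for x, y in zip(pool, hidden_labels))
accuracy = correct / len(pool)
print(f"labels requested: {len(labeled)}, pool accuracy: {accuracy:.2f}")
```

In this sketch the model is refit after every query, so each new label immediately influences which compound is requested next. A real application would replace the toy data with compound descriptors and the lambda with an actual annotation step.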
Pages: 12-20
Number of pages: 9