Classification of Imbalanced Bioassay Data with Features Learned Using Stacked Autoencoder

Authors
Shah, Jeni [1]
Joshi, Manjunath [1]
Affiliation
[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, India
Keywords
Stacked Autoencoder; SMOTE; Imbalanced data
DOI
10.1117/12.2679627
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bioassay data classification is an important task in drug discovery. However, the data used for classification are highly imbalanced, which leads to inaccurate classification of the minority class. We propose a novel classification approach in which we train separate models on different features derived by training stacked autoencoders (SAE). Experiments are performed on 7 bioassay datasets; each data file consists of feature descriptors for every compound along with a class label marking the compound as active or inactive. We first clean the data using the borderline synthetic minority oversampling technique (SMOTE) followed by removal of Tomek links, and then learn different features hierarchically from the cleaned data or feature vectors. We then train separate cost-sensitive feed-forward neural network (FNN) classifiers on the hierarchical features to obtain the final classification. To increase the True Positive Rate (TPR), a test sample is labeled as active if at least one classifier predicts it as active. We demonstrate that data cleaning and learning separate classifiers improve the TPR and F1 score compared with other machine learning approaches. To the best of our knowledge, SAE and FNN have not previously been used for classifying bioassay data.
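The decision rule described in the abstract (a sample is labeled active if at least one classifier predicts it as active) can be sketched as below. This is a minimal illustration and not the authors' implementation: the function names `or_vote` and `tpr_and_f1` and the toy predictions are hypothetical, and the resampling, SAE training, and FNN classifiers are omitted entirely.

```python
from typing import List, Sequence, Tuple


def or_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier labels (1 = active, 0 = inactive).

    A sample is labeled active if at least one classifier says active.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")
    return [int(any(p[i] == 1 for p in predictions)) for i in range(n_samples)]


def tpr_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, F1 score) for the active class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return tpr, f1


# Hypothetical toy labels: three classifiers, six test compounds.
y_true = [1, 0, 1, 0, 0, 1]
clf_a = [1, 0, 0, 0, 0, 0]
clf_b = [0, 0, 1, 0, 1, 0]
clf_c = [0, 0, 0, 0, 0, 0]

combined = or_vote([clf_a, clf_b, clf_c])
single_tpr, single_f1 = tpr_and_f1(y_true, clf_a)
combined_tpr, combined_f1 = tpr_and_f1(y_true, combined)
```

In this toy case the combined labels recover more actives than `clf_a` alone, at the cost of one extra false positive. That is the general trade-off of an "at least one" rule: it can only add positive predictions, so TPR never decreases while precision may.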
Pages: 7
Related papers
50 in total
  • [1] An Improved Stacked Autoencoder for Metabolomic Data Classification
    Fan, Xiaojing
    Wang, Xiye
    Jiang, Mingyang
    Pei, Zhili
    Qiao, Shicheng
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [2] Classification of data on stacked autoencoder using modified sigmoid activation function
    Kumar, Arvind
    Sodhi, Sartaj Singh
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (01) : 1 - 18
  • [3] Extract Features Using Stacked Denoised Autoencoder
    Gao, Yushu
    Zhu, Lin
    Zhu, Hao-Dong
    Gan, Yong
    Shang, Li
    INTELLIGENT COMPUTING IN BIOINFORMATICS, 2014, 8590 : 10 - 14
  • [4] Classification of Human Activity by Using a Stacked Autoencoder
    Badem, Hasan
    Caliskan, Abdullah
    Basturk, Alper
    Yuksel, Mehmet Emin
    2016 MEDICAL TECHNOLOGIES NATIONAL CONFERENCE (TIPTEKNO), 2015,
  • [5] Stacked Sparse Autoencoder in PolSAR Data Classification Using Local Spatial Information
    Zhang, Lu
    Ma, Wenping
    Zhang, Dan
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2016, 13 (09) : 1359 - 1363
  • [6] Hyperspectral image data classification with refined spectral spatial features based on stacked autoencoder approach
    Menezes J.
    Poojary N.
    Recent Patents on Engineering, 2021, 15 (02) : 140 - 149
  • [7] IMBALANCED DATA CLASSIFICATION BASED ON EXTREME LEARNING MACHINE AUTOENCODER
    Shen, Chu
    Zhang, Su-Fang
    Zhai, Jun-Hal
    Luo, Ding-Sheng
    Chen, Jun-Fen
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 2, 2018, : 399 - 404
  • [8] An Imbalanced Data Classification Algorithm of Improved Autoencoder Neural Network
    Zhang, Chenggang
    Song, Jiazhi
    Gao, Wei
    Jiang, Jinqing
    2016 EIGHTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2016, : 95 - 99
  • [9] ADA-INCVAE: Improved data generation using variational autoencoder for imbalanced classification
    Huang, Kai
    Wang, Xiaoguo
    APPLIED INTELLIGENCE, 2022, 52 (03) : 2838 - 2853
  • [10] Classification of Imbalanced Data Using SMOTE and AutoEncoder Based Deep Convolutional Neural Network
    Alex, Suja A.
    Nayahi, J. Jesu Vedha
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2023, 31 (03) : 437 - 469