Classification of Imbalanced Bioassay Data with Features Learned Using Stacked Autoencoder

Cited by: 0
Authors
Shah, Jeni [1]
Joshi, Manjunath [1]
Affiliation
[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, India
Keywords
Stacked Autoencoder; SMOTE; Imbalanced data
DOI
10.1117/12.2679627
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Bioassay data classification is an important task in drug discovery. However, the data are highly imbalanced, which leads to poor classification accuracy on the minority class. We propose a novel classification approach in which separate models are trained on different features derived by training stacked autoencoders (SAE). Experiments are performed on 7 bioassay datasets; each data file contains feature descriptors for every compound along with a class label marking the compound as active or inactive. We first clean the data using the borderline synthetic minority oversampling technique (SMOTE) followed by removal of Tomek links, and then learn different features hierarchically from the cleaned data or feature vectors. We then train separate cost-sensitive feed-forward neural network (FNN) classifiers on the hierarchical features to obtain the final classification. To increase the True Positive Rate (TPR), a test sample is labeled as active if at least one classifier predicts it as active. We demonstrate that data cleaning and learning separate classifiers improve the TPR and F1 score compared to other machine learning approaches. To the best of our knowledge, SAE and FNN have not previously been used for classifying bioassay data.
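The decision rule described in the abstract (a sample is labeled active if at least one classifier predicts it as active) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `or_rule` and `tpr_and_f1` are hypothetical, and the labels and per-classifier predictions are made-up toy values, not results from the paper. The sketch only shows how an OR combination can raise TPR relative to a single classifier, at the possible cost of extra false positives.

```python
from typing import List, Sequence, Tuple


def or_rule(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary predictions: active (1) if any classifier says active."""
    return [int(any(votes)) for votes in zip(*predictions)]


def tpr_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, F1 score) for the active class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return tpr, f1


# Toy data (assumed for illustration): 3 active and 5 inactive compounds.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical outputs of three classifiers, each trained on a different
# feature level; none of these values come from the paper.
clf_a = [1, 0, 0, 0, 0, 0, 0, 0]
clf_b = [0, 1, 0, 0, 1, 0, 0, 0]
clf_c = [0, 0, 0, 0, 0, 0, 0, 0]

combined = or_rule([clf_a, clf_b, clf_c])
single_tpr, single_f1 = tpr_and_f1(y_true, clf_a)
combined_tpr, combined_f1 = tpr_and_f1(y_true, combined)

print("single classifier: TPR=%.3f F1=%.3f" % (single_tpr, single_f1))
print("OR combination:    TPR=%.3f F1=%.3f" % (combined_tpr, combined_f1))
```

In this toy case the OR combination recovers two of the three active compounds where the first classifier alone recovers one, and it also inherits the one false positive from the second classifier. That trade-off is why the rule is typically paired with measures such as F1 rather than judged on TPR alone.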
Pages: 7
Related papers
50 in total
  • [21] Jointly Using Deep Model Learned Features and Traditional Visual Features in a Stacked SVM for Medical Subfigure Classification
    Wang, Hongyu
    Zhang, Jianpeng
    Xia, Yong
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING, ISCIDE 2017, 2017, 10559 : 191 - 199
  • [22] Plant classification based on Stacked Autoencoder
    Yang, Meng-Meng
    Nayeem, Arifur
    Shen, Ling-Ling
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 1082 - 1086
  • [23] MMD-encouraging convolutional autoencoder: a novel classification algorithm for imbalanced data
    Bin Li
    Xiaofeng Gong
    Chen Wang
    Ruijuan Wu
    Tong Bian
    Yanming Li
    Zhiyuan Wang
    Ruisen Luo
    Applied Intelligence, 2021, 51 : 7384 - 7401
  • [24] MMD-encouraging convolutional autoencoder: a novel classification algorithm for imbalanced data
    Li, Bin
    Gong, Xiaofeng
    Wang, Chen
    Wu, Ruijuan
    Bian, Tong
    Li, Yanming
    Wang, Zhiyuan
    Luo, Ruisen
    APPLIED INTELLIGENCE, 2021, 51 (10) : 7384 - 7401
  • [25] KDSAE: Chronic kidney disease classification with multimedia data learning using deep stacked autoencoder network
    Aditya Khamparia
    Gurinder Saini
    Babita Pandey
    Shrasti Tiwari
    Deepak Gupta
    Ashish Khanna
    Multimedia Tools and Applications, 2020, 79 : 35425 - 35440
  • [26] EEG Signal Clustering With Learned Features Using Deep Autoencoder
    Villazana, Sergio
    Seijas, Cesar
    Montilla, Guillermo
    Perez, Egilda
    INGENIERIA UC, 2021, 28 (01) : 180 - 192
  • [27] KDSAE: Chronic kidney disease classification with multimedia data learning using deep stacked autoencoder network
    Khamparia, Aditya
    Saini, Gurinder
    Pandey, Babita
    Tiwari, Shrasti
    Gupta, Deepak
    Khanna, Ashish
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (47-48) : 35425 - 35440
  • [28] Classification of Alzheimer's Disease Using Stacked Sparse Convolutional Autoencoder
    Baydargil, Husnu Baris
    Park, Jang-Sik
    Kang, Do-Young
    2019 19TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2019), 2019, : 891 - 895
  • [29] Lung Sounds Classification Using Stacked Autoencoder and Support Vector Machine
    Falah, Adnan Hassal
    Jondri
    2019 7TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICOICT), 2019, : 460 - 464
  • [30] Cervical cancer classification using sparse stacked autoencoder and fuzzy ARTMAP
    Liaw L.C.M.
    Tan S.C.
    Goh P.Y.
    Lim C.P.
    Neural Computing and Applications, 2024, 36 (22) : 13895 - 13913