Classification of toxicity effects of biotransformed hepatic drugs using whale optimized support vector machines

Cited by: 63
Authors
Tharwat, Alaa [1, 2]
Moemen, Yasmine S. [2, 3]
Hassanien, Aboul Ella [2, 4]
Affiliations
[1] Suez Canal Univ, Fac Engn, Ismailia, Egypt
[2] Sci Res Grp Egypt SRGE, Cairo, Egypt
[3] Menoufia Univ, Natl Liver Inst, Dept Clin Pathol, Menoufia, Egypt
[4] Cairo Univ, Fac Comp & Informat, Giza, Egypt
Keywords
Imbalanced dataset; Random sampling; Synthetic Minority Over-sampling Technique (SMOTE); Support Vector Machines (SVM); Whale Optimization Algorithm (WOA); Toxic effects; FEATURE-SELECTION; IMBALANCED DATA; ROUGH SETS; SYSTEM; SVM; PERFORMANCE; PREDICTION; PARAMETERS; DISCOVERY; BEHAVIOR
DOI
10.1016/j.jbi.2017.03.002
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Measuring toxicity is an important step in drug development. However, the experimental methods currently used to estimate drug toxicity are expensive and time-consuming, which makes them unsuitable for large-scale evaluation of drug toxicity in the early stage of drug development. Hence, there is a high demand for computational models that can predict drug toxicity risks. In this study, we used a dataset of 553 drugs that are biotransformed in the liver. Four toxic effects were calculated for this dataset: mutagenic, tumorigenic, irritant and reproductive effects. Each drug is represented by 31 chemical descriptors (features). The proposed model consists of three phases. In the first phase, the most discriminative subset of features is selected using rough set-based methods to reduce the classification time while improving the classification performance. In the second phase, different sampling methods, namely Random Under-Sampling, Random Over-Sampling, Synthetic Minority Oversampling Technique (SMOTE), BorderLine SMOTE and Safe Level SMOTE, are used to address the class imbalance in the dataset. In the third phase, the Support Vector Machines (SVM) classifier is used to classify an unknown drug as toxic or non-toxic. SVM parameters such as the penalty parameter and the kernel parameter have a great impact on the classification accuracy of the model. In this paper, the Whale Optimization Algorithm (WOA) is proposed to optimize the parameters of SVM so that the classification error can be reduced. The experimental results showed that the proposed model achieved high sensitivity for all toxic effects. Overall, the high sensitivity of the WOA + SVM model indicates that it could be used for the prediction of drug toxicity in the early stage of drug development. © 2017 Elsevier Inc. All rights reserved.
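The third phase of the abstract, tuning the SVM penalty and kernel parameters with WOA, can be illustrated with a minimal sketch of a basic whale optimizer. This is not the authors' implementation: the names `woa_minimize` and `toy_error`, the population size, the iteration count, the search ranges and the log10 encoding of the two parameters are all assumptions made for illustration. The objective here is a hypothetical smooth stand-in; in a setting like the paper's it would presumably be replaced by a cross-validated SVM classification error evaluated at the candidate parameter pair.

```python
import math
import random


def woa_minimize(objective, bounds, n_agents=20, n_iter=100, seed=0):
    """Minimize `objective` over a box with a basic Whale Optimization Algorithm."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_score = objective(best)

    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter  # decreases linearly from 2 toward 0
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best agent when |A| < 1, otherwise explore
                # around a randomly chosen agent.
                ref = best if abs(A) < 1.0 else agents[rng.randrange(n_agents)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral update around the best agent (shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new = [abs(r - v) * spiral + r for r, v in zip(best, x)]
            agents[i] = clip(new)
            score = objective(agents[i])
            if score < best_score:
                best, best_score = agents[i][:], score
    return best, best_score


def toy_error(params):
    # Hypothetical stand-in for a validation error over (log10 C, log10 gamma);
    # its minimum is placed at (1, -2) purely for illustration.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best, score = woa_minimize(toy_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print("best (log10 C, log10 gamma):", best, "error:", score)
```

Searching in log space is a common choice for SVM parameters because useful values often span several orders of magnitude; the abstract does not say which encoding or ranges the authors used.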
Pages: 132-149
Page count: 18