A big data approach with artificial neural network and molecular similarity for chemical data mining and endocrine disruption prediction

Cited by: 2
Authors
Paulose, Renjith [1]
Jegatheesan, Kalirajan [2]
Balakrishnan, Gopal Samy [3]
Affiliations
[1] Bharathiar Univ, Res & Dev Ctr, Coimbatore 641046, Tamil Nadu, India
[2] Thiagarajar Coll Autonomous, Ctr Res & PG Studies Bot & Biotechnol, Madurai, Tamil Nadu, India
[3] Liatris Biosci LLP, Dept Biotechnol, Kottayam, Kerala, India
Keywords
Artificial neural network; big data; chemical absorption, distribution, metabolism, and excretion-toxicity screening; endocrine receptor disruption; Hadoop; machine learning; drug discovery; fingerprints
DOI
10.4103/ijp.IJP_304_17
CLC number
R9 [Pharmacy];
Subject classification code
1007;
Abstract
CONTEXT: Chemical toxicity prediction in early-stage drug discovery has been studied for years, and new methods continue to be investigated. Research data covering the physicochemical properties, toxicity, assay results, and activity of chemicals have grown so large that they are becoming difficult to manage. Identifying a chemical with the desired features and biological activity among millions of candidates is a challenging task.
AIMS: In this study, we explore big data technologies and machine learning approaches for efficient chemical data mining, endocrine receptor disruption prediction, and virtual compound screening. We investigate the ability of an artificial neural network (ANN) to predict a chemical's activity toward the androgen receptor (AR) and estrogen receptor (ER), and thereby to classify it as a human endocrine disruptor or nondisruptor.
SUBJECTS AND METHODS: Molecules are collected along with their half-maximal inhibitory concentration (IC50) values toward AR and ER. Training and test datasets are created with active and inactive classes of molecules. Electrotopological State (E-State) molecular fingerprints are generated to describe every compound. The ANN machine learning model is built with Apache Spark and run in a Hadoop big data environment. Each test chemical's structural similarity to the active class of training compounds is estimated and combined with the ANN model to improve prediction accuracy.
RESULTS: Applied to their corresponding test datasets, the AR and ER predictive models achieved accuracies of 86.31% and 89.57%, respectively, in classifying molecules as disruptor or nondisruptor. Molecular fragments and functional groups are ranked by their importance in forming the ANN model and by their influence on AR and ER disruption behavior. For each test molecule, the training molecules most relevant to its endocrine disruption prediction are retrieved based on structural similarity values.
CONCLUSIONS: The current study demonstrates a new approach to predicting chemical endocrine receptor disruption that combines an ANN machine learning method with molecular similarity in a big data environment. This predictive modeling method can be tested further with more receptors and hormones to examine its predictive power.
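The abstract does not say which similarity measure is used or how the similarity value is combined with the ANN output. As a rough illustration only, the sketch below assumes binary fingerprints, the Tanimoto coefficient as the similarity measure, and a simple weighted average as the combination rule. The function names, the `weight` parameter, and the 0.5 decision threshold are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto similarity of two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)     # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bits set in either
    return both / either if either else 0.0


def rank_actives(query: Sequence[int],
                 actives: Sequence[Sequence[int]]) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs for active training compounds,
    most similar first. This mirrors retrieving the training molecules
    closest to a test molecule."""
    scored = [(i, tanimoto(query, fp)) for i, fp in enumerate(actives)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def combined_score(ann_probability: float, similarity: float,
                   weight: float = 0.5) -> float:
    """ASSUMED combination rule: weighted average of the ANN's predicted
    probability of activity and the best similarity to known actives."""
    return weight * ann_probability + (1.0 - weight) * similarity


def is_disruptor(ann_probability: float, query: Sequence[int],
                 actives: Sequence[Sequence[int]],
                 threshold: float = 0.5) -> bool:
    """Classify a test compound using the hypothetical combined score."""
    ranked = rank_actives(query, actives)
    best = ranked[0][1] if ranked else 0.0
    return combined_score(ann_probability, best) >= threshold
```

Here `ann_probability` stands in for the output of whatever trained classifier is available. E-State fingerprints can also be count-based, in which case a count-aware similarity would replace the bitwise version shown here.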
Pages: 169-176
Number of pages: 8