A big data approach with artificial neural network and molecular similarity for chemical data mining and endocrine disruption prediction

Cited by: 2
Authors
Paulose, Renjith [1]
Jegatheesan, Kalirajan [2]
Balakrishnan, Gopal Samy [3]
Affiliations
[1] Bharathiar Univ, Res & Dev Ctr, Coimbatore 641046, Tamil Nadu, India
[2] Thiagarajar Coll Autonomous, Ctr Res & PG Studies Bot & Biotechnol, Madurai, Tamil Nadu, India
[3] Liatris Biosci LLP, Dept Biotechnol, Kottayam, Kerala, India
Keywords
Artificial neural network; big data; chemical absorption, distribution, metabolism, and excretion-toxicity screening; endocrine receptor disruption; Hadoop; machine learning; DRUG DISCOVERY; FINGERPRINTS
DOI
10.4103/ijp.IJP_304_17
Chinese Library Classification
R9 [Pharmacy]
Discipline classification code
1007
Abstract
CONTEXT: Predicting chemical toxicity in the early stages of drug discovery has been researched for years, and new methods continue to be investigated. Research data covering physicochemical properties, toxicity, assays, and activity details form massive datasets that are becoming difficult to manage. Identifying a chemical with the desired features and biological activity among millions of candidates is a challenging task.
AIMS: This study explores big data technologies and machine learning approaches for efficient chemical data mining, endocrine receptor disruption prediction, and virtual compound screening. It investigates the power of an artificial neural network (ANN) to predict chemicals' activity toward the androgen receptor (AR) and estrogen receptor (ER), and thereby to classify them as human endocrine disruptors or nondisruptors.
SUBJECTS AND METHODS: Molecules are collected along with their half-maximal inhibitory concentration (IC50) values toward AR and ER. Training and test datasets are created with active and inactive classes of molecules. Electrotopological state (E-State) molecular fingerprints are generated to describe every compound. The ANN machine learning model is created using Apache Spark and implemented in a Hadoop big data environment. Each test chemical's structural similarity to the active class of training compounds is estimated and combined with the ANN model to improve prediction accuracy.
RESULTS: The AR and ER predictive models, applied to their corresponding test datasets, gave accuracies of 86.31% and 89.57%, respectively, in classifying molecules as disruptors or nondisruptors. Molecular fragments and functional groups are ranked by their importance in forming the ANN model and their influence on AR and ER disruption behavior. Training molecules specific to each test molecule's endocrine disruption prediction are retrieved based on structural similarity values.
CONCLUSIONS: The current study demonstrates a new approach to chemical endocrine receptor disruption prediction that combines the ANN machine learning method and molecular similarity in a big data environment. This predictive modeling method can be further tested with more receptors and hormones to examine its predictive power.
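The abstract's idea of combining a structural similarity estimate with an ANN output can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the fingerprints are treated as binary bit vectors, similarity is measured with the Tanimoto coefficient, and the hypothetical `combined_call` helper blends the two scores with an arbitrary weight and threshold. The abstract does not state which similarity measure or combination rule the authors used.

```python
from typing import List, Sequence, Tuple


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto similarity between two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def nearest_active(test_fp: Sequence[int],
                   active_fps: List[Sequence[int]]) -> Tuple[float, int]:
    """Return (highest similarity, index of the most similar active compound).

    The index is -1 when no active compound shares any bit with the test
    fingerprint.
    """
    best_sim, best_idx = 0.0, -1
    for i, fp in enumerate(active_fps):
        sim = tanimoto(test_fp, fp)
        if sim > best_sim:
            best_sim, best_idx = sim, i
    return best_sim, best_idx


def combined_call(ann_prob: float, similarity: float,
                  weight: float = 0.5, threshold: float = 0.5) -> bool:
    """Hypothetical blending rule (an assumption, not the paper's method).

    Takes a weighted average of the ANN's predicted probability of activity
    and the similarity to the nearest active training compound, then applies
    a threshold to label the compound as a disruptor (True) or not (False).
    """
    score = (1.0 - weight) * ann_prob + weight * similarity
    return score >= threshold
```

The index returned by `nearest_active` also gives one simple way to retrieve the training molecule most relevant to a given test prediction, in the spirit of the retrieval step mentioned under RESULTS.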
Pages: 169-176
Number of pages: 8
Related papers
50 in total
  • [1] Improving Tourist Arrival Prediction: A Big Data and Artificial Neural Network Approach
    Hoepken, Wolfram
    Eberle, Tobias
    Fuchs, Matthias
    Lexhagen, Maria
    JOURNAL OF TRAVEL RESEARCH, 2021, 60 (05) : 998 - 1017
  • [2] A Data Mining Approach for Prediction of Students' Depression Using Logistic Regression And Artificial Neural Network
    Mohd, Norhatta
    Yahya, Yasmin
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2018), 2018
  • [3] Similarity Measurement of Metadata of Geospatial Data: An Artificial Neural Network Approach
    Chen, Zugang
    Song, Jia
    Yang, Yaping
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2018, 7 (03)
  • [4] Artificial Neural Network for Incremental Data Mining
    Driff, Lydia Nahla
    Drias, Habiba
    RECENT ADVANCES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1, 2017, 569 : 133 - 143
  • [5] Neural network approach for data mining
    Rahman, SMM
    Yu, XH
    Martin, G
    PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998 : 851 - 854
  • [6] Artificial neural network based prediction of malaria abundances using big data: A knowledge capturing approach
    Santosh, Thakur
    Ramesh, Dharavath
    CLINICAL EPIDEMIOLOGY AND GLOBAL HEALTH, 2019, 7 (01) : 121 - 126
  • [7] Healthcare Big Data Analysis with Artificial Neural Network for Cardiac Disease Prediction
    Mohapatra, Sulagna
    Sahoo, Prasan Kumar
    Mohapatra, Suvendu Kumar
    ELECTRONICS, 2024, 13 (01)
  • [8] Hydrological big data prediction based on similarity search and improved BP neural network
    Wan, Dingsheng
    Xiao, Yan
    Zhang, Pengcheng
    Leung, Hareton
    2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015 : 343 - 350
  • [9] Stock Market Prediction with Big Data Through Hybridization of Data Mining and Optimized Neural Network Techniques
    Das, Debashish
    Sadiq, Ali Safa
    Ahmad, Noraziah Binti
    Lloret, Jaime
    JOURNAL OF MULTIPLE-VALUED LOGIC AND SOFT COMPUTING, 2017, 29 (1-2) : 157 - 181
  • [10] Early Prediction of Student Success Based on Data Mining and Artificial Neural Network
    Bursac, Marko
    Blagojevic, Marija
    Milosevic, Danijela
    HUMAN CENTERED COMPUTING, 2019, 11956 : 26 - 31