Protein-Protein Interaction Network Extraction Using Text Mining Methods Adds Insight into Autism Spectrum Disorder

Cited by: 1
Authors
Nezamuldeen, Leena [1, 2]
Jafri, Mohsin Saleet [1, 3]
Affiliations
[1] George Mason Univ, Sch Syst Biol, Fairfax, VA 22030 USA
[2] King Abdulaziz Univ, King Fahd Med Res Ctr, Jeddah 21589, Saudi Arabia
[3] Univ Maryland, Ctr Biomed Engn & Technol, Sch Med, Baltimore, MD 21201 USA
Source
BIOLOGY-BASEL | 2023 / Vol. 12 / Issue 10
Keywords
artificial intelligence; PPI; protein-protein interaction; text mining; BiLSTM; recurrent neural network; FILAMIN; GENE; PHOSPHORYLATION; ACTIVATION; COMPLEX; SITE; RSK
DOI
10.3390/biology12101344
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Simple Summary: Research on proteins and their interactions with other proteins yields many new findings that help explain how diseases emerge. However, manual curation of the scientific literature delays new discoveries in the field. Artificial intelligence and deep learning techniques have played a significant part in extracting information from text. In this study, we used text mining and artificial intelligence techniques to extract protein-protein interaction networks from the vast scientific research literature. We created an automated system consisting of three models built with deep learning and natural language processing (NLP) methods. Our first model, which employs recurrent neural networks using sentiment analysis, achieved an accuracy of 95%. Our second model, which employs the named entity recognition technique in NLP, achieved an accuracy of 98%. Compared with the protein interaction network that we obtained by manual curation of more than 30 articles on Autism Spectrum Disorder, the automated system, tested on 6027 abstracts, successfully developed the network of interactions and provided an improved view. Discovering these networks will help physicians and scientists understand how these molecules interact, for physiological, pharmacological, and pathological insight.

Abstract: Text mining methods are being developed to assimilate the continually expanding volume of biomedical text. Understanding protein-protein interaction (PPI) deficits would help explain the genesis of diseases. In this study, we designed an automated system to extract PPIs from the biomedical literature. It combines a deep learning sentence classification model (a pretrained word embedding and a BiLSTM recurrent neural network with additional layers), a conditional random field (CRF) named entity recognition (NER) model, and a shortest-dependency path (SDP) model using the spaCy library in Python. The system targets sentences that describe PPIs, not sentences that merely mention these proteins in the context of disease discovery or other topics. Our first model achieved 13% greater precision on the AIMed/BioInfer benchmark corpus than the previous state-of-the-art BiLSTM neural network models. The NER model presented in this study achieved 98% precision on the AIMed/BioInfer corpus, exceeding previous models. To produce an accurate representation of the PPI network, the processes were developed to systematically map the protein interactions in the texts. Overall, evaluating our system on 6027 abstracts pertaining to seven proteins associated with Autism Spectrum Disorder completed the manually curated PPI network for these proteins. For complicated diseases, these networks would assist in understanding how PPI deficits contribute to disease development, while also emphasizing the influence of interactions on protein function and biological processes.
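
The shortest-dependency path idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `shortest_dependency_path`, the example sentence, and the hand-written dependency edges are all assumptions made for illustration. In a real pipeline the edges would come from a dependency parser such as spaCy, and tokens would be keyed by position rather than by their text.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over an undirected dependency graph.

    edges: iterable of (head, child) token pairs (hypothetical input format).
    Returns the list of tokens from source to target, or None if unreachable.
    """
    graph = {}
    for head, child in edges:
        graph.setdefault(head, set()).add(child)
        graph.setdefault(child, set()).add(head)
    if source not in graph or target not in graph:
        return None

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        # sorted() only makes the traversal order deterministic
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hand-written (head, child) edges for the illustrative sentence
# "RSK phosphorylates Raptor in stimulated cells" (an assumption, not parser output).
EDGES = [
    ("phosphorylates", "RSK"),
    ("phosphorylates", "Raptor"),
    ("phosphorylates", "in"),
    ("in", "cells"),
    ("cells", "stimulated"),
]

if __name__ == "__main__":
    print(shortest_dependency_path(EDGES, "RSK", "Raptor"))
```

For the two protein mentions in this toy sentence, the path passes through the verb that links them, which is the kind of interaction word an SDP-based step is meant to surface.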
Pages: 20
Related papers
65 in total
  • [11] Activation and Function of the MAPKs and Their Substrates, the MAPK-Activated Protein Kinases
    Cargnello, Marie
    Roux, Philippe P.
    [J]. MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, 2011, 75 (01) : 50 - 83
  • [12] Oncogenic MAPK signaling stimulates mTORC1 activity by promoting RSK-mediated Raptor phosphorylation
    Carriere, Audrey
    Cargnello, Marie
    Julien, Louis-Andre
    Gao, Huanhuan
    Bonneil, Eric
    Thibault, Pierre
    Roux, Philippe P.
    [J]. CURRENT BIOLOGY, 2008, 18 (17) : 1269 - 1277
  • [13] Pathway Commons, a web resource for biological pathway data
    Cerami, Ethan G.
    Gross, Benjamin E.
    Demir, Emek
    Rodchenkov, Igor
    Babur, Oezguen
    Anwar, Nadia
    Schultz, Nikolaus
    Bader, Gary D.
    Sander, Chris
    [J]. NUCLEIC ACIDS RESEARCH, 2011, 39 : D685 - D690
  • [14] Small RNA-induced INTS6 gene up-regulation suppresses castration-resistant prostate cancer cells by regulating β-catenin signaling
    Chen, Hong
    Shen, Hai-Xiang
    Lin, Yi-Wei
    Mao, Ye-Qing
    Liu, Ben
    Xie, Li-Ping
    [J]. CELL CYCLE, 2018, 17 (13) : 1602 - 1613
  • [15] Chiu B., 2016, P 15 WORKSHOP BIOMED, P1, DOI 10.18653/v1/W16-2922
  • [17] Clarke Lorne A., 2008, Expert Reviews in Molecular Medicine, V10, P1, DOI 10.1017/S1462399408000550
  • [18] The BioPAX community standard for pathway data sharing
    Demir, Emek
    Cary, Michael P.
    Paley, Suzanne
    Fukuda, Ken
    Lemer, Christian
    Vastrik, Imre
    Wu, Guanming
    D'Eustachio, Peter
    Schaefer, Carl
    Luciano, Joanne
    Schacherer, Frank
    Martinez-Flores, Irma
    Hu, Zhenjun
    Jimenez-Jacinto, Veronica
    Joshi-Tope, Geeta
    Kandasamy, Kumaran
    Lopez-Fuentes, Alejandra C.
    Mi, Huaiyu
    Pichler, Elgar
    Rodchenkov, Igor
    Splendiani, Andrea
    Tkachev, Sasha
    Zucker, Jeremy
    Gopinath, Gopal
    Rajasimha, Harsha
    Ramakrishnan, Ranjani
    Shah, Imran
    Syed, Mustafa
    Anwar, Nadia
    Babur, Oezguen
    Blinov, Michael
    Brauner, Erik
    Corwin, Dan
    Donaldson, Sylva
    Gibbons, Frank
    Goldberg, Robert
    Hornbeck, Peter
    Luna, Augustin
    Murray-Rust, Peter
    Neumann, Eric
    Reubenacker, Oliver
    Samwald, Matthias
    van Iersel, Martijn
    Wimalaratne, Sarala
    Allen, Keith
    Braun, Burk
    Whirl-Carrillo, Michelle
    Cheung, Kei-Hoi
    Dahlquist, Kam
    Finney, Andrew
    [J]. NATURE BIOTECHNOLOGY, 2010, 28 (09) : 935 - 942
  • [19] Decreased levels of serum fibroblast growth factor-2 in children with autism spectrum disorder
    Esnafoglu, Erman
    Ayyildiz, Sema Nur
    [J]. PSYCHIATRY RESEARCH, 2017, 257 : 79 - 83
  • [20] Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning
    Feng, Yuntian
    Zhang, Hongjun
    Hao, Wenning
    Chen, Gang
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2017, 2017