Triage of documents containing protein interactions affected by mutations using an NLP based machine learning approach

Cited by: 5
Authors
Qu, Jinchan [1]
Steppi, Albert [2]
Zhong, Dongrui [1]
Hao, Jie [1]
Wang, Jian [3]
Lung, Pei-Yau [4]
Zhao, Tingting [5]
He, Zhe [6]
Zhang, Jinfeng [1]
Affiliations
[1] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
[2] Harvard Med Sch, Lab Syst Pharmacol, Boston, MA 02115 USA
[3] CloudMedx, Palo Alto, CA 94301 USA
[4] Verisk Insurance Solut, Middletown, CT 06457 USA
[5] Florida State Univ, Dept Geog, Tallahassee, FL 32306 USA
[6] Florida State Univ, Coll Commun & Informat, Tallahassee, FL 32306 USA
Keywords
Protein-protein interactions; Mutations; Text mining; Biomedical literature retrieval; Protein interactions affected by mutations; INTERACTION EXTRACTION; EXPRESSION; DRUG; TOOL
DOI
10.1186/s12864-020-07185-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Information on protein-protein interactions affected by mutations is very useful for understanding the biological effect of mutations and for developing treatments targeting the interactions. In this study, we developed a natural language processing (NLP) based machine learning approach for extracting such information from literature. Our aim is to identify journal abstracts or paragraphs in full-text articles that contain at least one occurrence of a protein-protein interaction (PPI) affected by a mutation.

Results: Our system makes use of the latest NLP methods with a large number of engineered features, including some based on pre-trained word embeddings. Our final model achieved satisfactory performance in the Document Triage Task of the BioCreative VI Precision Medicine Track, with the highest recall and a comparable F1-score.

Conclusions: The performance of our method indicates that it is ideally suited for being combined with manual annotations. Our machine learning framework and engineered features will also be very helpful for other researchers to further improve this and other related biological text mining tasks using either traditional machine learning or deep learning based methods.
Pages: 10
Journal
BMC Genomics, 21