Large-scale automated machine reading discovers new cancer-driving mechanisms

Cited by: 44
Authors
Valenzuela-Escarcega, Marco A. [1]
Babur, Ozgun [2]
Hahn-Powell, Gus [3]
Bell, Dane [3]
Hicks, Thomas [1]
Noriega-Atala, Enrique [4]
Wang, Xia [5]
Surdeanu, Mihai [1]
Demir, Emek [2]
Morrison, Clayton T. [4]
Affiliations
[1] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA
[2] Oregon Hlth & Sci Univ, Sch Med, Portland, OR 97201 USA
[3] Univ Arizona, Dept Linguist, Tucson, AZ 85721 USA
[4] Univ Arizona, Sch Informat, Tucson, AZ USA
[5] Univ Arizona, Dept Mol & Cellular Biol, Tucson, AZ 85721 USA
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2018
Keywords
EVENT EXTRACTION; PATHWAY DATA;
DOI
10.1093/database/bay098
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated 'big mechanisms' with extracted 'big data' can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.
Pages: 14