iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature

Cited by: 14

Authors:

Ren, Jia [1]

Li, Gang [2]

Ross, Karen [3]

Arighi, Cecilia [1,2]

McGarvey, Peter [3,4]

Rao, Shruti [4]

Cowart, Julie [1]

Madhavan, Subha [4,5]

Vijay-Shanker, K. [2]

Wu, Cathy H. [1,2,3]

Affiliations:

[1] Univ Delaware, Ctr Bioinformat & Computat Biol, Newark, DE 19711 USA

[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA

[3] Georgetown Univ, Med Ctr, Prot Informat Resource, Washington, DC 20007 USA

[4] Georgetown Univ, Innovat Ctr Biomed Informat, Washington, DC 20007 USA

[5] Georgetown Univ, Med Ctr, Lombardi Comprehens Canc Ctr, Washington, DC 20057 USA

Source:

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2018

Funding:

US National Institutes of Health (NIH);

Keywords:

BINDING-PROTEIN; 1; MULTIDRUG-RESISTANCE; SATB1; PHOSPHORYLATION; INVASION;

DOI:

10.1093/database/bay128

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Numerous efforts have been made for developing text-mining tools to extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools are usually different from each other in terms of programming language, system dependency and input/output format. There are few previous works that concern the integration of different text-mining tools and their results from large-scale text processing. In this paper, we describe the iTextMine system with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools with a standardized JSON output format and implement a text alignment algorithm to solve the text discrepancy for result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all the Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and view integrated results for knowledge discovery through a network view. We demonstrate the utilities of iTextMine with two use cases involving the gene PTEN and breast cancer and the gene SATB1.
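The abstract mentions two integration steps: a standardized JSON output format and a text alignment step that reconciles text discrepancies between tools. The sketch below illustrates that general idea only; it is not the iTextMine implementation. It assumes each tool reports character-offset spans against its own slightly different copy of the text, and it substitutes Python's standard `difflib.SequenceMatcher` for the alignment algorithm described in the paper. The function names, the tool names and the JSON field names (`tool`, `label`, `start`, `end`, `text`) are all hypothetical.

```python
import json
from difflib import SequenceMatcher


def build_offset_map(tool_text, ref_text):
    """Map each character index of tool_text to an index in ref_text (or None)."""
    mapping = [None] * len(tool_text)
    # autojunk=False disables difflib's popularity heuristic on long texts.
    matcher = SequenceMatcher(None, tool_text, ref_text, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def realign_span(start, end, mapping):
    """Project a [start, end) span onto the reference text; None if nothing aligns."""
    aligned = [mapping[i] for i in range(start, end) if mapping[i] is not None]
    if not aligned:
        return None
    return aligned[0], aligned[-1] + 1


def integrate(ref_text, tool_outputs):
    """Merge per-tool spans into one record keyed to ref_text offsets.

    tool_outputs: {tool_name: (tool_text, [(start, end, label), ...])}
    (a hypothetical input shape chosen for this sketch).
    """
    merged = []
    for tool_name, (tool_text, spans) in tool_outputs.items():
        mapping = build_offset_map(tool_text, ref_text)
        for start, end, label in spans:
            span = realign_span(start, end, mapping)
            if span is None:
                continue  # the span has no counterpart in the reference text
            merged.append({
                "tool": tool_name,
                "label": label,
                "start": span[0],
                "end": span[1],
                "text": ref_text[span[0]:span[1]],
            })
    merged.sort(key=lambda a: (a["start"], a["tool"]))
    return {"text": ref_text, "annotations": merged}


# Usage: two hypothetical tools, one of which re-tokenized the sentence.
reference = "PTEN is phosphorylated by CK2."
variant = "PTEN  is phosphorylated by CK2 ."
outputs = {
    "tool_a": (reference, [(0, 4, "Gene")]),
    "tool_b": (variant, [(variant.index("CK2"), variant.index("CK2") + 3, "Kinase")]),
}
print(json.dumps(integrate(reference, outputs), indent=2))
```

Keeping every annotation in the coordinates of a single reference text is what lets results from independent tools be shown side by side on the same sentence.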


Pages: 10
