A text mining approach to detect mentions of protein glycosylation in biomedical text

Cited by: 0

Authors:

Shukla, Daksha [1]

Jayaraman, Valadi K. [2]

Affiliations:

[1] Univ Pune, Bioinformat Ctr, Pune, Maharashtra, India

[2] Univ Pune, Ctr Dev Adv Comp, Pune, Maharashtra, India

Source:

BIOINFORMATION | 2012 / Vol. 8 / Issue 16

Keywords:

Text mining; Glycosylation; Rule-based approach; Dictionary-based approach

DOI:

10.6026/97320630008758

Chinese Library Classification:

Q [Biological Sciences];

Subject classification codes:

07 ; 0710 ; 09 ;

Abstract:

Protein glycosylation is an important post-translational event that plays a pivotal role in protein folding and protein trafficking. We describe a dictionary-based and a rule-based approach to mine 'mentions' of protein glycosylation in text. The dictionary-based approach relies on a set of manually curated dictionaries constructed specifically for this task. Abstracts are screened for 'mentions' of words from these dictionaries; the mentions are then scored, and each abstract is classified on the basis of a threshold. The rule-based approach also relies on the words in the dictionaries to derive the features used for classification. The performance of the system under both approaches was evaluated on a manually curated corpus of 3133 abstracts. The evaluation suggests that the rule-based approach outperforms the dictionary-based approach.
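
The dictionary-based screening described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the term list, the weights, the threshold value, and the function names are hypothetical and are not taken from the authors' curated dictionaries or scoring scheme.

```python
import re

# Hypothetical dictionary mapping terms to weights (illustrative only;
# the paper's manually curated dictionaries are not reproduced here).
GLYCO_TERMS = {
    "glycosylation": 2.0,
    "glycosylated": 2.0,
    "n-linked": 1.5,
    "o-linked": 1.5,
    "glycoprotein": 1.0,
    "glycan": 1.0,
}

# Assumed cut-off; the abstract does not state the threshold used.
THRESHOLD = 2.0


def score_abstract(text, terms=GLYCO_TERMS):
    """Sum the weights of dictionary terms mentioned in the text."""
    # Lowercase and keep hyphenated tokens such as "n-linked" intact.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return sum(terms.get(tok, 0.0) for tok in tokens)


def mentions_glycosylation(text, threshold=THRESHOLD):
    """Classify an abstract as positive when its score reaches the threshold."""
    return score_abstract(text) >= threshold
```

Under these assumptions, an abstract containing "N-linked glycosylated" scores 3.5 and is classified as a positive, while an abstract with no dictionary terms scores 0.0 and is rejected.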


Pages: 758-762

Page count: 5
