A method for extracting task-oriented information from biological text sources

Cited by: 1
Authors
Kuttiyapillai, Dhanasekaran [1]
Rajeswari, R. [2]
Affiliations
[1] Info Inst Engn, Dept Comp Sci & Engn, Coimbatore 641107, Tamil Nadu, India
[2] Govt Coll Technol, Dept Elect & Elect Engn, Coimbatore 641013, Tamil Nadu, India
Keywords
dimensionality reduction; knowledge discovery; machine learning; semantic relevance; information extraction; natural language; disease prevention; gene prediction; biological text; dynamic programming
DOI
10.1504/IJDMB.2015.070072
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
This paper introduces an information extraction method that processes unstructured data from a document collection. A dynamic programming technique, adopted to find relevant genes from the longest and most accurate sequences, is used to find matching sequences and to identify the effects of various factors. The proposed method can handle complex information sequences whose meaning differs by context, and it eliminates irrelevant information. The text contents were pre-processed using a general-purpose method and then passed through an entity tagging component. Bottom-up scanning of key-value pairs improves content finding and generates sequences relevant to the testing task. The paper highlights a context-based method for extracting food safety information identified from articles, guideline documents and laboratory results. The graphical disease model verifies weak components using the development data set. This improves the accuracy of information retrieval in biological text analysis and reporting applications.
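The abstract does not say which dynamic programming formulation the authors use. As an illustration only, the sketch below assumes a standard longest common subsequence (LCS) recurrence over token sequences, which is one common way to find the longest matching sequence between two texts. The function name `longest_common_subsequence` and the sample tokens are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def longest_common_subsequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return one longest common subsequence of a and b.

    Illustrative assumption: the paper's exact recurrence is not given
    in the abstract, so this uses the textbook LCS table.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] holds the LCS length of a[:i] and b[:j]
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover the matched items
    matched: List[str] = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            matched.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    matched.reverse()
    return matched


# Hypothetical token sequences from two food safety sentences
first = ["salmonella", "causes", "food", "poisoning"]
second = ["salmonella", "often", "causes", "severe", "poisoning"]
shared = longest_common_subsequence(first, second)
```

The same function works on character sequences (for example, nucleotide strings) because it only compares items for equality. Its time and memory cost are both proportional to the product of the two sequence lengths.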
Pages: 387-399
Page count: 13