A method for extracting task-oriented information from biological text sources

Times cited: 1
Authors
Kuttiyapillai, Dhanasekaran [1]
Rajeswari, R. [2]
Affiliations
[1] Info Inst Engn, Dept Comp Sci & Engn, Coimbatore 641107, Tamil Nadu, India
[2] Govt Coll Technol, Dept Elect & Elect Engn, Coimbatore 641013, Tamil Nadu, India
Keywords
dimensionality reduction; knowledge discovery; machine learning; semantic relevance; information extraction; natural language; disease prevention; gene prediction; biological text; dynamic programming
DOI
10.1504/IJDMB.2015.070072
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
This paper introduces an information extraction method that processes unstructured data from a document collection. A dynamic programming technique, adopted to find relevant genes from the longest and most accurate sequences, is used to find matching sequences and to identify the effects of various factors. The proposed method can handle complex information sequences whose meaning differs from one situation to another, and it eliminates irrelevant information. The text content was pre-processed with a general-purpose method and then passed through an entity tagging component. Bottom-up scanning of key-value pairs improves content finding and generates sequences relevant to the testing task. The paper highlights a context-based method for extracting food safety information identified from articles, guideline documents and laboratory results. The graphical disease model verifies the weak component using a development data set. This improves the accuracy of information retrieval in biological text analysis and reporting applications.
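The abstract does not say which dynamic programming recurrence the authors use to find the longest matching sequences. As an illustration only, the sketch below assumes the classic longest common substring formulation; the function name `longest_common_substring` and the sample inputs are hypothetical and are not taken from the paper.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of characters shared by a and b.

    Illustrative assumption: this is the textbook DP recurrence, not
    necessarily the technique used in the paper.
    """
    # prev[j] = length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside a

    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr

    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(longest_common_substring("ACGTTGCA", "TTGTTGCC"))
```

With these inputs the function returns `GTTGC`. Keeping only two rows of the table holds memory at O(len(b)) while the running time stays O(len(a) * len(b)); on ties the earliest match in `a` is kept.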
Pages: 387-399
Number of pages: 13