PubMiner: Machine learning-based text mining system for biomedical information mining

Cited by: 0
Authors
Eom, JH [1]
Zhang, BT [1]
Affiliation
[1] Seoul Natl Univ, Sch Engn & Comp Sci, Biointelligence Lab, Seoul 151744, South Korea
Source
ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, PROCEEDINGS | 2004 / Vol. 3192
Keywords
natural language processing; data mining; machine learning; bioinformatics; software application
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
PubMiner, an intelligent machine learning-based text mining system for mining biological information from the literature, is introduced. PubMiner utilizes natural language processing and machine learning-based data mining techniques to mine useful biological information, such as protein-protein interactions, from massive literature data. The system recognizes biological terms such as genes, proteins, and enzymes and extracts the interactions among them described in a document through natural language analysis. The extracted interactions are further analyzed with a set of features for each entity, constructed from related public databases, to infer additional interactions from the original ones. Both the inferred interactions and the native interactions are provided to the user with links to their literature sources. System performance was evaluated with protein interaction data for S. cerevisiae (baker's yeast) from MIPS and SGD.
Pages: 216-225
Page count: 10