Discovery of protein-protein interactions using a combination of linguistic, statistical and graphical information

Cited by: 16
Authors
Cooper, JW
Kershenbaum, A
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Text Analyt, Yorktown Hts, NY 10598 USA
[2] IBM Corp, Thomas J Watson Res Ctr, Bioinformat, Yorktown Hts, NY 10598 USA
Keywords
Protein Interaction; Noun Phrase; Database Table; Sentence Boundary; Medline Abstract
DOI
10.1186/1471-2105-6-143
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: The rapid publication of important research in the biomedical literature makes it increasingly difficult for researchers to keep current with significant work in their area of interest. Results: This paper reports a scalable method for the discovery of protein-protein interactions in Medline abstracts, using a combination of text analytics, statistical and graphical analysis, and a set of easily implemented rules. Applied to 12,300 abstracts, these techniques gave a precision of 0.61 and a recall of 0.97 (f = 0.74); when two-hop and three-hop relations discovered by graphical analysis were allowed, the precision was 0.74 (f = 0.83). Conclusion: This combination of linguistic and statistical approaches appears to provide the highest precision and recall thus far reported in detecting protein-protein relations using text analytic approaches.
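The f values quoted in the abstract can be checked roughly against the reported precision and recall. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not name explicitly, and assumes that recall stays at 0.97 for the multi-hop figure, which the abstract does not state. The function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the abstract's "f" is this balanced F-measure.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Direct relations: precision 0.61, recall 0.97 (values from the abstract).
f_direct = f_measure(0.61, 0.97)

# Two-hop and three-hop relations: precision 0.74 (from the abstract);
# recall of 0.97 is an assumption, since the abstract does not restate it.
f_multi_hop = f_measure(0.74, 0.97)

print(f"direct:    {f_direct:.3f}")
print(f"multi-hop: {f_multi_hop:.3f}")
```

Computed from the rounded inputs, the results come out near 0.75 and 0.84, within about 0.01 of the reported 0.74 and 0.83. The small gap would be consistent with the published figures being derived from unrounded precision and recall, though the record does not confirm this.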
Pages: 8