Knowledge discovery by automated identification and ranking of implicit relationships

Cited by: 141
Authors
Wren, JD
Bekeredjian, R
Stewart, JA
Shohet, RV
Garner, HR
Affiliations
[1] Univ Oklahoma, Dept Bot & Microbiol, Adv Ctr Genome Technol, Norman, OK 73019 USA
[2] Univ Texas, SW Med Ctr, Dept Internal Med, Dallas, TX 75390 USA
[3] Univ Texas, SW Med Ctr, Div Cardiol, Dallas, TX 75390 USA
[4] Univ Texas, SW Med Ctr, McDermott Ctr Human Growth & Dev, Dept Biochem, Ctr Biomed Invent, Dallas, TX 75390 USA
DOI
10.1093/bioinformatics/btg421
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: New relationships are often implicit from existing information, but the amount and growth of published literature limits the scope of analysis an individual can accomplish. Our goal was to develop and test a computational method to identify relationships within scientific reports, such that large sets of relationships between unrelated items could be sought out and statistically ranked for their potential relevance as a set.
Results: We first construct a network of tentative relationships between 'objects' of biomedical research interest (e.g. genes, diseases, phenotypes, chemicals) by identifying their co-occurrences within all electronically available MEDLINE records. Relationships shared by two unrelated objects are then ranked against a random network model to estimate the statistical significance of any given grouping. When compared against known relationships, we find that this ranking correlates with both the probability and frequency of object co-occurrence, demonstrating the method is well suited to discover novel relationships based upon existing shared relationships. To test this, we identified compounds whose shared relationships predicted they might affect the development and/or progression of cardiac hypertrophy. When laboratory tests were performed in a rodent model, chlorpromazine was found to reduce the progression of cardiac hypertrophy.
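The general idea in the abstract (link objects that co-occur in a record, then rank pairs with no direct link by how many intermediates they share relative to chance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy records, and the observed/expected score are all assumptions made for illustration, and the score is only a crude stand-in for the random network model the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def build_network(records):
    """Map each object to the set of objects it co-occurs with in any record."""
    neighbors = defaultdict(set)
    for objects in records:
        for a, b in combinations(sorted(set(objects)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def rank_implicit(neighbors, source):
    """Rank objects with no direct link to `source` by shared intermediates.

    The score is observed shared neighbors divided by the overlap expected if
    neighbors were drawn uniformly at random (a toy null model, assumed here).
    """
    n = len(neighbors)
    source_nb = neighbors[source]
    results = []
    for target, target_nb in neighbors.items():
        if target == source or target in source_nb:
            continue  # skip the source itself and already-known relationships
        shared = source_nb & target_nb
        if not shared:
            continue
        # observed / expected, where expected = deg(source) * deg(target) / n
        score = len(shared) * n / (len(source_nb) * len(target_nb))
        results.append((target, len(shared), score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical records: each inner list holds the objects named in one record.
records = [
    ["disease_X", "gene_1"],
    ["disease_X", "gene_2"],
    ["compound_A", "gene_1"],
    ["compound_A", "gene_2"],
    ["compound_B", "gene_2"],
    ["compound_B", "gene_3"],
]

ranked = rank_implicit(build_network(records), "disease_X")
for target, shared_count, score in ranked:
    print(target, shared_count, round(score, 2))
```

In this toy data, compound_A never appears in a record with disease_X but shares two intermediates with it, so it ranks above compound_B, which shares only one. A realistic version would also need object recognition in free text and a properly calibrated null model, neither of which is attempted here.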
Pages: 389-398
Page count: 10