Quantitative biomedical annotation using medical subject heading over-representation profiles (MeSHOPs)

Cited by: 19
Authors
Cheung, Warren A. [1 ,2 ]
Ouellette, B. F. Francis [3 ,4 ]
Wasserman, Wyeth W. [1 ]
Affiliations
[1] Univ British Columbia, Dept Med Genet, Ctr Mol Med & Therapeut, Child & Family Res Inst, Vancouver, BC, Canada
[2] Univ British Columbia, Bioinformat Grad Program, Vancouver, BC V5Z 1M9, Canada
[3] Ontario Inst Canc Res, Toronto, ON, Canada
[4] Univ Toronto, Dept Cell & Syst Biol, Toronto, ON, Canada
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Funding
Canadian Institutes of Health Research;
Keywords
GENOMIC DNA; GENE; DISPLAY; TOOLS; OMICS;
DOI
10.1186/1471-2105-13-249
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: MEDLINE®/PubMed® indexes over 20 million biomedical articles, providing curated annotation of its contents using a controlled vocabulary known as Medical Subject Headings (MeSH). The MeSH vocabulary, developed over more than 50 years, provides broad coverage of topics across biomedical research. Distilling the essential biomedical themes for a topic of interest from the relevant literature is important both for understanding the importance of related concepts and for discovering new relationships. Results: We introduce a novel method for determining enriched curator-assigned MeSH annotations in a set of papers associated with a topic, such as a gene, an author or a disease. We generate MeSH Over-representation Profiles (MeSHOPs) to quantitatively summarize the annotations in a form convenient for further computational analysis and visualization. Based on a hypergeometric distribution of assigned terms, MeSHOPs statistically account for the prevalence of the associated biomedical annotation while highlighting unusually prevalent terms based on a specified background. MeSHOPs can be visualized using word clouds, providing a succinct quantitative graphical representation of the relative importance of terms. Using the publication dates of articles, MeSHOPs track changing patterns of annotation over time. Since MeSHOPs are quantitative vectors, they can be compared using standard techniques such as hierarchical clustering. The reliability of MeSHOP annotations is assessed based on the capacity to re-derive the subset of the Gene Ontology annotations with equivalent MeSH terms. Conclusions: MeSHOPs allow quantitative measurement of the degree of association between any entity and the annotated medical concepts, based directly on relevant primary literature. Comparison of MeSHOPs allows entities to be related based on shared medical themes in their literature. A web interface is provided for generating and visualizing MeSHOPs.
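The hypergeometric over-representation test mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `hypergeom_sf` and `meshop`, the toy counts, and the choice of an upper-tail p-value without multiple-testing correction are all assumptions made for illustration. The sketch assumes the topic's articles are a subset of the background set, and that each article either carries a given MeSH term or does not.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric variable.

    N: articles in the background set
    K: background articles annotated with the term
    n: articles associated with the topic
    k: topic articles annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no extra bounds checks are needed.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def meshop(topic_counts, background_counts, n_topic, n_background):
    """Hypothetical helper: map each MeSH term to its over-representation
    p-value, ordered from most to least over-represented."""
    profile = {
        term: hypergeom_sf(k, n_background, background_counts[term], n_topic)
        for term, k in topic_counts.items()
    }
    return dict(sorted(profile.items(), key=lambda item: item[1]))


# Toy data (invented for illustration): 1000 background articles, 20 topic articles.
background = {"Humans": 800, "Neoplasms": 100}
topic = {"Humans": 16, "Neoplasms": 10}
profile = meshop(topic, background, n_topic=20, n_background=1000)
for term, p_value in profile.items():
    print(f"{term}: p = {p_value:.3g}")
```

In this toy case "Humans" appears in 16 of 20 topic articles but is equally common in the background, so it receives a large p-value, whereas "Neoplasms" appears five times more often than the background rate would predict and ranks first. This is the sense in which a profile accounts for term prevalence rather than raw frequency.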
Pages: 11