Anne O'Tate: Value-added PubMed search engine for analysis and text mining

Cited by: 3
Authors
Smalheiser, Neil R. [1]
Fragnito, Dean P. [2]
Tirk, Eric E. [2]
Affiliations
[1] Univ Illinois, Dept Psychiat, Chicago, IL 60612 USA
[2] Xornet Inc, Rochester, NY USA
Keywords
MODEL; ARTICLES;
DOI
10.1371/journal.pone.0248335
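Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general];
Discipline classification codes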
07 ; 0710 ; 09 ;
Abstract
Over a decade ago, we introduced Anne O'Tate, a free, public web-based tool (http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/AnneOTate.cgi) to support user-driven summarization, drill-down and mining of search results from PubMed, the leading search engine for biomedical literature. A set of hotlinked buttons allows the user to sort and rank retrieved articles according to important words in titles and abstracts; topics; author names; affiliations; journal names; and publication year; and to cluster them by topic. Any result can be further mined by choosing any other button, and small search results can be expanded to include related articles. It has been deployed continuously, serving a wide range of biomedical users and needs, and over time has also served as a platform to support the creation of new tools that address additional needs. Here we describe the current, greatly expanded implementation of Anne O'Tate, which adds buttons that provide new functionality. We now allow users to sort and rank search results by important phrases contained in titles and abstracts; the number of authors listed on the article; and pairs of topics that co-occur significantly more often than expected by chance. We also display articles according to NLM-indexed publication types, as well as according to 50 different publication types and study designs as predicted by a novel machine learning-based model. Furthermore, users can import search results into two new tools: Mine the Gap!, which identifies pairs of topics that are underrepresented within the set of search results, and Citation Cloud, which, for any given article, allows users to visualize the set of articles that cite it; that are cited by it; that are co-cited with it; and that are bibliographically coupled to it. We invite the scientific community to explore how Anne O'Tate can assist in analyzing biomedical literature in a variety of use cases.
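The four citation relations named in the abstract (cites, cited by, co-cited with, bibliographically coupled to) can be illustrated with a minimal sketch. This is not the Citation Cloud implementation, which the abstract does not describe; the function name `citation_relations`, the edge-list input format, and the toy article IDs are all assumptions made for illustration. It uses the usual definitions: two articles are co-cited when some third article cites both, and bibliographically coupled when they share at least one reference.

```python
from collections import defaultdict


def citation_relations(edges, target):
    """Return the four citation relations for `target`.

    `edges` is an iterable of (citing_id, cited_id) pairs.
    Hypothetical helper for illustration only.
    """
    refs = defaultdict(set)    # article -> articles it cites
    citers = defaultdict(set)  # article -> articles that cite it
    for citing, cited in edges:
        refs[citing].add(cited)
        citers[cited].add(citing)

    cited_by_target = set(refs.get(target, ()))
    citing_target = set(citers.get(target, ()))

    # Co-cited: listed as a reference by any article that also cites the target.
    cocited = set()
    for citer in citing_target:
        cocited |= refs[citer]
    cocited.discard(target)

    # Bibliographically coupled: shares at least one reference with the target.
    coupled = set()
    for ref in cited_by_target:
        coupled |= citers[ref]
    coupled.discard(target)

    return {
        "cites_target": citing_target,
        "cited_by_target": cited_by_target,
        "cocited": cocited,
        "coupled": coupled,
    }


# Toy citation graph with made-up IDs: A and B cite T; T and B both cite R.
toy_edges = [("A", "T"), ("A", "X"), ("T", "R"), ("B", "R"), ("B", "T")]
relations = citation_relations(toy_edges, "T")
```

In this toy graph, X and R are co-cited with T (through A and B respectively), and B is bibliographically coupled to T because both cite R.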
Pages: 15