A method of extracting the number of trial participants from abstracts describing randomized controlled trials

Cited by: 15
Authors
Hansen, Marie J. [1 ]
Rasmussen, Nana O. [1 ]
Chung, Grace [2 ]
Affiliations
[1] Aalborg Univ, Aalborg, Denmark
[2] Univ New S Wales, Sydney, NSW, Australia
DOI
10.1258/jtt.2008.007007
Chinese Library Classification
R19 [Health care organization and services (health services administration)];
Abstract
We have developed a method for extracting the number of trial participants from abstracts describing randomized controlled trials (RCTs); the number of trial participants may indicate the reliability of the trial. The method relies on statistical natural language processing. The number of interest was identified by binary supervised classification using a support vector machine algorithm. The method was tested on 223 abstracts in which the number of trial participants had been identified manually to serve as a gold standard. Automatic extraction resulted in 2 false-positive and 19 false-negative classifications. The algorithm extracted the number of trial participants with an accuracy of 97% and an F-measure of 0.84. The algorithm may improve the selection of relevant articles for question answering, and hence may assist in decision-making.
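The abstract reports both an accuracy and an F-measure, and the two can differ considerably when most candidates are negatives. The following is a minimal sketch of how such figures are typically derived from confusion-matrix counts, assuming the F-measure is the balanced F1 score (the abstract does not say which variant was used). The function name and the example counts are hypothetical; the abstract gives only the false-positive and false-negative counts, so the values below are toy numbers, not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, balanced F1 and accuracy from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical counts for illustration only (not taken from the paper).
example = classification_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these toy counts the accuracy is higher than the F-measure because the many true negatives raise accuracy but do not enter the F-measure at all.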
Pages: 354-358
Number of pages: 5