A Matter of Words: NLP for Quality Evaluation of Wikipedia Medical Articles

Cited by: 7
Authors
Cozza, Vittoria [1]
Petrocchi, Marinella [1]
Spognardi, Angelo [2]
Affiliations
[1] IIT CNR, Pisa, Italy
[2] DTU Compute, Lyngby, Denmark
Source
WEB ENGINEERING (ICWE 2016) | 2016 / Volume 9671
DOI
10.1007/978-3-319-38791-8_31
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains such as the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles, relying on Natural Language Processing (NLP) and dictionary-based techniques. The results of our experiments confirm that, by considering domain-oriented features, it is possible to improve on existing solutions, mainly for those articles that other approaches have classified less correctly.
Pages: 448-456
Number of pages: 9