Improvement of the quality of medical databases: data-mining-based prediction of diagnostic codes from previous patient codes

Cited by: 20
Authors
Djennaoui, Mehdi [1]
Ficheur, Gregoire [1]
Beuscart, Regis [1]
Chazard, Emmanuel [1]
Affiliations
[1] Univ Lille, EA 2694, Dept Publ Hlth, F-59000 Lille, France
Source
DIGITAL HEALTHCARE EMPOWERING EUROPEANS | 2015 / Vol. 210
Keywords
Electronic Health Records; Decision Support Techniques; Data mining; Nationwide Database; SAFETY;
DOI
10.3233/978-1-61499-512-8-419
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Introduction. Diagnoses and medical procedures collected under the French system of information are recorded in a nationwide database, the "PMSI national database", which is accessible for exploitation. The quality of the data in this database depends directly on the quality of coding, which can be poor. Among the methods proposed for exploiting health databases, data mining techniques are particularly interesting. Our objective is to build sequential rules for predicting missing diagnoses by data mining of the PMSI national database.
Method. Our working sample was constructed from the national database for the years 2007 to 2010. The information retained for rule construction consisted of medical diagnoses and medical procedures. The rules were selected using a statistical filter, and the selected rules were validated by case review based on medical letters, which made it possible to estimate the improvement in diagnosis recoding.
Results. The working sample comprised 59,170 inpatient stays. The predicted ICD codes were E11 (non-insulin-dependent diabetes mellitus), I48 (atrial fibrillation and flutter) and I50 (heart failure). We validated three sequential rules with a substantial improvement in positive predictive value:
{E11,I10,DZQM006}=>{E11}
{E11,I10,I48}=>{E11}
{I48,I69}=>{I48}
Discussion. By data mining, we were able to extract three simple, reliable and effective sequential rules, with a substantial improvement in diagnosis recoding. The results of our study indicate an opportunity to improve the data quality of the national database by data mining methods.
Pages: 419-423
Number of pages: 5
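The abstract reports sequential rules that are judged by their positive predictive value (PPV). As a minimal sketch of what that measure means for such a rule, and not the authors' actual pipeline, the Python below assumes a rule "fires" when every antecedent code appears among a patient's previous codes, and counts a hit when the predicted diagnosis is confirmed for the current stay (for example by case review). The `Stay` class, the function names and the five sample stays are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stay:
    """Hypothetical record: codes seen in earlier stays and diagnoses
    confirmed for the current stay (e.g. by case review)."""
    previous_codes: frozenset
    confirmed_codes: frozenset


def rule_fires(antecedent, stay):
    # Assumption: the rule applies when all antecedent codes were
    # recorded in the patient's previous stays.
    return antecedent <= stay.previous_codes


def positive_predictive_value(antecedent, predicted, stays):
    """Share of stays where the rule fires and the predicted code is confirmed.

    Returns None when the rule never fires, since PPV is undefined then.
    """
    fired = [s for s in stays if rule_fires(antecedent, s)]
    if not fired:
        return None
    hits = sum(1 for s in fired if predicted in s.confirmed_codes)
    return hits / len(fired)


# Invented toy data, not taken from the PMSI national database.
stays = [
    Stay(frozenset({"E11", "I10", "I48"}), frozenset({"E11"})),
    Stay(frozenset({"E11", "I10", "I48"}), frozenset({"E11", "I50"})),
    Stay(frozenset({"E11", "I10"}), frozenset()),
    Stay(frozenset({"I48", "I69"}), frozenset({"I48"})),
    Stay(frozenset({"E11", "I10", "I48"}), frozenset()),
]

ppv_diabetes = positive_predictive_value(frozenset({"E11", "I10", "I48"}), "E11", stays)
ppv_af = positive_predictive_value(frozenset({"I48", "I69"}), "I48", stays)
print(ppv_diabetes, ppv_af)
```

Returning `None` when a rule never fires keeps an undefined PPV distinct from a PPV of zero, which matters when filtering candidate rules.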