EMR2vec: Bridging the gap between patient data and clinical trial

Cited by: 14
Authors
Dhayne, Houssein [1]
Kilany, Rima [1]
Haque, Rafiqul [2]
Taher, Yehia [3]
Affiliations
[1] St Joseph Univ, Beirut, Lebanon
[2] Intelligencia, 66 Ave Champs Elysees, Paris, France
[3] David Lab, 45 Ave Etats Unis, Versailles, France
Keywords
EMR; Clinical trial; Medical data integration; Neural network; Semantic web; MSC; 2010; 00-01; 99-00; ELECTRONIC HEALTH RECORDS; TEXT; CLASSIFICATION; IDENTIFICATION; INTEGRATION; SEARCH
DOI
10.1016/j.cie.2021.107236
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
The human suffering caused by life-threatening viruses such as SARS, Ebola, and COVID-19 has motivated many of us to study how data integration can best be harnessed to help clinical researchers curb these viruses. Integrating patient data with clinical trial data is enormously promising, as it provides a comprehensive knowledge base that accelerates the ability of clinical research to respond to emerging infectious disease outbreaks. This work introduces EMR2vec, a platform that customises advanced NLP, machine learning and semantic web techniques to link potential patients to suitable clinical trials. Linking these two different but complementary datasets allows clinicians and researchers to compare patients to clinical research opportunities or to automatically select patients for personalized clinical care. The platform derives a 'bag of medical terms' (BoMT) from eligibility criteria by normalizing extracted entities through the SNOMED-CT ontology. Using the BoMT, an ontological reasoning method is proposed to represent EMRs and clinical trials in a vector space model. The platform presents a matching process that reduces vector dimensionality using a neural network, then applies orthogonal projection to measure the similarity between vectors. Finally, the proposed EMR2vec platform is evaluated with an extendable prototype based on big data tools.
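The matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary bag-of-medical-terms vectors over a shared vocabulary, skips the neural dimensionality reduction entirely, and scores similarity as the length of the orthogonal projection of the patient vector onto the trial vector, relative to the patient vector's length (which is equivalent to cosine similarity for these non-negative vectors). The function names and the plain-text term labels are hypothetical stand-ins for normalized SNOMED-CT concepts.

```python
import math


def bomt_vector(terms, vocabulary):
    """Binary bag-of-medical-terms vector over a fixed vocabulary order.

    `terms` are hypothetical normalized term labels, not real SNOMED-CT codes.
    """
    present = set(terms)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def projection_similarity(u, v):
    """Length of the orthogonal projection of u onto v, divided by |u|.

    Returns 0.0 when either vector is all zeros.
    """
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    return abs(dot) / (norm_u * norm_v)


def rank_trials(patient_terms, trials):
    """Rank trials (name -> list of terms) by similarity to the patient."""
    # Shared vocabulary: every term seen in the patient record or any trial.
    vocabulary = sorted(set(patient_terms).union(*trials.values()))
    patient_vec = bomt_vector(patient_terms, vocabulary)
    scored = [
        (name, projection_similarity(patient_vec, bomt_vector(terms, vocabulary)))
        for name, terms in trials.items()
    ]
    # Highest score first; ties broken by trial name for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    patient = ["fever", "cough", "pneumonia"]
    trials = {
        "T1": ["fever", "cough", "pneumonia"],
        "T2": ["fever", "diabetes"],
        "T3": ["fracture"],
    }
    for name, score in rank_trials(patient, trials):
        print(f"{name}: {score:.3f}")
```

In this toy setup a trial whose terms coincide with the patient's scores highest, and a trial sharing no terms scores zero. A real pipeline along the lines the abstract describes would first extract and normalize entities against an ontology and reduce the vectors' dimensionality before comparing them.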
Pages: 15