COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching

Cited by: 31
Authors
Gao, Junyi [1]
Xiao, Cao [1]
Glass, Lucas M. [1,2]
Sun, Jimeng [3]
Affiliations
[1] IQVIA, Analyt Ctr Excellence, Durham, NC 27703 USA
[2] Temple Univ, Dept Stat, Philadelphia, PA 19122 USA
[3] Univ Illinois, Dept Comp Sci, Champaign, IL USA
Source
KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2020
Funding
U.S. National Science Foundation;
Keywords
cross-modal learning; pseudo-siamese network; trial recruitment; EXTRACTION; RECORDS;
DOI
10.1145/3394486.3403123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Clinical trials play an important role in drug development but often suffer from expensive, inaccurate, and insufficient patient recruitment. The availability of massive electronic health record (EHR) data and trial eligibility criteria (EC) brings a new opportunity for data-driven patient recruitment. One key task, patient-trial matching, is to find qualified patients for clinical trials given structured EHR data and unstructured EC text (both inclusion and exclusion criteria). This raises three questions: how to match complex EC text with longitudinal patient EHRs, how to embed the many-to-many relationships between patients and trials, and how to explicitly handle the difference between inclusion and exclusion criteria. In this paper, we propose the CrOss-Modal PseudO-SiamEse network (COMPOSE) to address these challenges for patient-trial matching. One path of the network encodes EC using a convolutional highway network. The other path processes EHRs with a multi-granularity memory network that encodes structured patient records into multiple levels based on a medical ontology. Using the EC embedding as the query, COMPOSE performs attentional record alignment and thus enables dynamic patient-trial matching. COMPOSE also introduces a composite loss term that maximizes the similarity between patient records and inclusion criteria while minimizing the similarity to exclusion criteria. Experimental results show that COMPOSE reaches 98.0% AUC on patient-criteria matching and 83.7% accuracy on patient-trial matching, a 24.3% improvement over the best baseline on real-world patient-trial matching tasks.
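The abstract names two mechanisms without giving formulas: attentional alignment of patient records against the EC embedding, and a composite loss that treats inclusion and exclusion criteria differently. The following is only an illustrative sketch of that general idea, not the authors' implementation. The dot-product attention, the cosine similarity, the hinge form of the exclusion term, and all names (`attend`, `composite_loss`, `margin`) are assumptions made for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    # Assumed alignment step: weight each patient memory slot by its
    # dot-product score against the EC embedding, then sum the slots.
    weights = softmax([dot(query, slot) for slot in memory])
    return [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]


def composite_loss(similarity, is_inclusion, margin=0.0):
    # Assumed loss shape: reward high similarity for inclusion criteria,
    # penalize similarity above `margin` for exclusion criteria.
    if is_inclusion:
        return 1.0 - similarity
    return max(0.0, similarity - margin)


# Toy data (hypothetical): two patient memory slots and one EC embedding.
memory = [[1.0, 0.0], [0.0, 1.0]]
ec_embedding = [2.0, 0.0]

patient = attend(ec_embedding, memory)
similarity = cosine(ec_embedding, patient)
print(composite_loss(similarity, is_inclusion=True))
print(composite_loss(similarity, is_inclusion=False))
```

In this toy setup the same patient-criterion similarity yields a small loss when the criterion is an inclusion criterion and a large loss when it is an exclusion criterion, which is the asymmetry the abstract describes.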
Pages: 803-812
Page count: 10