Using Mutual Information Clustering to Discover Food Allergen Cross-Reactivity

Cited by: 0
Authors
Lai, Kenneth H. [1]
Blackley, Suzanne V. [1]
Zhou, Li [2]
Affiliations
[1] Partners Healthcare Syst, Clin & Qual Anal, Boston, MA 02199 USA
[2] Harvard Med Sch, Brigham & Womens Hosp, Boston, MA USA
Source
2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) | 2017
Funding
Agency for Healthcare Research and Quality (USA);
Keywords
mutual information; clustering; food allergy; electronic health records;
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Mutual information clustering is an agglomerative hierarchical clustering method that has been used to group random variables or sets thereof. Some researchers have found that the normalization method used can lead to oddly-sized clusters that do not line up with expected results. We introduce a new normalization parameter to control the size of the clusters, and apply it to food allergy data from a large allergy repository from an electronic health record, treating the distributions of food allergies in our population as random variables. Our method was able to identify previously known food cross-reaction groups (with an adjusted Rand index of 0.971, outperforming alternative clustering algorithms), in addition to proposing possible new groups. Our results demonstrate the viability of mutual information clustering as an approach for discovering possible food cross-reactions.
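The agglomerative procedure described in the abstract can be illustrated with a minimal Python sketch. It assumes one commonly used normalization, the distance D = 1 - I(A;B) / H(A,B); the new normalization parameter introduced by the authors is not specified in this record and is not reproduced here. The function names and the patient data are hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(columns):
    """Joint Shannon entropy (in nats) of one or more equally long discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log(c / n) for c in Counter(rows).values())


def mi_distance(group_a, group_b):
    """Assumed normalization: 1 - I(A;B) / H(A,B), which lies in [0, 1]."""
    h_a, h_b = entropy(group_a), entropy(group_b)
    h_ab = entropy(group_a + group_b)
    if h_ab == 0:  # both groups are constant, so treat them as identical
        return 0.0
    return 1.0 - (h_a + h_b - h_ab) / h_ab


def mi_clustering(data, n_clusters):
    """Greedy agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[name] for name in data]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: mi_distance(
                [data[k] for k in pair[0]], [data[k] for k in pair[1]]
            ),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged group is treated as one joint variable
    return [sorted(c) for c in clusters]


# Made-up indicator columns: one entry per hypothetical patient, 1 = allergy recorded.
toy = {
    "shrimp": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "crab":   [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "peanut": [0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    "walnut": [0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
}

print(mi_clustering(toy, n_clusters=2))
```

On this toy data the two merges pair shrimp with crab and peanut with walnut. Each merged cluster is scored as a joint variable, which is what lets the method group sets of random variables rather than only single ones.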
Pages: 732-735
Page count: 4