Using Bayesian Latent Gaussian Graphical Models to Infer Symptom Associations in Verbal Autopsies

Times cited: 0
Authors
Li, Zehang Richard [1]
McCormick, Tyler H. [2, 3]
Clark, Samuel J. [4 ]
Affiliations
[1] Yale Sch Publ Hlth, Dept Biostat, New Haven, CT USA
[2] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[3] Univ Washington, Dept Sociol, Seattle, WA 98195 USA
[4] Ohio State Univ, Dept Sociol, Columbus, OH 43210 USA
Source
BAYESIAN ANALYSIS | 2020 / Vol. 15 / No. 03
Keywords
cause of death; mixed data; high dimensional; spike-and-slab; parameter expansion; LASSO
DOI
10.1214/19-BA1172
Chinese Library Classification number
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
Learning dependence relationships among variables of mixed types provides insights in a variety of scientific settings and is a well-studied problem in statistics. Existing methods, however, typically rely on copious, high quality data to accurately learn associations. In this paper, we develop a method for scientific settings where learning dependence structure is essential, but data are sparse and have a high fraction of missing values. Specifically, our work is motivated by survey-based cause of death assessments known as verbal autopsies (VAs). We propose a Bayesian approach to characterize dependence relationships using a latent Gaussian graphical model that incorporates informative priors on the marginal distributions of the variables. We demonstrate such information can improve estimation of the dependence structure, especially in settings with little training data. We show that our method can be integrated into existing probabilistic cause-of-death assignment algorithms and improves model performance while recovering dependence patterns between symptoms that can inform efficient questionnaire design in future data collection.
Pages: 781-807
Page count: 27