MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

Cited by: 0
Authors
Gao, Erdun [1]
Ng, Ignavier [2]
Gong, Mingming [1]
Shen, Li [3]
Huang, Wei [1]
Liu, Tongliang [4]
Zhang, Kun [2, 5]
Bondell, Howard [1]
Affiliations
[1] Univ Melbourne, Parkville, Australia
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] JD Explore Acad, Beijing, Peoples R China
[4] Univ Sydney, Sydney, Australia
[5] Mohamed Bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
National Institutes of Health (USA); Australian Research Council
Keywords
BAYESIAN NETWORKS; EM ALGORITHM; IMPUTATION
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art causal discovery methods usually assume that the observational data is complete. However, the missing data problem is pervasive in many practical scenarios such as clinical trials, economics, and biology. One straightforward way to address the missing data problem is first to impute the data using off-the-shelf imputation methods and then apply existing causal discovery methods. However, such a two-step method may suffer from suboptimality, as the imputation algorithm may introduce bias for modeling the underlying data distribution. In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations. Focusing mainly on the assumptions of ignorable missingness and the identifiable additive noise models (ANMs), MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization (EM) framework. In the E-step, in cases where computing the posterior distributions of parameters in closed-form is not feasible, Monte Carlo EM is leveraged to approximate the likelihood. In the M-step, MissDAG leverages the density transformation to model the noise distributions with simpler and specific formulations by virtue of the ANMs and uses a likelihood-based causal discovery algorithm with directed acyclic graph constraint. We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.
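The EM idea in the abstract can be illustrated with a small sketch. This is not MissDAG itself. It assumes a plain multivariate Gaussian model with entries missing completely at random, where the E-step is available in closed form, and its M-step is an unconstrained mean and covariance update rather than a DAG-constrained causal discovery step. The function name `em_gaussian_missing`, the simulated chain X1 -> X2 -> X3, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def em_gaussian_missing(X, n_iter=30):
    """EM estimate of a Gaussian mean and covariance from data with NaN entries.

    Illustrative only: the M-step is an unconstrained Gaussian update,
    not the DAG-constrained step described in the abstract.
    """
    n, d = X.shape
    missing = np.isnan(X)
    # Initialise from the observed entries of each column.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(d)
        sum_xx = np.zeros((d, d))
        for i in range(n):
            m = missing[i]
            o = ~m
            x = X[i].copy()
            extra_cov = np.zeros((d, d))
            if m.any() and o.any():
                # E-step: conditional mean and covariance of the missing
                # block given the observed block under the current fit.
                s_oo = sigma[np.ix_(o, o)]
                s_mo = sigma[np.ix_(m, o)]
                k = np.linalg.solve(s_oo, s_mo.T).T  # s_mo @ inv(s_oo)
                x[m] = mu[m] + k @ (x[o] - mu[o])
                extra_cov[np.ix_(m, m)] = sigma[np.ix_(m, m)] - k @ s_mo.T
            elif m.any():
                # Fully missing row: fall back to the current marginal.
                x[m] = mu[m]
                extra_cov = sigma.copy()
            sum_x += x
            sum_xx += np.outer(x, x) + extra_cov
        # M-step: update parameters from the expected sufficient statistics.
        mu = sum_x / n
        sigma = sum_xx / n - np.outer(mu, mu)
    return mu, sigma


# Hypothetical linear Gaussian chain X1 -> X2 -> X3 with unit-variance noise.
rng = np.random.default_rng(0)
n = 2000
true_mu = np.array([1.0, -2.0, 0.5])
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = -0.5 * x2 + rng.normal(size=n)
X_full = np.column_stack([x1, x2, x3]) + true_mu
true_sigma = np.array([
    [1.0, 0.8, -0.4],
    [0.8, 1.64, -0.82],
    [-0.4, -0.82, 1.41],
])

# Remove about 20% of the entries completely at random (ignorable missingness).
X_obs = X_full.copy()
X_obs[rng.random(X_obs.shape) < 0.2] = np.nan

mu_hat, sigma_hat = em_gaussian_missing(X_obs)
```

In this simplified setting, the estimated covariance could then be passed to a covariance-based structure learner. The method described in the abstract instead couples the two steps, so that the graph is re-estimated inside each M-step.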
Pages: 15
Related papers
50 in total
  • [41] Clustering Dynamic Spatio-Temporal Patterns in the Presence of Noise and Missing Data
    Chen, Xi C.
    Faghmous, James H.
    Khandelwal, Ankush
    Kumar, Vipin
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 2575 - 2581
  • [42] Accountants' usage of causal business models in the presence of benchmark data: A note
    Vera-Munoz, Sandra C.
    Shackell, Margaret
    Buehner, Marc
    CONTEMPORARY ACCOUNTING RESEARCH, 2007, 24 (03) : 1015 - +
  • [43] Pattern mixture models for clinical validation of biomarkers in the presence of missing data
    Gao, Fei
    Dong, Jun
    Zeng, Donglin
    Rong, Alan
    Ibrahim, Joseph G.
    STATISTICS IN MEDICINE, 2017, 36 (19) : 2994 - 3004
  • [44] Recursive Identification of Continuous Two-Dimensional Systems in the Presence of Additive Colored Noise
    Shafieirad, Mohsen
    Shafiee, Masoud
    Abedi, Mehrdad
    IETE JOURNAL OF RESEARCH, 2014, 60 (01) : 74 - 84
  • [45] Continuous-time AR process parameter estimation in presence of additive white noise
    Fan, HH
    Söderström, T
    Zou, Y
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1999, 47 (12) : 3392 - 3398
  • [46] A comparison of methods to estimate the survivor average causal effect in the presence of missing data: a simulation study
    McGuinness, Myra B.
    Kasza, Jessica
    Karahalios, Amalia
    Guymer, Robyn H.
    Finger, Robert P.
    Simpson, Julie A.
    BMC MEDICAL RESEARCH METHODOLOGY, 2019, 19 (01)
  • [47] Causal inference in the presence of missing data using a random forest-based matching algorithm
    Hillis, Tristan
    Guarcello, Maureen A.
    Levine, Richard A.
    Fan, Juanjuan
    STAT, 2021, 10 (01):
  • [48] A comparison of methods to estimate the survivor average causal effect in the presence of missing data: a simulation study
    Myra B. McGuinness
    Jessica Kasza
    Amalia Karahalios
    Robyn H. Guymer
    Robert P. Finger
    Julie A. Simpson
    BMC Medical Research Methodology, 19
  • [49] Causal discovery of 1-factor measurement models in linear latent variable models with arbitrary noise distributions
    Xie, Feng
    Zeng, Yan
    Chen, Zhengming
    He, Yangbo
    Geng, Zhi
    Zhang, Kun
    NEUROCOMPUTING, 2023, 526 : 48 - 61
  • [50] Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models
    Choi, Jungjun
    Yuan, Ming
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024,