Justifying Additive Noise Model-Based Causal Discovery via Algorithmic Information Theory

Cited by: 19
Authors
Janzing, Dominik [1]
Steudel, Bastian [2]
Affiliations
[1] Max Planck Inst Biol Cybernet, Tubingen, Germany
[2] Max Planck Inst Math Sci, Leipzig, Germany
DOI
10.1142/S1230161210000126
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
A recent method for causal discovery is in many cases able to infer whether X causes Y or Y causes X from just two observed variables X and Y. It is based on the observation that there exist (non-Gaussian) joint distributions P(X,Y) for which Y may be written as a function of X up to an additive noise term that is independent of X, while no such model exists from Y to X. Whenever this is the case, one prefers the causal model X -> Y. Here we justify this method by showing that the causal hypothesis Y -> X is unlikely because it requires a specific tuning between P(Y) and P(X|Y) to generate a distribution that admits an additive noise model from X to Y. To quantify the amount of tuning needed, we derive lower bounds on the algorithmic information shared by P(Y) and P(X|Y). In this way, our justification is consistent with recent approaches to using algorithmic information theory for causal reasoning. We extend this principle to the case where P(X,Y) almost admits an additive noise model. Our results suggest that the above conclusion is more reliable if the complexity of P(Y) is high.
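The decision rule described in the abstract (prefer X -> Y when an additive noise model fits from X to Y but not from Y to X) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the cubic polynomial regression, the biased HSIC statistic with a median-heuristic Gaussian kernel, the synthetic data, and all function names (`hsic`, `residual_dependence`, `infer_direction`) are choices made here and do not come from the record.

```python
import numpy as np


def _centered_gram(v):
    """Centered Gaussian-kernel Gram matrix with a median-heuristic bandwidth."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    sq_dists = (v - v.T) ** 2
    bandwidth_sq = np.median(sq_dists[sq_dists > 0])  # assumed heuristic
    gram = np.exp(-sq_dists / (2.0 * bandwidth_sq))
    n = len(v)
    centering = np.eye(n) - np.ones((n, n)) / n
    return centering @ gram @ centering


def hsic(a, b):
    """Biased HSIC estimate; larger values indicate stronger dependence."""
    n = len(a)
    return np.trace(_centered_gram(a) @ _centered_gram(b)) / n**2


def residual_dependence(cause, effect, degree=3):
    """Regress effect on cause, then measure how much residuals depend on cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


def infer_direction(x, y, degree=3):
    """Prefer the direction whose residuals look more independent of the input."""
    forward = residual_dependence(x, y, degree)
    backward = residual_dependence(y, x, degree)
    return "X -> Y" if forward < backward else "Y -> X"


# Hypothetical synthetic data: Y = X^3 + N with non-Gaussian noise N independent of X.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=300)
y = x**3 + rng.uniform(-1.0, 1.0, size=300)
direction = infer_direction(x, y)
print(direction)
```

Comparing two dependence scores is a simplification; a practical version would use a proper independence test with a significance level and decline to decide when both or neither direction admits an additive noise model.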
Pages: 189-212
Page count: 24
Related papers
20 in total
[1]   Information geometry on hierarchy of probability distributions [J].
Amari, S .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2001, 47 (05) :1701-1711
[2]  
[Anonymous], 2000, CAUSALITY
[3]   THEORY OF PROGRAM SIZE FORMALLY IDENTICAL TO INFORMATION-THEORY [J].
CHAITIN, GJ .
JOURNAL OF THE ACM, 1975, 22 (03) :329-340
[4]   ON LENGTH OF PROGRAMS FOR COMPUTING FINITE BINARY SEQUENCES [J].
CHAITIN, GJ .
JOURNAL OF THE ACM, 1966, 13 (04) :547-+
[5]  
Cover T.M., 2006, ELEMENTS INFORM THEO, V2nd ed
[6]   Algorithmic statistics [J].
Gács, P ;
Tromp, JT ;
Vitányi, PMB .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2001, 47 (06) :2443-2463
[7]  
Hoyer P., 2009, P C NEUR INF PROC SY
[8]  
JANZING D, IEEE T INFO IN PRESS
[9]   3 APPROACHES TO QUANTITATIVE DEFINITION OF INFORMATION [J].
KOLMOGOROV, AN .
INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 1968, 2 (02) :157-+
[10]  
Lauritzen SL, 1996, GRAPHICAL MODELS