Out-of-Sample Tuning for Causal Discovery

Cited by: 7
Authors
Biza, Konstantina [1]
Tsamardinos, Ioannis [1]
Triantafillou, Sofia [2]
Affiliations
[1] Univ Crete, Dept Comp Sci, Iraklion 70013, Greece
[2] Univ Crete, Dept Math & Appl Math, Iraklion 70013, Greece
Funding
European Research Council
Keywords
Tuning; Markov processes; Data models; Stars; Task analysis; Predictive models; Estimation; Causal-based simulation; causal discovery; out-of-sample; tuning; DIRECTED ACYCLIC GRAPHS; MODEL; NETWORKS; LATENT;
DOI
10.1109/TNNLS.2022.3185842
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Causal discovery is continually being enriched with new algorithms for learning causal graphical probabilistic models. Each one of them requires a set of hyperparameters, creating a great number of combinations. Given that the true graph is unknown and the learning task is unsupervised, the challenge to a practitioner is how to tune these choices. We propose out-of-sample causal tuning (OCT) that aims to select an optimal combination. The method treats a causal model as a set of predictive models and uses out-of-sample protocols for supervised methods. This approach can handle general settings like latent confounders and nonlinear relationships. The method uses an information-theoretic approach to be able to generalize to mixed data types and a penalty for dense graphs to penalize for complexity. To evaluate OCT, we introduce a causal-based simulation method to create datasets that mimic the properties of real-world problems. We evaluate OCT against two other tuning approaches, based on stability and in-sample fitting. We show that OCT performs well in many experimental settings and it is an effective tuning method for causal discovery.
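The tuning idea in the abstract (treat each candidate causal graph as a set of predictive models and compare hyperparameter configurations by out-of-sample performance) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes continuous data, predicts each variable from its Markov blanket with ordinary least squares, and scores by held-out squared error instead of the information-theoretic measure the abstract mentions; the penalty for dense graphs is omitted. The names `markov_blanket`, `heldout_error`, `tune`, and `toy_learner` are hypothetical, and `toy_learner` is only a placeholder for a real causal discovery algorithm.

```python
import numpy as np

def markov_blanket(adj, j):
    """Parents, children, and spouses of node j; adj[i, k] == 1 means i -> k."""
    parents = set(np.flatnonzero(adj[:, j]))
    children = set(np.flatnonzero(adj[j, :]))
    spouses = set()
    for c in children:
        spouses |= set(np.flatnonzero(adj[:, c]))
    return sorted(int(v) for v in (parents | children | spouses) - {j})

def heldout_error(adj, train, test):
    """Mean held-out squared error when each node is predicted from its Markov blanket."""
    errors = []
    for j in range(train.shape[1]):
        mb = markov_blanket(adj, j)
        # Intercept plus blanket columns; an empty blanket predicts the training mean.
        a_train = np.column_stack([np.ones(len(train)), train[:, mb]])
        a_test = np.column_stack([np.ones(len(test)), test[:, mb]])
        coef, *_ = np.linalg.lstsq(a_train, train[:, j], rcond=None)
        errors.append(np.mean((test[:, j] - a_test @ coef) ** 2))
    return float(np.mean(errors))

def tune(configs, learn_graph, X, n_folds=5, seed=0):
    """Return the configuration with the lowest cross-validated held-out error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    scores = {}
    for cfg in configs:
        fold_errors = []
        for k in range(n_folds):
            train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
            adj = learn_graph(X[train_idx], cfg)  # graph learned on training folds only
            fold_errors.append(heldout_error(adj, X[train_idx], X[folds[k]]))
        scores[cfg] = float(np.mean(fold_errors))
    return min(scores, key=scores.get), scores

def toy_learner(data, threshold):
    """Placeholder, NOT a causal discovery algorithm: i -> k (i < k) if |corr| > threshold."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    return np.triu(corr > threshold, k=1).astype(int)

# Synthetic chain x0 -> x1 -> x2, used only to exercise the sketch.
rng = np.random.default_rng(1)
x0 = rng.normal(size=500)
x1 = 2.0 * x0 + rng.normal(size=500)
x2 = -1.5 * x1 + rng.normal(size=500)
X = np.column_stack([x0, x1, x2])

best, scores = tune([0.1, 0.5, 0.999], toy_learner, X)
print(best, scores)
```

In this toy run the threshold 0.999 yields an empty graph, so every variable is predicted by its training mean and the configuration scores worse than those that keep edges. A fuller version in the spirit of the abstract would also prefer sparser graphs among configurations with comparable predictive performance.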
Pages: 4963-4973
Page count: 11