Computational Test for Conditional Independence

Authors:

Thorjussen, Christian B. H. [1,2]

Liland, Kristian Hovde [2]

Mage, Ingrid [1]

Solberg, Lars Erik [1]

Affiliations:

[1] Nofima As, Osloveien 1, N-1431 As, Norway

[2] Norwegian Univ Life Sci, Fac Sci & Technol, As, Norway

Source:

Algorithms | 2024 / Vol. 17 / Issue 8

Keywords:

conditional independence; computational hypothesis testing; categorical variables; graphical models; causal inference; inference

DOI:

10.3390/a17080323

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Conditional independence (CI) testing is fundamental in statistical analysis. For example, in causal inference, CI testing helps validate causal graphs and longitudinal data analyses with repeated measures. CI testing is difficult, especially when testing involves categorical variables conditioned on a mixture of continuous and categorical variables. Current parametric and non-parametric testing methods are designed for continuous variables and can quickly fall short in the categorical case. This paper presents a computational approach for CI testing suited for categorical data types, which we call computational conditional independence (CCI) testing. The test procedure is based on permutation and combines machine learning prediction algorithms and Monte Carlo cross-validation. We evaluated the approach through simulation studies and assessed the performance against alternative methods: the generalized covariance measure test, the kernel conditional independence test, and testing with multinomial regression. We find that the computational approach to testing has utility over the alternative methods, achieving better control over type I error rates. We hope this work can expand the toolkit for CI testing for practitioners and researchers.
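
The abstract names three ingredients: permutation, a prediction algorithm, and Monte Carlo cross-validation. The sketch below shows one way these pieces could fit together for purely categorical X, Y, and Z. It is not the authors' implementation. The majority-vote lookup table stands in for a machine learning predictor, the within-stratum permutation of X is an assumed scheme, and every function name (`fit_table`, `mc_cv_accuracy`, `permute_within_strata`, `cci_sketch`) is hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_table(rows, labels):
    """Majority-vote lookup table (a stand-in predictor, assumed for this sketch)."""
    cells = defaultdict(Counter)
    for row, label in zip(rows, labels):
        cells[row][label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen cells
    table = {row: c.most_common(1)[0][0] for row, c in cells.items()}
    return table, default


def mc_cv_accuracy(x, y, z, n_splits, test_frac, rng):
    """Monte Carlo cross-validation: mean held-out accuracy over random splits."""
    n = len(y)
    n_test = max(1, int(n * test_frac))
    total = 0.0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        table, default = fit_table(
            [(x[i], z[i]) for i in train], [y[i] for i in train]
        )
        hits = sum(table.get((x[i], z[i]), default) == y[i] for i in test)
        total += hits / n_test
    return total / n_splits


def permute_within_strata(x, z, rng):
    """Shuffle x inside each level of z, which keeps the x-z relationship intact."""
    groups = defaultdict(list)
    for i, level in enumerate(z):
        groups[level].append(i)
    x_perm = list(x)
    for idx in groups.values():
        values = [x[i] for i in idx]
        rng.shuffle(values)
        for i, value in zip(idx, values):
            x_perm[i] = value
    return x_perm


def cci_sketch(x, y, z, n_perm=50, n_splits=10, test_frac=0.3, seed=0):
    """Permutation p-value for H0: Y is independent of X given Z."""
    rng = random.Random(seed)
    observed = mc_cv_accuracy(x, y, z, n_splits, test_frac, rng)
    null = [
        mc_cv_accuracy(permute_within_strata(x, z, rng), y, z,
                       n_splits, test_frac, rng)
        for _ in range(n_perm)
    ]
    # The +1 terms count the observed statistic as one draw from the null.
    p_value = (1 + sum(s >= observed for s in null)) / (1 + n_perm)
    return observed, p_value
```

A small p-value indicates that X improves out-of-sample prediction of Y beyond what Z alone provides. Shuffling within strata only works when Z is categorical with enough observations per level. The mixed continuous and categorical conditioning sets mentioned in the abstract would need a different null-generation scheme.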

Pages: 22
