Computational Test for Conditional Independence

Cited by: 0
Authors
Thorjussen, Christian B. H. [1, 2]
Liland, Kristian Hovde [2]
Mage, Ingrid [1]
Solberg, Lars Erik [1]
Affiliations
[1] Nofima AS, Osloveien 1, N-1431 Ås, Norway
[2] Norwegian University of Life Sciences, Faculty of Science and Technology, Ås, Norway
Keywords
conditional independence; computational hypothesis testing; categorical variables; graphical models; causal inference; INFERENCE
DOI
10.3390/a17080323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Conditional independence (CI) testing is fundamental in statistical analysis. In causal inference, for example, CI testing helps validate causal graphs or longitudinal data analyses with repeated measures. CI testing is difficult, especially when it involves categorical variables conditioned on a mixture of continuous and categorical variables. Current parametric and non-parametric testing methods are designed for continuous variables and can quickly fall short in the categorical case. This paper presents a computational approach to CI testing suited to categorical data, which we call computational conditional independence (CCI) testing. The test procedure is based on permutation and combines machine learning prediction algorithms with Monte Carlo cross-validation. We evaluated the approach through simulation studies and compared its performance against alternative methods: the generalized covariance measure test, the kernel conditional independence test, and testing with multinomial regression. We find that the computational approach has utility over the alternative methods, achieving better control over type I error rates. We hope this work can expand the toolkit for CI testing available to practitioners and researchers.
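The abstract names three ingredients: permutation, a prediction algorithm, and Monte Carlo cross-validation. The following is a minimal sketch of how such ingredients could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`cci_sketch`, `mccv_accuracy`, `permute_within_strata`), the ridge-regularized least-squares classifier standing in for a machine learning predictor, accuracy as the performance metric, and the restriction to a single categorical conditioning variable `z` so that `x` can be permuted within levels of `z`.

```python
import numpy as np


def one_hot(values, levels):
    """Indicator matrix with one column per level."""
    return (values[:, None] == levels[None, :]).astype(float)


def fit_predict(train_f, train_y, test_f, n_classes, ridge=1e-3):
    """Assumed stand-in predictor: ridge least squares on one-hot labels."""
    targets = np.eye(n_classes)[train_y]
    gram = train_f.T @ train_f + ridge * np.eye(train_f.shape[1])
    weights = np.linalg.solve(gram, train_f.T @ targets)
    return np.argmax(test_f @ weights, axis=1)


def mccv_accuracy(features, y, n_classes, rng, n_splits=20, test_frac=0.3):
    """Monte Carlo cross-validation: mean accuracy over random splits."""
    n = len(y)
    n_test = int(round(test_frac * n))
    scores = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        pred = fit_predict(features[train], y[train], features[test], n_classes)
        scores.append(np.mean(pred == y[test]))
    return float(np.mean(scores))


def permute_within_strata(x, z, rng):
    """Shuffle x inside each level of z, keeping the x-z relationship."""
    x_perm = x.copy()
    for level in np.unique(z):
        members = np.flatnonzero(z == level)
        x_perm[members] = x[rng.permutation(members)]
    return x_perm


def cci_sketch(y, x, z, n_perm=100, seed=0):
    """Permutation p-value for H0: y is independent of x given z.

    Hypothetical helper for categorical y, x, and z only.
    """
    rng = np.random.default_rng(seed)
    x_levels, z_levels = np.unique(x), np.unique(z)
    classes, y_codes = np.unique(y, return_inverse=True)

    def design(x_col):
        # Intercept plus additive one-hot encodings of x and z.
        return np.hstack([
            np.ones((len(x_col), 1)),
            one_hot(x_col, x_levels),
            one_hot(z, z_levels),
        ])

    observed = mccv_accuracy(design(x), y_codes, len(classes), rng)
    null_scores = np.array([
        mccv_accuracy(design(permute_within_strata(x, z, rng)),
                      y_codes, len(classes), rng)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the p-value strictly positive.
    return float((1 + np.sum(null_scores >= observed)) / (n_perm + 1))
```

Calling `cci_sketch(y, x, z)` on three equal-length categorical arrays returns a p-value; small values suggest that `x` improves prediction of `y` beyond what `z` provides. Permuting within levels of `z` only works when the conditioning set is discrete, so handling the mixed continuous and categorical conditioning sets described in the abstract would need a different scheme.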
Pages: 22