Software application profile: tpc and micd-R packages for causal discovery with incomplete cohort data

Cited by: 0
Authors
Andrews, Ryan M. [1, 2]
Bang, Christine W. [2, 3]
Didelez, Vanessa [2, 3]
Witte, Janine [2]
Foraita, Ronja [2]
Affiliations
[1] Boston Univ, Dept Epidemiol, Boston, MA USA
[2] Leibniz Inst Prevent Res & Epidemiol BIPS, Dept Biometry & Data Management, Achterstr 30, D-28359 Bremen, Germany
[3] Univ Bremen, Dept Math & Comp Sci, Bremen, Germany
Funding
US National Institutes of Health
Keywords
Causal discovery; R; cohort studies; missing data; longitudinal data; Bayesian networks; graphical models; inference; obesity
DOI
10.1093/ije/dyae113
Chinese Library Classification number
R1 [Preventive medicine, hygiene]
Discipline classification code
1004; 120402
Abstract
Motivation: The Peter Clark (PC) algorithm is a popular causal discovery method for learning causal graphs in a data-driven way. Until recently, existing PC algorithm implementations in R had important limitations regarding missing values, temporal structure and mixed measurement scales (categorical/continuous), all of which are common features of cohort data. The new R packages presented here, micd and tpc, fill these gaps.
Implementation: micd and tpc are R packages.
General features: The micd package adds functionality for handling missing values to the existing pcalg R package, including methods for multiple imputation that rely on the Missing At Random assumption. micd also allows for mixed measurement scales, assuming conditional Gaussianity. The tpc package exploits temporal information efficiently, yielding more informative output that is less prone to statistical errors.
Availability: The tpc and micd packages are freely available on the Comprehensive R Archive Network (CRAN). Their source code is also available on GitHub (https://github.com/bips-hb/micd; https://github.com/bips-hb/tpc).
Pages: 5
Related papers
27 in total
  • [1] Understanding and preventing childhood obesity and related disorders - IDEFICS: A European multilevel epidemiological approach
    Ahrens, W.
    Bammann, K.
    de Henauw, S.
    Halford, J.
    Palou, A.
    Pigeot, I.
    Siani, A.
    Sjostrom, M.
    [J]. NUTRITION METABOLISM AND CARDIOVASCULAR DISEASES, 2006, 16 (04) : 302 - 308
  • [2] Cohort Profile: The transition from childhood to adolescence in European children-how I.Family extends the IDEFICS cohort
    Ahrens, W.
    Siani, A.
    Adan, R.
    De Henauw, S.
    Eiben, G.
    Gwozdz, W.
    Hebestreit, A.
    Hunsberger, M.
    Kaprio, J.
    Krogh, V.
    Lissner, L.
    Molnar, D.
    Moreno, L. A.
    Page, A.
    Pico, C.
    Reisch, L.
    Smith, R. M.
    Tornaritis, M.
    Veidebaum, T.
    Williams, G.
    Pohlabeln, H.
    Pigeot, I.
    [J]. INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2017, 46 (05) : 1394 - +
  • [3] Andrews B, 2020, PR MACH LEARN RES, V108, P4002
  • [4] Scoring Bayesian networks of mixed variables
    Andrews, Bryan
    Ramsey, Joseph
    Cooper, Gregory F.
    [J]. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2018, 6 (01) : 3 - 18
  • [5] Andrews RM, 2023, Arxiv, DOI 10.48550/arXiv.2108.13395
  • [6] Bang CW, 2024, Arxiv, DOI arXiv:2406.19503
  • [7] Bang CW, 2023, PR MACH LEARN RES, V216, P119
  • [8] Boettcher S.G., 2003, J. Stat. Software, V8, P1, DOI 10.18637/jss.v008.i20
  • [9] Chen Q., 2023, tfci
  • [10] Invited commentary: where do the causal DAGS come from?
    Didelez, Vanessa
    [J]. AMERICAN JOURNAL OF EPIDEMIOLOGY, 2024, 193 (08) : 1075 - 1078