Adversarial Missingness Attacks on Causal Structure Learning

Times Cited: 0
Authors
Koyuncu, Deniz [1]
Gittens, Alex [1]
Yener, Bulent [1]
Yung, Moti [2,3]
Affiliations
[1] Rensselaer Polytechnic Institute, Troy, NY 12180, USA
[2] Google LLC, New York, NY, USA
[3] Columbia University, New York, NY, USA
Keywords
Causal ML; Causal Structure Learning; Missing Data; Adversarial ML; Data Poisoning
DOI
10.1145/3682065
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Causality-informed machine learning has been proposed as an avenue for achieving many of the goals of modern machine learning, from ensuring generalization under domain shifts to attaining fairness, robustness, and interpretability. A key component of causal machine learning is the inference of causal structures from observational data; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate structural causal models (SCMs). However, when the data can be audited for correctness (e.g., it is cryptographically signed by its source), this adversarial mechanism is invalidated. This work introduces a novel attack methodology wherein the adversary deceptively omits a portion of the true training data to bias the learned causal structures in a desired manner (under strict validation of signed samples, this behavior seems to be the only strategy available to the adversary). Under this model, theoretically sound attack mechanisms are derived for the case of arbitrary SCMs, and a sample-efficient learning-based heuristic is given. Experimental validation of these approaches on real and synthetic datasets, across a range of SCMs from the family of additive noise models (linear Gaussian, linear non-Gaussian, and non-linear Gaussian), demonstrates the effectiveness of adversarial missingness attacks at deceiving popular causal structure learning algorithms.
Pages: 60
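
To make the attack model described in the abstract concrete, below is a minimal, self-contained Python sketch of an adversarial missingness attack in the bivariate case. Everything here is illustrative: the thresholded-correlation "structure learner" is a toy stand-in assumed for the demo, not one of the algorithms studied in the paper, and the deletion rule (hiding the samples whose values most strongly support the true dependence) is a simple heuristic, not the paper's derived attack mechanism.

import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Ground-truth SCM, a linear Gaussian additive noise model: X causes Y.
x = rng.normal(size=n)
y = 0.8 * x + 0.5 * rng.normal(size=n)

def has_edge(xs, ys, thresh=0.1):
    """Toy stand-in for a structure learner: declare an X-Y edge
    iff the absolute sample correlation exceeds a threshold."""
    return abs(np.corrcoef(xs, ys)[0, 1]) > thresh

print("edge detected on the full data:", has_edge(x, y))  # expected: True

# Adversarial missingness: the adversary cannot alter or forge samples
# (they are signed), but it may hide some of them. Hiding the rows whose
# products x*y most strongly support the positive dependence drives the
# complete-case correlation toward zero.
order = np.argsort(x * y)  # rows sorted from least to most dependence-supporting
for hidden_frac in np.arange(0.0, 0.7, 0.02):
    keep = order[: int((1.0 - hidden_frac) * n)]
    if not has_edge(x[keep], y[keep]):
        print(f"edge suppressed after hiding {hidden_frac:.0%} of the samples")
        break
else:
    print("attack failed within the allotted deletion budget")

On typical seeds, hiding well under half of the samples suffices to push the complete-case correlation below the detection threshold, which illustrates why a purely subtractive adversary remains dangerous even when every delivered sample is authentic and cryptographically signed.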