Testing Causal Theories with Learned Proxies

Cited by: 19

Authors:

Knox, Dean [1]

Lucas, Christopher [2,3]

Cho, Wendy K. Tam [4,5,6,7,8,9,10]

Affiliations:

[1] Univ Penn, Wharton Sch, Operat Informat & Decis Dept & Analyt Wharton, Philadelphia, PA 19104 USA

[2] Washington Univ, Dept Polit Sci, St Louis, MO 63110 USA

[3] Washington Univ, Div Computat & Data Sci, St Louis, MO 63110 USA

[4] Univ Illinois, Dept Polit Sci, Champaign, IL USA

[5] Univ Illinois, Dept Stat, Champaign, IL USA

[6] Univ Illinois, Dept Math, Champaign, IL USA

[7] Univ Illinois, Dept Comp Sci, Champaign, IL USA

[8] Univ Illinois, Dept Asian Amer Studies, Champaign, IL USA

[9] Univ Illinois, Coll Law, Champaign, IL USA

[10] Univ Illinois, Natl Ctr Supercomp Applicat, Champaign, IL USA

Source:

ANNUAL REVIEW OF POLITICAL SCIENCE | 2022 / Vol. 25

Keywords:

causal inference; machine learning; supervised learning; measurement; proxies; DIRECTED ACYCLIC GRAPHS; MODEL; DEMOCRACY; BIAS; TEXT;

DOI:

10.1146/annurev-polisci-051120-111443

Chinese Library Classification (CLC) number:

D0 [Political science, political theory];

Discipline classification code:

0302 ; 030201 ;

Abstract:

Social scientists commonly use computational models to estimate proxies of unobserved concepts, then incorporate these proxies into subsequent tests of their theories. The consequences of this practice, which occurs in over two-thirds of recent computational work in political science, are underappreciated. Imperfect proxies can reflect noise and contamination from other concepts, producing biased point estimates and standard errors. We demonstrate how analysts can use causal diagrams to articulate theoretical concepts and their relationships to estimated proxies, then apply straightforward rules to assess which conclusions are rigorously supportable. We formalize and extend common heuristics for "signing the bias" (a technique for reasoning about unobserved confounding) to scenarios with imperfect proxies. Using these tools, we demonstrate how, in often-encountered research settings, proxy-based analyses allow for valid tests for the existence and direction of theorized effects. We conclude with best-practice recommendations for the rapidly growing literature using learned proxies to test causal theories.
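The abstract's claim that a noisy proxy can bias a point estimate while still supporting a test of an effect's direction can be illustrated with a small simulation. This is only a sketch of the simplest textbook case (classical measurement error, where the proxy equals the concept plus independent noise), not the authors' framework; the function names `ols_slope` and `simulate`, the effect size, and the noise level are all assumptions chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, true_effect=2.0, noise_sd=1.0, seed=0):
    """Compare the slope using the true concept vs. a noisy proxy of it.

    Assumption (hypothetical setup): the proxy is the concept plus
    independent Gaussian noise, i.e. classical measurement error.
    """
    rng = random.Random(seed)
    concept = [rng.gauss(0, 1) for _ in range(n)]              # unobserved concept
    outcome = [true_effect * c + rng.gauss(0, 1) for c in concept]
    proxy = [c + rng.gauss(0, noise_sd) for c in concept]      # imperfect proxy
    return ols_slope(concept, outcome), ols_slope(proxy, outcome)


if __name__ == "__main__":
    oracle, proxied = simulate()
    print(f"slope using the concept itself: {oracle:.2f}")
    print(f"slope using the noisy proxy:    {proxied:.2f}")
```

Under these assumptions the proxy-based slope shrinks toward zero by the factor var(concept) / (var(concept) + var(noise)), which is 1/2 here, but it keeps the sign of the true effect. When the proxy is contaminated by other concepts, as the abstract warns, this simple pattern need not hold.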


Pages: 419-441

Page count: 23

