Optimal Data-Driven Regression Discontinuity Plots

Cited by: 228
Authors
Calonico, Sebastian [1]
Cattaneo, Matias D. [2]
Titiunik, Rocio [3]
Affiliations
[1] Univ Miami, Dept Econ, Coral Gables, FL 33124 USA
[2] Univ Michigan, Dept Econ, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Dept Polit Sci, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation
Keywords
Binning; Partitioning; RD plots; Tuning parameter selection; ASYMPTOTIC NORMALITY; CONVERGENCE-RATES; INFERENCE; DESIGNS; LIFE;
DOI
10.1080/01621459.2015.1017578
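Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes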
020208 ; 070103 ; 0714 ;
Abstract
Exploratory data analysis plays a central role in applied statistics and econometrics. In the popular regression-discontinuity (RD) design, the use of graphical analysis has been strongly advocated because it provides both easy presentation and transparent validation of the design. RD plots are nowadays widely used in applications, despite their formal properties being unknown: these plots are typically presented employing ad hoc choices of tuning parameters, which makes these procedures less automatic and more subjective. In this article, we formally study the most common RD plot based on an evenly spaced binning of the data, and propose several (optimal) data-driven choices for the number of bins depending on the goal of the researcher. These RD plots are constructed either to approximate the underlying unknown regression functions without imposing smoothness in the estimator, or to approximate the underlying variability of the raw data while smoothing out the otherwise uninformative scatterplot of the data. In addition, we introduce an alternative RD plot based on quantile spaced binning, study its formal properties, and propose similar (optimal) data-driven choices for the number of bins. The main proposed data-driven selectors employ spacings estimators, which are simple and easy to implement in applications because they do not require additional choices of tuning parameters. Altogether, our results offer an array of alternative RD plots that are objective and automatic when implemented, providing a reliable benchmark for graphical analysis in RD designs. We illustrate the performance of our automatic RD plots using several empirical examples and a Monte Carlo study. All results are readily available in R and STATA using the software packages described in Calonico, Cattaneo, and Titiunik. Supplementary materials for this article are available online.
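The two binning schemes named in the abstract (evenly spaced and quantile spaced) can be illustrated with a minimal Python sketch. It assumes the number of bins on each side of the cutoff is supplied by the user; it does not implement the article's data-driven selectors or spacings estimators, and the function names `bin_means_one_side` and `rd_plot_points` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def bin_means_one_side(x, y, n_bins, spacing="even"):
    """Local sample means of y over bins of x on one side of the cutoff.

    spacing="even":     bins of equal length over [min(x), max(x)].
    spacing="quantile": bins holding roughly equal numbers of observations.
    n_bins is an assumed user input, not a data-driven choice.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if spacing == "even":
        edges = np.linspace(x.min(), x.max(), n_bins + 1)
    elif spacing == "quantile":
        edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        raise ValueError("spacing must be 'even' or 'quantile'")

    # Map each observation to a bin index in 0..n_bins-1;
    # the maximum of x is clipped into the last bin.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)

    centers, means = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():  # skip empty bins (possible with ties or sparse data)
            centers.append(0.5 * (edges[b] + edges[b + 1]))
            means.append(y[mask].mean())
    return np.array(centers), np.array(means)


def rd_plot_points(x, y, cutoff, n_bins_left, n_bins_right, spacing="even"):
    """Binned means computed separately to the left and right of the cutoff."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    left = x < cutoff
    return (
        bin_means_one_side(x[left], y[left], n_bins_left, spacing),
        bin_means_one_side(x[~left], y[~left], n_bins_right, spacing),
    )
```

Plotting the returned bin centers against the bin means, separately on each side of the cutoff, gives the dots of an RD plot. Binning each side separately keeps the potential jump at the cutoff from being averaged away.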
Pages: 1753-1769
Number of pages: 17
Related papers
31 in total
[1]   Large sample properties of matching estimators for average treatment effects [J].
Abadie, A ;
Imbens, GW .
ECONOMETRICA, 2006, 74 (01) :235-267
[2]  
[Anonymous], ANNAL EC STAT
[3]   GAUSSIAN LIMITS FOR GENERALIZED SPACINGS [J].
Baryshnikov, Yu. ;
Penrose, Mathew D. ;
Yukich, J. E. .
ANNALS OF APPLIED PROBABILITY, 2009, 19 (01) :158-185
[4]   Some new asymptotic theory for least squares series: Pointwise and uniform results [J].
Belloni, Alexandre ;
Chernozhukov, Victor ;
Chetverikov, Denis ;
Kato, Kengo .
JOURNAL OF ECONOMETRICS, 2015, 186 (02) :345-366
[5]  
Bertanha M., 2014, 20773 NBER
[6]  
Calonico S., 2014, ECONOMETRICA S, V82
[7]  
Calonico S, 2015, R J, V7, P38
[8]   Robust data-driven inference in the regression-discontinuity design [J].
Calonico, Sebastian ;
Cattaneo, Matias D. ;
Titiunik, Rocio .
STATA JOURNAL, 2014, 14 (04) :909-946
[9]   ROBUST NONPARAMETRIC CONFIDENCE INTERVALS FOR REGRESSION-DISCONTINUITY DESIGNS [J].
Calonico, Sebastian ;
Cattaneo, Matias D. ;
Titiunik, Rocio .
ECONOMETRICA, 2014, 82 (06) :2295-2326
[10]   Randomization Inference in the Regression Discontinuity Design: An Application to Party Advantages in the U.S. Senate [J].
Cattaneo, Matias D. ;
Frandsen, Brigham R. ;
Titiunik, Rocio .
JOURNAL OF CAUSAL INFERENCE, 2015, 3 (01) :1-24