Comparing Random Forest with Logistic Regression for Predicting Class-Imbalanced Civil War Onset Data

Cited by: 155
Authors
Muchlinski, David [1]
Siroky, David [2]
He, Jingrui [3]
Kocher, Matthew [4]
Affiliations
[1] Univ Glasgow, Sch Social & Polit Sci, Glasgow, Lanark, Scotland
[2] Arizona State Univ, Dept Polit Sci, Tempe, AZ USA
[3] Arizona State Univ, Dept Comp Sci & Engn, Tempe, AZ 85287 USA
[4] Yale Univ, Dept Polit Sci, New Haven, CT USA
Keywords
CONFLICT; MODEL; DEPENDENCIES; SEPARATION
DOI
10.1093/pan/mpv024
Chinese Library Classification number
D0 [Political science, political theory]
Discipline classification code
0302; 030201
Abstract
The most commonly used statistical models of civil war onset fail to correctly predict most occurrences of this rare event in out-of-sample data. Statistical methods for the analysis of binary data, such as logistic regression, even in their rare event and regularized forms, perform poorly at prediction. We compare the performance of Random Forests with three versions of logistic regression (classic logistic regression, Firth rare events logistic regression, and L1-regularized logistic regression), and find that the algorithmic approach provides significantly more accurate predictions of civil war onset in out-of-sample data than any of the logistic regression models. The article discusses these results and the ways in which algorithmic statistical methods like Random Forests can be useful to more accurately predict rare events in conflict data.
Pages: 87-103
Page count: 17