Does big data serve policy? Not without context. An experiment with in silico social science

Cited by: 2
Authors
Graziul, Chris [1]
Belikov, Alexander [1]
Chattopadyay, Ishanu [1]
Chen, Ziwen [1]
Fang, Hongbo [2]
Girdhar, Anuraag [1]
Jia, Xiaoshuang [3]
Krafft, P. M. [4]
Kleiman-Weiner, Max [5, 6]
Lewis, Candice [1]
Liang, Chen [1]
Muchovej, John [5, 6]
Vientos, Alejandro [5, 7]
Young, Meg [8]
Evans, James [1, 9]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Sun Yat Sen Univ, Guangzhou, Peoples R China
[4] Univ Oxford, Oxford, England
[5] MIT, Cambridge, MA USA
[6] Harvard Univ, Cambridge, MA USA
[7] Rutgers State Univ, New Brunswick, NJ USA
[8] Cornell Univ, Ithaca, NY USA
[9] Santa Fe Inst, Santa Fe, NM 87501 USA
Keywords
Computational social science; Simulated societies; Policy; Quantitative social science; Machine learning; Deep learning; Simulation; NETWORKS; TEAMWORK
DOI
10.1007/s10588-022-09362-3
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The DARPA Ground Truth project sought to evaluate social science by constructing four varied simulated social worlds with hidden causality and unleashing teams of scientists to collect data, discover their causal structure, predict their future, and prescribe policies to create desired outcomes. This large-scale, long-term experiment in in silico social science, in which the ground truth of the simulated worlds was known, but not by us, reveals the limits of contemporary quantitative social science methodology. First, problem solving without a shared ontology, in which many world characteristics remain existentially uncertain, poses strong limits to quantitative analysis even when scientists share a common task, and suggests how they could become insurmountable without it. Second, data labels biased the associations our analysts made and the assumptions they employed, often away from the simulated causal processes those labels signified, suggesting limits on the degree to which analytic concepts developed in one domain may port to others. Third, the current standard for computational social science publication is a demonstration of novel causes, but this limits the relevance of models for solving problems and proposing policies that benefit from the simpler and less surprising answers associated with the most important causes, or the combination of all causes. Fourth, most singular quantitative methods applied on their own did not help to solve most analytical challenges, and we explored a range of established and emerging methods, including probabilistic programming, deep neural networks, systems of predictive probabilistic finite state machines, and more, to achieve plausible solutions. However, despite these limitations common to the current practice of computational social science, we find on the positive side that even imperfect knowledge can be sufficient to identify robust prediction if a more pluralistic approach is applied. Applying competing approaches by distinct subteams, including at one point the vast TopCoder.com global community of problem solvers, enabled discovery of many aspects of the relevant structure underlying worlds that singular methods could not. Together, these lessons suggest how different a policy-oriented computational social science would be from the computational social science we have inherited. Computational social science that serves policy would need to endure more failure, sustain more diversity, maintain more uncertainty, and allow for more complexity than current institutions support.
Pages: 188-219
Page count: 32
Related papers
18 in total
  • [1] Does big data serve policy? Not without context. An experiment with in silico social science
    Chris Graziul
    Alexander Belikov
    Ishanu Chattopadyay
    Ziwen Chen
    Hongbo Fang
    Anuraag Girdhar
    Xiaoshuang Jia
    P. M. Krafft
    Max Kleiman-Weiner
    Candice Lewis
    Chen Liang
    John Muchovej
    Alejandro Vientós
    Meg Young
    James Evans
    Computational and Mathematical Organization Theory, 2023, 29 : 188 - 219
  • [2] Social Science in the Era of Big Data
    Gonzalez-Bailon, Sandra
    POLICY AND INTERNET, 2013, 5 (02) : 147 - 160
  • [3] Big Data Research for Social Science and Social Impact
    Lytras, Miltiadis D.
    Visvizi, Anna
    SUSTAINABILITY, 2020, 12 (01)
  • [4] Big Data under the Microscope and Brains in Social Context: Integrating Methods from Computational Social Science and Neuroscience
    O'Donnell, Matthew Brook
    Falk, Emily B.
    ANNALS OF THE AMERICAN ACADEMY OF POLITICAL AND SOCIAL SCIENCE, 2015, 659 (01) : 274 - 289
  • [5] Perspectives on Policy and the Value of Nursing Science in a Big Data Era
    Gephart, Sheila M.
    Davis, Mary
    Shea, Kimberly
    NURSING SCIENCE QUARTERLY, 2018, 31 (01) : 78 - 81
  • [6] Ethical Issues in Social Science Research Employing Big Data
    Mohammad Hosseini
    Michał Wieczorek
    Bert Gordijn
    Science and Engineering Ethics, 2022, 28
  • [7] Ethical Issues in Social Science Research Employing Big Data
    Hosseini, Mohammad
    Wieczorek, Michal
    Gordijn, Bert
    SCIENCE AND ENGINEERING ETHICS, 2022, 28 (03)
  • [8] Big data in social and psychological science: theoretical and methodological issues
    Qiu L.
    Chan S.H.M.
    Chan D.
    Journal of Computational Social Science, 2018, 1 (1) : 59 - 66
  • [9] Sociology in the Era of Big Data: The Ascent of Forensic Social Science
    McFarland D.A.
    Lewis K.
    Goldberg A.
    The American Sociologist, 2016, 47 (1) : 12 - 35
  • [10] Understanding the paradigm shift to computational social science in the presence of big data
    Chang, Ray M.
    Kauffman, Robert J.
    Kwon, YoungOk
    DECISION SUPPORT SYSTEMS, 2014, 63 : 67 - 80