THE CRITICAL CHALLENGE OF USING LARGE-SCALE DIGITAL EXPERIMENT PLATFORMS FOR SCIENTIFIC DISCOVERY

Cited by: 0
Authors
Abbasi, Ahmed [1 ]
Somanchi, Sriram [1 ]
Kelley, Ken [1 ]
Affiliations
[1] Department of Information Technology, Analytics, and Operations, Mendoza College of Business, University of Notre Dame, South Bend, IN
Source
MIS Quarterly: Management Information Systems | 2025, Vol. 49, No. 1
Keywords
causal inference; large-scale digital experimentation; machine learning; online controlled experiments; research approach; type of research
DOI
10.25300/MISQ/2024/18201
Abstract
Robust digital experimentation platforms have become increasingly pervasive at major technology and e-commerce firms worldwide. They allow product managers to make data-driven decisions through online controlled experiments that estimate the average treatment effect (ATE) relative to a status quo control setting and support the associated inferences. As demand for experiments continues to grow, orthogonal test planes (OTPs) have become the industry standard for managing the assignment of users to multiple concurrent experimental treatments at companies operating large-scale digital experimentation platforms. In recent years, firms have begun to recognize that test planes might be confounding experimental results but have nevertheless judged the practical benefits to outweigh the costs. However, the uptick in practitioner-led digital experiments has coincided with an increase in academic-industry research partnerships, in which large-scale digital experiments are used to scientifically answer research questions, validate design choices, and/or derive computational social science-based empirical insights. In such contexts, confounding and biased estimation may have much more pronounced implications for the validity of scientific findings, contributions to theory, the building of a cumulative literature, and ultimately practice. The purpose of this Issues and Opinions article is to shed light on OTPs; in our experience, most researchers are unaware of how such test planes can lead to incorrect inferences. We use a case study conducted at a major e-commerce company to illustrate the extent to which interactions among concurrent experiments can bias ATEs, often making them appear more positive than they actually are. We discuss implications for research, including the distinction between practical industry experiments and academic research, methodological best practices for mitigating such concerns, and transparency and reproducibility considerations stemming from the complexity and opacity of large-scale experimentation platforms. More broadly, we worry that confounding in scientific research due to reliance on large-scale digital experiments meant to serve a different purpose is a microcosm of a larger epistemological confounding regarding what constitutes a contribution to scientific knowledge. ©2025. The Authors.
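As a rough illustration of the mechanism the abstract describes, the following minimal sketch simulates two concurrently running experiments whose users are randomized independently (as under orthogonal test planes) and shows how a naive per-experiment ATE absorbs part of a treatment interaction. The outcome model, effect sizes, and variable names are hypothetical and chosen only for illustration; they are not taken from the paper or its case study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # hypothetical number of users

# Orthogonal test planes: each user is independently assigned (50/50)
# in experiment A and in experiment B.
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)

# Hypothetical outcome model: a small main effect of A, no main effect
# of B, and a positive A x B interaction that a single-experiment
# analysis silently absorbs.
tau_a, tau_b, tau_ab = 0.01, 0.00, 0.04
y = 1.0 + tau_a * a + tau_b * b + tau_ab * a * b + rng.normal(0, 1, n)

# Naive per-experiment ATE for A: difference in means, ignoring B.
naive_ate_a = y[a == 1].mean() - y[a == 0].mean()

# Interaction-aware comparison: ATE of A among users held at B's control.
ate_a_given_b0 = y[(a == 1) & (b == 0)].mean() - y[(a == 0) & (b == 0)].mean()

print(f"Naive ATE for A (ignores B):  {naive_ate_a:.4f}")    # ~ tau_a + 0.5 * tau_ab
print(f"ATE for A with B at control:  {ate_a_given_b0:.4f}")  # ~ tau_a
```

With a 50/50 split in experiment B, the naive estimate converges to roughly tau_a + 0.5 * tau_ab, so it looks more positive than the true main effect of A, matching the direction of bias the abstract reports for interacting concurrent experiments.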
Pages: 1-28 (27 pages)