Passive data collection on Reddit: a practical approach

Cited by: 7
Authors
Rocha-Silva, Tiago [1, 3]
Nogueira, Conceicao [2 ]
Rodrigues, Liliana [2 ]
Affiliations
[1] Univ Porto, Fac Psicol & Ciencias Educ, Porto, Portugal
[2] Univ Porto, Fac Psicol & Ciencias Educ, Porto, Portugal
[3] Univ Porto, Univ Psicol & Ciencias Educ, Rua Alfredo Allen, P-4200135 Porto, Portugal
Keywords
Online data collection; online research; Reddit; research ethics; social media; social science research; ANIMAL-WELFARE; ONE HEALTH; WILDLIFE CONSERVATION; ETHICAL PRINCIPLES; BEHAVIOR; SCIENCE; TEMPERAMENT; SUSTAINABILITY; CONSEQUENCES; STEWARDSHIP
DOI
10.1177/17470161231210542
Chinese Library Classification (CLC) number
B82 [Ethics (moral philosophy)]
Abstract
Since the emergence of social media, scholars have characterized it as a valuable source for data collection because it offers several benefits (e.g. exploring research questions with hard-to-reach populations). Nonetheless, methods of online data collection are riddled with ethical and methodological challenges that researchers must address if they want to adopt good practices when collecting and analyzing online data. Drawing on our primary research project, in which we collected passive online data on Reddit, we explore and detail the steps that researchers must consider before collecting online data: (1) planning online data collection; (2) ethical considerations; and (3) data collection. We also discuss two atypical questions that researchers should consider: (1) how to handle deleted user-generated content; and (2) how to quote user-generated content. Moving beyond the dichotomous discussion of what is public and what is private data, we present recommendations for good practices when collecting and analyzing qualitative online data.
Pages: 453-470
Page count: 18