How Data Science Workers Work with Data: Discovery, Capture, Curation, Design, Creation

Cited by: 147
Authors
Muller, Michael [1]
Lange, Ingrid [1]
Wang, Dakuo [1]
Piorkowski, David [1]
Tsay, Jason [1]
Liao, Q. Vera [1]
Dugan, Casey [1]
Erickson, Thomas
Affiliations
[1] IBM Res, Cambridge, MA 02142 USA
Source
CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS | 2019
Keywords
Data science; work-practices; data discovery; data capture; data curation; data design; data creation; grounded theory; BIG DATA; MATERIALITY
DOI
10.1145/3290605.3300356
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
With the rise of big data, there has been an increasing need for practitioners in this space and an increasing opportunity for researchers to understand their workflows and design new tools to improve them. Data science is often described as data-driven, comprising unambiguous data and proceeding through regularized steps of analysis. However, this view focuses more on abstract processes, pipelines, and workflows, and less on how data science workers engage with the data. In this paper, we build on the work of other CSCW and HCI researchers in describing the ways that scientists, scholars, engineers, and others work with their data, through analyses of interviews with 21 data science professionals. We set five approaches to data along a dimension of interventions: Data as given; as captured; as curated; as designed; and as created. Data science workers develop an intuitive sense of their data and processes, and actively shape their data. We propose new ways to apply these interventions analytically, to make sense of the complex activities around data practices.
Pages: 14