What's Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities

Cited by: 82
Authors
Chattopadhyay, Souti [1]
Prasad, Ishita [2]
Henley, Austin Z. [3]
Sarma, Anita [1]
Barik, Titus [2]
Affiliations
[1] Oregon State Univ, Corvallis, OR 97331 USA
[2] Microsoft, Redmond, WA USA
[3] Univ Tennessee, Knoxville, TN USA
Source
PROCEEDINGS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'20) | 2020
Funding
National Science Foundation (USA)
Keywords
Computational notebooks; challenges; data science; interviews; pain points; survey;
DOI
10.1145/3313831.3376729
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Computational notebooks, such as Azure, Databricks, and Jupyter, are a popular, interactive paradigm for data scientists to author code, analyze data, and interleave visualizations, all within a single document. Nevertheless, as data scientists incorporate more of their activities into notebooks, they encounter unexpected difficulties, or pain points, that impact their productivity and disrupt their workflow. Through a systematic, mixed-methods study using semi-structured interviews (n = 20) and a survey (n = 156) with data scientists, we catalog nine pain points when working with notebooks. Our findings suggest that data scientists face numerous pain points throughout the entire workflow, from setting up notebooks to deploying to production, across many notebook environments. Our data scientists report essential notebook requirements, such as supporting data exploration and visualization. The results of our study inform and inspire the design of computational notebooks.
Pages: 12