The Story in the Notebook: Exploratory Data Science using a Literate Programming Tool

Cited by: 122
Authors
Kery, Mary Beth [1]
Radensky, Marissa [2]
Arya, Mahima [1]
John, Bonnie E. [3]
Myers, Brad A. [1]
Affiliations
[1] Carnegie Mellon Univ, Human Comp Interact Inst, Pittsburgh, PA 15213 USA
[2] Amherst Coll, Amherst, MA 01002 USA
[3] Bloomberg LP, New York, NY USA
Source
PROCEEDINGS OF THE 2018 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2018) | 2018
Funding
National Science Foundation (USA)
Keywords
Literate Programming; Exploratory Programming; Data Science; End-User Programmers (EUP); End-User Software Engineering (EUSE)
DOI
10.1145/3173574.3173748
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Literate programming tools are used by millions of programmers today, and are intended to facilitate presenting data analyses in the form of a narrative. We interviewed 21 data scientists to study coding behaviors in a literate programming environment and how data scientists kept track of variants they explored. For participants who tried to keep a detailed history of their experimentation, both informal and formal versioning attempts led to problems, such as reduced notebook readability. During iteration, participants actively curated their notebooks into narratives, although primarily through cell structure rather than markdown explanations. Next, we surveyed 45 data scientists and asked them to envision how they might use their past history in a future version control system. Based on these results, we give design guidance for future literate programming tools, such as providing history search based on how programmers recall their explorations, through contextual details including images and parameters.
Pages: 11