The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries

Cited by: 17
Authors
Geiger, R. Stuart [1]
Varoquaux, Nelle [1, 2]
Mazel-Cabasse, Charlotte [1]
Holdgraf, Chris [1, 3]
Affiliations
[1] Univ Calif Berkeley, Berkeley Inst Data Sci, 190 Doe Lib, Berkeley, CA 94730 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Helen Wills Neurosci Inst, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
Source
COMPUTER SUPPORTED COOPERATIVE WORK-THE JOURNAL OF COLLABORATIVE COMPUTING AND WORK PRACTICES | 2018, Vol. 27, Issue 3-6
Keywords
Documentation; Standards; Invisible work; Motivations; Peer production; Collaboration; Infrastructure; Ethnography; Open source; ORGANIZATIONAL PROCESS; WORK
DOI
10.1007/s10606-018-9333-1
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Computational research and data analytics increasingly rely on complex ecosystems of open source software (OSS) "libraries" - curated collections of reusable code that programmers import to perform a specific task. Software documentation for these libraries is crucial in helping programmers/analysts know what libraries are available and how to use them. Yet documentation for open source software libraries is widely considered low-quality. This article is a collaboration between CSCW researchers and contributors to data analytics OSS libraries, based on ethnographic fieldwork and qualitative interviews. We examine several issues concerning the formats, practices, and challenges of documentation in these largely volunteer-based projects. Many different kinds and formats of documentation exist around such libraries, and they play a variety of educational, promotional, and organizational roles. The work behind documentation is similarly multifaceted, including writing, reviewing, maintaining, and organizing documentation. Different aspects of documentation work require contributors to have different sets of skills and to overcome various social and technical barriers. Finally, most of our interviewees do not report high levels of intrinsic enjoyment for doing documentation work (compared to writing code). Their motivation is affected by personal and project-specific factors, such as the perceived level of credit for doing documentation work versus more 'technical' tasks like adding new features or fixing bugs. In studying documentation work for data analytics OSS libraries, we gain a new window into the changing practices of data-intensive research, and we help practitioners better understand how to support this often invisible and infrastructural work in their projects.
Pages: 767-802
Page count: 36