The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries

Cited by: 17
Authors
Geiger, R. Stuart [1]
Varoquaux, Nelle [1,2]
Mazel-Cabasse, Charlotte [1]
Holdgraf, Chris [1,3]
Affiliations
[1] Univ Calif Berkeley, Berkeley Inst Data Sci, 190 Doe Lib, Berkeley, CA 94730 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Helen Wills Neurosci Inst, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
Source
COMPUTER SUPPORTED COOPERATIVE WORK-THE JOURNAL OF COLLABORATIVE COMPUTING AND WORK PRACTICES | 2018 / Vol. 27 / Issue 3-6
Keywords
Documentation; Standards; Invisible work; Motivations; Peer production; Collaboration; Infrastructure; Ethnography; Open source; ORGANIZATIONAL PROCESS; WORK;
DOI
10.1007/s10606-018-9333-1
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Computational research and data analytics increasingly rely on complex ecosystems of open source software (OSS) "libraries": curated collections of reusable code that programmers import to perform a specific task. Software documentation for these libraries is crucial in helping programmers/analysts know what libraries are available and how to use them. Yet documentation for open source software libraries is widely considered low-quality. This article is a collaboration between CSCW researchers and contributors to data analytics OSS libraries, based on ethnographic fieldwork and qualitative interviews. We examine several issues concerning the formats, practices, and challenges of documentation in these largely volunteer-based projects. Many different kinds and formats of documentation exist around such libraries, and they play a variety of educational, promotional, and organizational roles. The work behind documentation is similarly multifaceted, including writing, reviewing, maintaining, and organizing documentation. Different aspects of documentation work require contributors to have different sets of skills and to overcome various social and technical barriers. Finally, most of our interviewees do not report high levels of intrinsic enjoyment in doing documentation work (compared to writing code). Their motivation is affected by personal and project-specific factors, such as the perceived level of credit for doing documentation work versus more 'technical' tasks like adding new features or fixing bugs. In studying documentation work for data analytics OSS libraries, we gain a new window into the changing practices of data-intensive research and help practitioners better understand how to support this often invisible and infrastructural work in their projects.
Pages: 767-802
Number of pages: 36
Related papers
50 in total
  • [41] Software development risk model - Applied to data from open-source Mozilla project
    Fawcett, JW
    Gungor, MK
    SERP '05: Proceedings of the 2005 International Conference on Software Engineering Research and Practice, Vols 1 and 2, 2005, : 640 - 645
  • [42] SCT: Spinal Cord Toolbox, an open-source software for processing spinal cord MRI data
    De Leener, Benjamin
    Levy, Simon
    Dupont, Sara M.
    Fonov, Vladimir S.
    Stikov, Nikola
    Collins, D. Louis
    Callot, Virginie
    Cohen-Adad, Julien
    NEUROIMAGE, 2017, 145 : 24 - 43
  • [43] Testing the water: detecting artificial water points using freely available satellite data and open source software
    Owen, Harry Jon Foord
    Duncan, Clare
    Pettorelli, Nathalie
    REMOTE SENSING IN ECOLOGY AND CONSERVATION, 2015, 1 (01) : 61 - 72
  • [44] Analyzing static structure of large software systems - Based on data from Open-Source Mozilla Project
    Fawcett, JW
    Gungor, MK
    Iyer, AV
    SERP '05: Proceedings of the 2005 International Conference on Software Engineering Research and Practice, Vols 1 and 2, 2005, : 491 - 496
  • [45] More eyes on the prize: open-source data, software and hardware for advancing plant science through collaboration
    Coleman, Guy R. Y.
    Salter, William T.
    AOB PLANTS, 2023, 15 (02):
  • [46] Equation-based and data-driven modeling: Open-source software current state and future directions
    Gunnell, LaGrande
    Nicholson, Bethany
    Hedengren, John D.
    COMPUTERS & CHEMICAL ENGINEERING, 2024, 181
  • [47] Energy hub optimization framework based on open-source software & data - review of frameworks and a concept for districts & industrial parks
    Groissböck M.
    International Journal of Sustainable Energy Planning and Management, 2021, 31 : 109 - 120
  • [48] An Efficient Workflow for Representing Real-world Urban Environments in Game Engines using Open-source Software and Data
    Badr, Arash Shahbaz
    De Amicis, Raffaele
    GRAPP: PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL 1: GRAPP, 2022, : 103 - 114
  • [49] Terrain characterization of small island using publicly available data and open-source software: a case study of Marinduque, Philippines
    Salvacion A.R.
    Modeling Earth Systems and Environment, 2016, 2 (1)
  • [50] The digital biomarker discovery pipeline: An open-source software platform for the development of digital biomarkers using mHealth and wearables data
    Bent, Brinnae
    Wang, Ke
    Grzesiak, Emilia
    Jiang, Chentian
    Qi, Yuankai
    Jiang, Yihang
    Cho, Peter
    Zingler, Kyle
    Ogbeide, Felix Ikponmwosa
    Zhao, Arthur
    Runge, Ryan
    Sim, Ida
    Dunn, Jessilyn
    JOURNAL OF CLINICAL AND TRANSLATIONAL SCIENCE, 2021, 5 (01)