A call for clean code to effectively communicate science

Cited by: 18
Authors
Filazzola, Alessandro [1, 2]
Lortie, C. J. [3, 4]
Affiliations
[1] Apex Resource Management Solut, Ottawa, ON, Canada
[2] Univ Toronto Mississauga, Ctr Urban Environm, Mississauga, ON, Canada
[3] York Univ, Dept Biol, Toronto, ON, Canada
[4] UCSB, Natl Ctr Ecol Anal & Synth, Santa Barbara, CA USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2022 / Volume 13 / Issue 10
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
open science; principles; programming; replication; reproducibility; science communication; transparency; MANAGEMENT; MODELS;
DOI
10.1111/2041-210X.13961
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
1. Effective coding is fundamental to the study of biology. Computation underpins most research, and reproducible science can be promoted through clean coding practices. Clean coding is crafting code design, syntax and nomenclature in a manner that maximizes the potential to communicate its intent with other scientists. However, computational biologists are not software engineers, and many of our coding practices have developed ad hoc without formal training, often creating difficult-to-read code for others. Hard-to-understand code can thus be limiting our efficiency and ability to communicate as scientists with one another.
2. The purpose of this paper is to provide a primer on some of the practices associated with crafting clean code by synthesizing a transformative text in software engineering along with recent articles on coding practices in computational biology. We review past recommendations to provide a series of best practices that transform coding into a human-accessible form of communication.
3. Three common themes shared in this synthesis are the following: (a) code has value and you are responsible for its organization to enable clear communication, (b) use a formatting style to guide writing code that is easily understandable and consistent and (c) apply abstraction to emphasize important elements and declutter.
4. While many of the provided practices and recommendations were developed with computational biologists in mind, we believe there is wider applicability to any biologist undertaking work in data management or statistical analyses. Clean code is thus a crucial step forward in resolving some of the crisis in reproducibility for science.
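As a rough illustration of the three themes listed in the abstract (organized code, a consistent formatting style, and abstraction), a minimal Python sketch might look like the following. It is not taken from the paper: the data, the site names, and the function names `mean_body_mass` and `summarize_sites` are all hypothetical and chosen only for this example.

```python
from statistics import mean

# Hypothetical example data (not from the paper): body mass in grams per survey site.
BODY_MASS_BY_SITE = {
    "grassland": [21.0, 19.0, 23.0],
    "shrubland": [30.0, 28.0],
}


def mean_body_mass(masses_g):
    """Return the mean body mass in grams for one site."""
    return mean(masses_g)


def summarize_sites(mass_by_site):
    """Map each site name to its mean body mass in grams."""
    return {site: mean_body_mass(masses) for site, masses in mass_by_site.items()}


# The top-level script reads as a single statement of intent;
# the arithmetic is abstracted away into small, named functions.
site_means = summarize_sites(BODY_MASS_BY_SITE)
```

Descriptive names carry the units and intent, the layout follows one style throughout, and the per-site calculation sits behind a named function so the final line states what is being computed rather than how.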
Pages: 2119-2128
Page count: 10