Learning a Metric for Code Readability

Cited by: 224
Authors
Buse, Raymond P. L. [1]
Weimer, Westley R. [1]
Affiliations
[1] Univ Virginia, Charlottesville, VA 22904 USA
Keywords
Software readability; program understanding; machine learning; software maintenance; code metrics; FindBugs; PROGRAM READABILITY; SOFTWARE;
DOI
10.1109/TSE.2009.70
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from 120 human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80 percent effective and better than a human, on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with three measures of software quality: code changes, automated defect reports, and defect log messages. We measure these correlations on over 2.2 million lines of code, as well as longitudinally, over many releases of selected projects. Finally, we discuss the implications of this study on programming language design and engineering practice. For example, our data suggest that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.
Pages: 546-558
Page count: 13