An empirical study of the textual similarity between source code and source code summaries

Cited by: 0
Authors
Paul W. McBurney
Collin McMillan
Affiliation
[1] University of Notre Dame, Department of Computer Science and Engineering
Source
Empirical Software Engineering | 2016 / Vol. 21
Keywords
Source code summarization; Documentation; Textual similarity; Automatic documentation generation
DOI
Not available
Abstract
Source code documentation often contains summaries of source code written by authors. Recently, automatic source code summarization tools have emerged that generate summaries without requiring author intervention. These summaries are designed for readers to be able to understand the high-level concepts of the source code. Unfortunately, there is no agreed-upon understanding of what makes up a “good summary.” This paper presents an empirical study examining summaries of source code written by authors, readers, and automatic source code summarization tools. This empirical study examines the textual similarity between source code and summaries of source code using Short Text Semantic Similarity metrics. We found that readers use source code in their summaries more than authors do. Additionally, this study finds that the accuracy of a human-written summary can be estimated by the textual similarity of that summary to the source code.
Pages: 17 - 42
Page count: 25
Related papers
50 in total
  • [31] Automatic Comment Generation using only Source Code
    Yildiz, Eren
    Ekin, Emine
    2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017
  • [32] Evaluating Source Code Summarization Techniques: Replication and Expansion
    Eddy, Brian P.
    Robinson, Jeffrey A.
    Kraft, Nicholas A.
    Carver, Jeffrey C.
    2013 IEEE 21ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC), 2013, : 13 - 22
  • [33] Retrieval-based Neural Source Code Summarization
    Zhang, Jian
    Wang, Xu
    Zhang, Hongyu
    Sun, Hailong
    Liu, Xudong
    2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020), 2020, : 1385 - 1397
  • [34] Automatic source code summarization with graph attention networks
    Zhou, Yu
    Shen, Juanjuan
    Zhang, Xiaoqing
    Yang, Wenhua
    Han, Tingting
    Chen, Taolue
    JOURNAL OF SYSTEMS AND SOFTWARE, 2022, 188
  • [35] Commenting source code: is it worth it for small programming tasks?
    Nielebock, Sebastian
    Krolikowski, Dariusz
    Krueger, Jacob
    Leich, Thomas
    Ortmeier, Frank
    EMPIRICAL SOFTWARE ENGINEERING, 2019, 24 (03) : 1418 - 1457
  • [36] Enhancing source code summarization from structure and semantics
    Lu, Xurong
    Niu, Jun
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [37] Naturalness in Source Code Summarization. How Significant is it?
    Ferretti, Claudio
    Saletta, Martina
    2023 IEEE/ACM 31ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC, 2023, : 125 - 134
  • [38] Automatically Redocumenting Source Code with Method and Class Stereotypes
    Guarnera, Drew T.
    Collard, Michael L.
    Dragan, Natalia
    Maletic, Jonathan I.
    Newman, Christian D.
    Decker, Michael J.
    2018 IEEE THIRD INTERNATIONAL WORKSHOP ON DYNAMIC SOFTWARE DOCUMENTATION (DYSDOC3), 2018, : 3 - 4
  • [39] Can method data dependencies support the assessment of traceability between requirements and source code?
    Kuang, Hongyu
    Maeder, Patrick
    Hu, Hao
    Ghabi, Achraf
    Huang, LiGuo
    Lue, Jian
    Egyed, Alexander
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2015, 27 (11) : 838 - 866
  • [40] FCSO: Source Code Summarization by Fusing Multiple Code Features and Ensuring Self-consistency Output
    Zhang, Donghua
    Lei, Gang
    Xiao, Jianmao
    Xu, Zhipeng
    Fan, Guodong
    Chen, Shizhan
    Cao, Yuanlong
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT II, 2024, 14488 : 112 - 129