An empirical study of the textual similarity between source code and source code summaries

Authors
Paul W. McBurney
Collin McMillan
Affiliation
[1] University of Notre Dame, Department of Computer Science and Engineering
Source
Empirical Software Engineering | 2016 / Vol. 21
Keywords
Source code summarization; Documentation; Textual similarity; Automatic documentation generation
Abstract
Source code documentation often contains summaries of source code written by authors. Recently, automatic source code summarization tools have emerged that generate summaries without requiring author intervention. These summaries are designed to help readers understand the high-level concepts of the source code. Unfortunately, there is no agreed-upon understanding of what makes up a “good summary.” This paper presents an empirical study examining summaries of source code written by authors, readers, and automatic source code summarization tools. The study examines the textual similarity between source code and summaries of source code using Short Text Semantic Similarity metrics. We found that readers use source code in their summaries more than authors do. Additionally, this study finds that the accuracy of a human-written summary can be estimated by the textual similarity of that summary to the source code.
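As a rough illustration of what "textual similarity between a summary and source code" can mean, the sketch below computes a simple token-overlap score. It is only an assumption-laden stand-in: the abstract does not say which Short Text Semantic Similarity metrics the study uses, and the function names `tokenize` and `overlap_similarity`, as well as the sample method and summaries, are hypothetical.

```python
import re

def tokenize(text):
    """Split text into a set of lowercase word tokens, breaking camelCase identifiers."""
    # Insert a space at lower-to-upper boundaries: "getUserName" -> "get User Name"
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return set(t.lower() for t in re.findall(r"[A-Za-z]+", spaced))

def overlap_similarity(code, summary):
    """Fraction of distinct summary tokens that also occur in the code (0.0 to 1.0)."""
    code_tokens = tokenize(code)
    summary_tokens = tokenize(summary)
    if not summary_tokens:
        return 0.0
    return len(summary_tokens & code_tokens) / len(summary_tokens)

# Hypothetical Java method and two candidate summaries (illustrative only).
code = "public String getUserName(int userId) { return users.get(userId).name; }"
summary_a = "Returns the user name for a given user id"
summary_b = "Looks up a person in the directory"

print(overlap_similarity(code, summary_a))  # shares "user", "name", "id" with the code
print(overlap_similarity(code, summary_b))  # shares no tokens with the code
```

A summary that reuses the identifiers of the method scores higher than one that paraphrases it with unrelated words, which is the kind of signal the abstract describes at a high level.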
Pages: 17-42
Page count: 25