An empirical study of the textual similarity between source code and source code summaries

Cited by: 0
Authors
Paul W. McBurney
Collin McMillan
Affiliation
[1] University of Notre Dame, Department of Computer Science and Engineering
Source
Empirical Software Engineering | 2016 / Vol. 21
Keywords
Source code summarization; Documentation; Textual similarity; Automatic documentation generation;
DOI
Not available
Abstract
Source code documentation often contains summaries of source code written by authors. Recently, automatic source code summarization tools have emerged that generate summaries without requiring author intervention. These summaries are designed to help readers understand the high-level concepts of the source code. Unfortunately, there is no agreed-upon understanding of what makes up a “good summary.” This paper presents an empirical study examining summaries of source code written by authors, readers, and automatic source code summarization tools. The study examines the textual similarity between source code and summaries of source code using Short Text Semantic Similarity metrics. We found that readers use source code in their summaries more than authors do. Additionally, this study finds that the accuracy of a human-written summary can be estimated by the textual similarity of that summary to the source code.
Pages: 17-42
Page count: 25
Related papers
50 in total
  • [1] An empirical study of the textual similarity between source code and source code summaries
    McBurney, Paul W.
    McMillan, Collin
    EMPIRICAL SOFTWARE ENGINEERING, 2016, 21 (01) : 17 - 42
  • [2] An Empirical Study to Evaluate Structural Similarity for Source Code Translation
    Yao, Xulu
    Yap, Moi Hoon
    Zhang, Yanlong
    2019 4TH TECHNOLOGY INNOVATION MANAGEMENT AND ENGINEERING SCIENCE INTERNATIONAL CONFERENCE (TIMES-ICON), 2019,
  • [3] Empirical Study of Transformers for Source Code
    Chirkova, Nadezhda
    Troshin, Sergey
    PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), 2021, : 703 - 715
  • [4] Mining traces between source code and textual documents
    Rasekh, Amir Hossein
    Fakhrahmad, Seyed Mostafa
    Sadreddini, Mohammad Hadi
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2019, 59 (01) : 43 - 52
  • [5] An empirical study on the maintenance of source code clones
    Suresh Thummalapenta
    Luigi Cerulo
    Lerina Aversano
    Massimiliano Di Penta
    Empirical Software Engineering, 2010, 15 : 1 - 34
  • [6] An empirical study on the maintenance of source code clones
    Thummalapenta, Suresh
    Cerulo, Luigi
    Aversano, Lerina
    Di Penta, Massimiliano
    EMPIRICAL SOFTWARE ENGINEERING, 2010, 15 (01) : 1 - 34
  • [7] An empirical study of the relationship between the concepts expressed in source code and dependence
    Binkley, David
    Gold, Nicolas
    Harman, Mark
    Li, Zheng
    Mahdavi, Kiarash
    JOURNAL OF SYSTEMS AND SOFTWARE, 2008, 81 (12) : 2287 - 2298
  • [8] Scalable Source Code Similarity Detection in Large Code Repositories
    Alomari, Firas
    Harbi, Muhammed
    EAI ENDORSED TRANSACTIONS ON SCALABLE INFORMATION SYSTEMS, 2019, 6 (22) : 1 - 11
  • [9] Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study
    van Dam, Tim
    Izadi, Maliheh
    van Deursen, Arie
    2023 IEEE/ACM 20TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2023, : 170 - 182
  • [10] An Empirical Study Assessing Source Code Readability in Comprehension
    Johnson, John C.
    Lubo, Sergio
    Yedla, Nishitha
    Aponte, Jairo
    Sharif, Bonita
    2019 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2019), 2019, : 513 - 523