Calculating Similarity of Javadoc Comments

Cited by: 0
Authors
Koznov, D. V. [1 ]
Ledeneva, E. Yu. [2 ]
Luciv, D. V. [1 ]
Braslavski, P. I. [3 ]
Affiliations
[1] St Petersburg State Univ, St Petersburg 199034, Russia
[2] Yandex LLC, Moscow 119021, Russia
[3] HSE Univ, Moscow 109028, Russia
Keywords
software documentation; Javadoc comments; similarity measure
DOI
10.1134/S0361768824010043
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Code comments are an essential part of software documentation. Many software projects suffer from low-quality comments that are often produced by copy-paste. For similar methods, classes, and so on, copy-pasted comments with minor modifications are justified. In many cases, however, this approach degrades documentation quality and subsequently makes the project harder to maintain and develop. In this study, we address the problem of detecting near-duplicate code comments, which can potentially improve software documentation. We have conducted a thorough evaluation of traditional string similarity metrics and modern machine learning methods. In our experiment, we use a collection of Javadoc comments from four industrial open-source Java projects. We found that LCS (Longest Common Subsequence) is the best similarity algorithm when both quality (Precision 94%, Recall 74%) and performance are taken into account.
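To make the LCS measure named in the abstract concrete, the following is a minimal sketch of an LCS-based similarity score for two comment strings. The abstract does not specify tokenization, normalization, or a decision threshold, so the lowercase whitespace tokenization, the division by the longer token count, and the function names `lcs_length` and `lcs_similarity` are all assumptions made for illustration, not the authors' implementation.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    # Standard dynamic programming, keeping only the previous row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(comment_a, comment_b):
    """Similarity in [0, 1] between two comment texts.

    Assumption: tokens are lowercase whitespace-separated words, and the
    LCS length is normalized by the longer token sequence. The paper may
    use a different tokenization or normalization.
    """
    tokens_a = comment_a.lower().split()
    tokens_b = comment_b.lower().split()
    if not tokens_a and not tokens_b:
        return 1.0  # two empty comments are treated as identical
    return lcs_length(tokens_a, tokens_b) / max(len(tokens_a), len(tokens_b))


if __name__ == "__main__":
    score = lcs_similarity(
        "Returns the name of the user.",
        "Returns the id of the user.",
    )
    print(round(score, 3))
```

Under these assumptions, a pair of comments would be flagged as near-duplicates when the score exceeds a chosen threshold; the threshold itself would need to be tuned on labeled pairs, which is how figures such as precision and recall are obtained.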
Pages: 85-89
Number of pages: 5