Supporting concept location through identifier parsing and ontology extraction

Cited by: 9
Authors
Abebe, Surafel Lemma [1]
Alicante, Anita [2]
Corazza, Anna [2]
Tonella, Paolo [1]
Affiliations
[1] Fondazione Bruno Kessler, Trento, Italy
[2] University of Naples Federico II, Naples, Italy
Keywords
Program understanding; Concept location; Natural language parsing; Source code; Evolution
DOI
10.1016/j.jss.2013.07.009
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Identifier names play a key role in program understanding, and in concept location in particular. Programmers can easily "parse" identifiers and understand their intended meaning. This is not trivial, however, for tools that try to exploit the information in identifiers to support program understanding. To address this problem, we resort to natural language analyzers, which parse tokenized identifier names and provide the syntactic relationships (dependencies) among the terms composing the identifiers. These relationships are then mapped to semantic relationships. In this study, we evaluated the use of off-the-shelf and trained natural language analyzers to parse identifier names, extract an ontology, and use it to support concept location. In the evaluation, we assessed whether the concepts taken from the ontology can be used to improve the efficiency of queries used in concept location. We also investigated whether the use of different natural language analyzers has an impact on the extracted ontology and on the support it provides to concept location. Results show that using the concepts from the ontology significantly improves the efficiency of concept location queries (e.g., in some cases, an improvement of 127% is observed). The results also indicate that the efficiency of concept location queries is not affected by the differences in the ontologies produced by different analyzers. © 2013 Elsevier Inc. All rights reserved.
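The abstract refers to "tokenized identifier names" as the input to the natural language analyzers. The record does not say how the authors tokenize identifiers, so the following is only a minimal sketch of one common convention (splitting on underscores, camelCase boundaries, and digit runs). The function name split_identifier and the splitting rules are assumptions made for illustration, not the method used in the paper.

```python
import re
from typing import List


def split_identifier(name: str) -> List[str]:
    """Split an identifier into lowercase terms (hypothetical helper).

    Assumed rules: underscores and other non-word characters separate
    terms, as do camelCase boundaries and runs of digits.
    """
    terms: List[str] = []
    # First cut on underscores and any non-alphanumeric characters.
    for chunk in re.split(r"[_\W]+", name):
        # Then pick out acronym runs ("XML" in "XMLParser"),
        # capitalized or lowercase words, and digit runs.
        terms += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk)
    return [t.lower() for t in terms]


if __name__ == "__main__":
    for ident in ("getFileName", "XMLParser", "max_buffer_size2"):
        print(ident, "->", split_identifier(ident))
```

The resulting term lists are the kind of input a dependency parser could be given in order to recover relationships among the terms; that parsing step and the mapping to an ontology are not shown here.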
Pages: 2919-2938
Number of pages: 20