Mining Analogical Libraries in Q&A Discussions - Incorporating Relational and Categorical Knowledge into Word Embedding

Cited by: 52
Authors
Chen, Chunyang [1]
Gao, Sa [1]
Xing, Zhenchang [1]
Affiliation
[1] Nanyang Technological University, School of Computer Engineering, Singapore
Source
2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), Vol. 1 | 2016
Keywords
Analogical libraries; Word embedding; Knowledge graph; Relational knowledge; Categorical knowledge; Similarity
DOI
10.1109/SANER.2016.21
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Third-party libraries are an integral part of many software projects. Developers often need to find analogical libraries that provide features comparable to those of the libraries they are already familiar with. Existing methods to find analogical libraries are limited by community-curated lists of libraries, blogs, or Q&A posts, which often contain overwhelming or out-of-date information. In this paper, we present a new approach to recommend analogical libraries based on a knowledge base of analogical libraries mined from the tags of millions of Stack Overflow questions. The novelty of our approach is to solve analogical-library questions by combining a state-of-the-art word embedding technique with domain-specific relational and categorical knowledge mined from Stack Overflow. We implement our approach in a proof-of-concept web application (https://graphofknowledge.appspot.com/similartech). The evaluation results show that our approach can make accurate recommendations of analogical libraries (Precision@1=0.81 and Precision@5=0.67). Google Analytics data on the website traffic provides initial evidence of the potential usefulness of our web application for software developers.
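The idea summarized in the abstract, combining tag embeddings with relational knowledge (which language a library belongs to) and categorical knowledge (whether a tag is a library at all), can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 4-dimensional vectors, the tag names, the category labels, and the function `analogical_libraries` are hypothetical and are not taken from the paper, whose actual model and data are not described in this record.

```python
import math

# Hypothetical toy tag vectors (NOT from the paper). Dimensions loosely mean:
# [python-ness, java-ness, NLP functionality, testing functionality].
EMBEDDINGS = {
    "python":  [1.0, 0.0, 0.0, 0.0],
    "java":    [0.0, 1.0, 0.0, 0.0],
    "nltk":    [1.0, 0.0, 1.0, 0.0],
    "opennlp": [0.0, 1.0, 1.0, 0.0],
    "junit":   [0.0, 1.0, 0.0, 1.0],
    "pytest":  [1.0, 0.0, 0.0, 1.0],
    "eclipse": [0.0, 1.0, 0.2, 0.2],
}

# Hypothetical categorical knowledge: what kind of thing each tag names.
CATEGORY = {
    "python": "language", "java": "language",
    "nltk": "library", "opennlp": "library",
    "junit": "library", "pytest": "library",
    "eclipse": "ide",
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogical_libraries(library, source_lang, target_lang, top_k=2):
    """Rank tags by similarity to vec(library) - vec(source_lang) + vec(target_lang),
    keeping only tags whose category is 'library'."""
    query = [
        lib - src + tgt
        for lib, src, tgt in zip(
            EMBEDDINGS[library], EMBEDDINGS[source_lang], EMBEDDINGS[target_lang]
        )
    ]
    exclude = {library, source_lang, target_lang}
    scored = [
        (cosine(query, vec), tag)
        for tag, vec in EMBEDDINGS.items()
        if tag not in exclude and CATEGORY[tag] == "library"
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [tag for _, tag in scored[:top_k]]


print(analogical_libraries("nltk", "python", "java"))
```

In this toy data the tag `eclipse` lies closer to the query vector than `junit` does, so the category filter is what keeps a non-library tag out of the recommendations. That is only meant to suggest why categorical knowledge could help on top of plain vector arithmetic.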
Pages: 338-348
Page count: 11