What's Spain's Paris? Mining analogical libraries from Q&A discussions

Cited by: 20
Authors
Chen, Chunyang [1]
Xing, Zhenchang [2]
Liu, Yang [3]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
[2] Australian Natl Univ, Coll Engn & Comp Sci, Canberra, ACT, Australia
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Keywords
Analogical libraries; Word embedding; Knowledge graph; Relational knowledge; Categorical knowledge
DOI
10.1007/s10664-018-9657-y
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Third-party libraries are an integral part of many software projects. Developers often need to find analogical libraries that provide features comparable to those of libraries they already know, but for a different programming language or a different mobile platform. Existing methods to find analogical libraries are limited by community-curated lists of libraries, blogs, or Q&A posts, which often contain overwhelming or out-of-date information. In this paper, we present a new approach to recommend analogical libraries based on a knowledge base of analogical libraries mined from the tags of millions of Stack Overflow questions. The novelty of our approach is to solve analogical-library questions by combining a state-of-the-art word embedding technique with domain-specific relational and categorical knowledge mined from Stack Overflow. Given a library and a recommended analogical library, our approach further extracts question and answer snippets from Stack Overflow that compare the two libraries, which can offer useful information scents for developers to further their investigation of the recommended analogical libraries. We implement our approach in a proof-of-concept web application; more than 34.8 thousand users visited our website from November 2015 to August 2017. Our evaluation shows that our approach can make accurate recommendations of analogical libraries. We also demonstrate the usefulness of our analogical-library recommendations by using them to answer analogical-library questions on Stack Overflow. Google Analytics data on our website traffic and an analysis of visitors' interactions with the website contents provide insights into the usage patterns and the system design of our web application.
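The idea of combining word embeddings with categorical knowledge can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional tag vectors, the category labels, and the function names `cosine` and `recommend` are all hypothetical and chosen only so the example is easy to follow. The sketch answers "what is Python's JUnit?" with the common vector-offset analogy (library - source language + target language) and then keeps only candidates whose category is "library".

```python
import math

# Toy tag vectors with made-up values (assumption: real vectors would be
# learned from question tags, not written by hand).
TAG_VECTORS = {
    "java": [1.0, 0.0, 0.0],
    "python": [0.0, 1.0, 0.0],
    "junit": [1.0, 0.0, 1.0],
    "unittest": [0.0, 1.0, 1.0],
    "numpy": [0.0, 1.0, -1.0],
}

# Hypothetical categorical knowledge: what kind of thing each tag names.
TAG_CATEGORY = {
    "java": "language",
    "python": "language",
    "junit": "library",
    "unittest": "library",
    "numpy": "library",
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(library, source_lang, target_lang, top_k=1):
    """Rank library tags by similarity to library - source_lang + target_lang."""
    query = [
        lib - src + tgt
        for lib, src, tgt in zip(
            TAG_VECTORS[library],
            TAG_VECTORS[source_lang],
            TAG_VECTORS[target_lang],
        )
    ]
    excluded = {library, source_lang, target_lang}
    scored = [
        (cosine(query, vector), tag)
        for tag, vector in TAG_VECTORS.items()
        # Categorical filter: only tags labelled as libraries are candidates.
        if tag not in excluded and TAG_CATEGORY[tag] == "library"
    ]
    scored.sort(reverse=True)
    return [tag for _, tag in scored[:top_k]]


print(recommend("junit", "java", "python"))
```

With these toy vectors the query lands exactly on the `unittest` vector, so it ranks first; the category filter is what keeps language tags such as `python` itself out of the candidate list.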
Pages: 1155-1194
Page count: 40