Expert as a Service: Software Expert Recommendation via Knowledge Domain Embeddings in Stack Overflow

Cited by: 18
Authors
Huang, Chaoran [1]
Yao, Lina [1]
Wang, Xianzhi [2]
Benatallah, Boualem [1]
Sheng, Quan Z. [3]
Affiliations
[1] UNSW Sydney, Sydney, NSW, Australia
[2] Singapore Management Univ, Singapore, Singapore
[3] Macquarie Univ, N Ryde, NSW, Australia
Source
2017 IEEE 24TH INTERNATIONAL CONFERENCE ON WEB SERVICES (ICWS 2017) | 2017
Keywords
Knowledge discovery; Stack Overflow; Expertise finding; Question answering; Expert as a Service;
DOI
10.1109/ICWS.2017.122
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Question answering (Q&A) communities have recently gained momentum as an effective means of crowd-based knowledge sharing: many of their users are real-world experts who can make quality contributions in particular domains or technologies. Although the massive volume of user-generated Q&A data is a valuable source of human knowledge, finding those expert users effectively remains a challenge. In this paper, we propose a framework for finding such experts in a collaborative network. Building on recent work on distributed word representations, we summarize text chunks from a semantic perspective and infer knowledge domains by clustering pre-trained word vectors. In particular, we use a graph-based clustering method for knowledge domain extraction and discern the shared latent factors using matrix factorization techniques. The proposed clustering method requires no post-processing of clustering indicators, and the matrix factorization method is combined with the semantic similarity of historical answers to rank users by expertise for a given query. We evaluate the proposed approach on Stack Overflow, a website with a large user base and a large number of posts on topics related to computer programming, and conduct extensive experiments to show its effectiveness.
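The ranking idea in the abstract (a matrix factorization score combined with the semantic similarity of historical answers) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy word vectors, the fixed domain centroids standing in for the graph-based clustering step, the plain gradient-descent factorization, and the mixing weight `alpha` are all assumptions made for illustration.

```python
import numpy as np

def embed(text, vectors, dim):
    """Average the word vectors of the known tokens in `text`."""
    tokens = [t for t in text.lower().split() if t in vectors]
    if not tokens:
        return np.zeros(dim)
    return np.mean([vectors[t] for t in tokens], axis=0)

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def factorize(R, mask, rank=2, steps=2000, lr=0.01, reg=0.01, seed=0):
    """Gradient-descent matrix factorization fitted on observed entries only.
    Assumption: a generic stand-in, not the factorization used in the paper."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], rank))
    V = rng.normal(scale=0.1, size=(R.shape[1], rank))
    for _ in range(steps):
        E = mask * (R - U @ V.T)  # reconstruction error on observed cells
        U, V = U + lr * (E @ V - reg * U), V + lr * (E.T @ U - reg * V)
    return U, V

def rank_experts(query, centroids, answers, U, V, vectors, dim, alpha=0.5):
    """Rank users by alpha * latent-factor score + (1 - alpha) * answer similarity.
    The linear mix and `alpha` are hypothetical choices."""
    q = embed(query, vectors, dim)
    domain = int(np.argmax([cosine(q, c) for c in centroids]))
    scores = {}
    for user, texts in answers.items():
        sim = max(cosine(q, embed(t, vectors, dim)) for t in texts)
        scores[user] = alpha * float(U[user] @ V[domain]) + (1 - alpha) * sim
    return sorted(scores, key=scores.get, reverse=True)

# Toy 2-D "pre-trained" vectors (hypothetical values).
vectors = {"python": np.array([1.0, 0.0]), "pandas": np.array([0.9, 0.1]),
           "java": np.array([0.0, 1.0]), "spring": np.array([0.1, 0.9])}
# Fixed centroids stand in for knowledge domains found by clustering.
centroids = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
# User-by-domain scores; user 2 has no observed score in domain 1.
R = np.array([[5.0, 1.0], [1.0, 5.0], [4.0, 0.0]])
mask = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
answers = {0: ["python pandas"], 1: ["java spring"], 2: ["python"]}

U, V = factorize(R, mask)
ranking = rank_experts("pandas python", centroids, answers, U, V, vectors, dim=2)
print(ranking)
```

On this toy data the query falls in the first domain, so the user whose history is about Java ends up last. How the two signals are actually weighted in the paper is not stated in the abstract.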
Pages: 317-324
Number of pages: 8