What are developers talking about? An analysis of topics and trends in Stack Overflow

Cited by: 350
Authors
Barua, Anton [1]
Thomas, Stephen W. [1]
Hassan, Ahmed E. [1]
Affiliations
[1] Queen's University, School of Computing, Kingston, ON K7L 3N6, Canada
Keywords
Q&A websites; Knowledge repository; Topic models; Trend analysis; Mining software repositories; Latent Dirichlet allocation; Software
DOI
10.1007/s10664-012-9231-y
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Programming question and answer (Q&A) websites, such as Stack Overflow, leverage the knowledge and expertise of users to provide answers to technical questions. Over time, these websites turn into repositories of software engineering knowledge. Such knowledge repositories can be invaluable for gaining insight into the use of specific technologies and the trends of developer discussions. Previous work has focused on analyzing the user activities or the social interactions in Q&A websites. However, analyzing the actual textual content of these websites can help the software engineering community to better understand the thoughts and needs of developers. In this article, we present a methodology to analyze the textual content of Stack Overflow discussions. We use latent Dirichlet allocation (LDA), a statistical topic modeling technique, to automatically discover the main topics present in developer discussions. We analyze these discovered topics, as well as their relationships and trends over time, to gain insights into the development community. Our analysis allows us to make a number of interesting observations, including: the topics of interest to developers range widely from jobs to version control systems to C# syntax; questions in some topics lead to discussions in other topics; and the topics gaining the most popularity over time are web development (especially jQuery), mobile applications (especially Android), Git, and MySQL.
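To make the LDA step in the abstract concrete, the following is a minimal sketch of topic discovery by collapsed Gibbs sampling, written with the Python standard library only. It is not the authors' pipeline: the toy `posts`, the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the number of topics are all assumptions chosen for illustration, and a real study would add preprocessing of the Stack Overflow text and a far larger corpus.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (theta, top_words): theta[d][k] is the estimated share of
    topic k in document d; top_words[k] lists up to three frequent
    words assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic shares; each row sums to 1.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [w for w, c in topic_word[k].most_common(3) if c > 0]
        for k in range(num_topics)
    ]
    return theta, top_words


# Hypothetical, already-tokenized posts (not data from the study).
posts = [
    ["git", "commit", "branch", "merge"],
    ["git", "branch", "rebase", "commit"],
    ["android", "activity", "layout", "intent"],
    ["android", "layout", "activity", "view"],
]

theta, top_words = lda_gibbs(posts, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

Averaging the rows of `theta` over the posts created in each time window would give one simple way to follow a topic's share over time, which is the kind of trend question the abstract raises. That aggregation is again only an illustration, not necessarily the metric defined in the article.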
Pages: 619-654
Page count: 36