What Do Concurrency Developers Ask About? A Large-scale Study Using Stack Overflow

Cited by: 92
Authors
Ahmed, Syed [1]
Bagherzadeh, Mehdi [1]
Affiliations
[1] Oakland Univ, Rochester, MI 48063 USA
Source
PROCEEDINGS OF THE 12TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM 2018) | 2018
Keywords
Concurrency topics; concurrency topic hierarchy; concurrency topic difficulty; concurrency topic popularity; Stack Overflow;
DOI
10.1145/3239235.3239524
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Background: Software developers are increasingly required to write concurrent code. However, most developers find concurrent programming difficult. To better help developers, it is imperative to understand their interest and difficulties in terms of concurrency topics they encounter often when writing concurrent code.
Aims: In this work, we conduct a large-scale study on the textual content of the entirety of Stack Overflow to understand the interests and difficulties of concurrency developers.
Method: First, we develop a set of concurrency tags to extract concurrency questions that developers ask. Second, we use latent Dirichlet allocation (LDA) topic modeling and an open card sort to manually determine the topics of these questions. Third, we construct a topic hierarchy by repeated grouping of similar topics into categories and lower level categories into higher level categories. Fourth, we investigate the coincidence of our concurrency topics with findings of previous work. Fifth, we measure the popularity and difficulty of our concurrency topics and analyze their correlation. Finally, we discuss the implications of our findings.
Results: A few findings of our study are the following. (1) Developers ask questions about a broad spectrum of concurrency topics ranging from multithreading to parallel computing, mobile concurrency to web concurrency and memory consistency to run-time speedup. (2) These questions can be grouped into a hierarchy with eight major categories: concurrency models, programming paradigms, correctness, debugging, basic concepts, persistence, performance and GUI. (3) Developers ask more about correctness of their concurrent programs than performance. (4) Concurrency questions about thread safety and database management systems are among the most popular and the most difficult, respectively. (5) Difficulty and popularity of concurrency topics are negatively correlated.
Conclusions: The results of our study can not only help concurrency developers but also concurrency educators and researchers to better decide where to focus their efforts, by trading off one concurrency topic against another.
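The fifth step of the method (relating topic popularity to topic difficulty) can be illustrated with a rank correlation. The abstract does not say which statistic or which popularity and difficulty metrics the authors used, so the sketch below is only an assumption: it computes a Spearman rank correlation over made-up per-topic scores. The function names and all numbers are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy scores for four unnamed topics (NOT data from the paper).
toy_popularity = [900.0, 700.0, 400.0, 150.0]  # e.g. some view-based score
toy_difficulty = [0.20, 0.35, 0.50, 0.65]      # e.g. some unanswered-rate score

rho = spearman(toy_popularity, toy_difficulty)
print(f"Spearman rho on toy data: {rho:.2f}")
```

A value of rho close to -1 on real per-topic scores would match the direction of result (5); the toy inputs here are constructed to be perfectly inversely ordered and say nothing about the study's actual effect size.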
Pages: 10