Predicting closed questions on community question answering sites using convolutional neural network

Cited by: 20
Authors
Roy, Pradeep Kumar [1, 2]
Singh, Jyoti Prakash [1]
Affiliations
[1] Natl Inst Technol Patna, Dept Comp Sci & Engn, Patna, Bihar, India
[2] Vellore Inst Technol, Dept Informat Technol, Vellore, Tamil Nadu, India
Keywords
Community question answering; Closed questions; CNN; LSTM
DOI
10.1007/s00521-019-04592-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Community question answering sites receive a huge number of questions and answers every day. It has been observed that a number of these questions are marked as closed by the site moderators. Such questions increase the moderators' overhead and also cause user dissatisfaction. This paper aims to predict whether a newly posted question will be marked as closed in the future and to give a tentative reason for the closure. Two kinds of models are used: (1) a baseline model based on traditional machine learning techniques and (2) deep learning models, namely a convolutional neural network (CNN) and a long short-term memory (LSTM) network. Each classifies a question into one of five classes: (1) open, (2) off-topic, (3) not a real question, (4) not constructive and (5) too localized. The baseline model requires handcrafted features and hence does not preserve semantics. In contrast, the CNN and LSTM networks can preserve the semantics of a question's words and extract hidden features from the textual content using multiple hidden layers. The LSTM network performs better than the CNN and the traditional machine learning models. The proposed model can be used as an initial filter to screen closed questions at the time of posting, which reduces the overhead of site moderators. To the best of our knowledge, this is the first work that predicts closed questions along with the reason for closure. This helps the questioner modify the question before posting. Experimental results on a Stack Overflow dataset demonstrate the effectiveness of the proposed model.
Pages: 10555-10572
Number of pages: 18