New Methodology for Contextual Features Usage in Duplicate Bug Reports Detection

Cited by: 0
Authors
Neysiani, Behzad Soleimani [1]
Babamir, Seyed Morteza [1]
Affiliations
[1] Univ Kashan, Fac Comp & Elect Engn, Dept Software Engn, Kashan, Esfahan, Iran
Source
2019 5TH INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR) | 2019
Keywords
Information Retrieval; Natural Language Processing; Duplicate Detection; Bug Reports; Topic; Feature Expansion;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Duplicate bug report detection is one of the major problems in software triage systems such as Bugzilla, which handle end-user requests. A user request contains categorical fields and, above all, textual fields that require feature extraction before duplicates can be detected. Contextual and topical features are obtained by calculating the cosine similarity between the term frequency, inverse document frequency, or BM25F scores of a pair of bug reports against a set of topics. This research proposes computing an individual Manhattan distance similarity for every topic in the contextual features, instead of the cosine similarity, to expand the feature dimension, which can increase the accuracy of duplicate bug report detection. Four well-known bug report datasets (Android, Eclipse, Mozilla, and Open Office) were used to evaluate the proposed method, and the experimental results indicate a performance improvement for four contextual features: the general, cryptography, network, and Java topics.
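The contrast described in the abstract can be illustrated with a minimal sketch. It assumes each bug report has already been scored against every topic (for example with a BM25F-style weighting), so that a report is represented by one number per topic. The function names, the three-topic vectors, and the use of a plain absolute difference as the per-topic Manhattan term are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Collapse two topic-score vectors into a single similarity feature."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def per_topic_manhattan(u, v):
    """Return one absolute difference per topic (the individual terms of
    the Manhattan distance), so each topic contributes its own feature."""
    return [abs(a - b) for a, b in zip(u, v)]


# Hypothetical topic scores for a pair of reports (made-up numbers).
report_a = [3.0, 0.0, 4.0]
report_b = [3.0, 4.0, 0.0]

single_feature = cosine_similarity(report_a, report_b)
expanded_features = per_topic_manhattan(report_a, report_b)

print(single_feature)     # one feature for the whole pair
print(expanded_features)  # one feature per topic
```

Under these assumptions the cosine approach yields one feature per report pair, while the per-topic variant yields as many features as there are topics, which is one reading of the feature expansion the abstract refers to.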
Pages: 178 - 183
Page count: 6