GitHub Issue Classification Using BERT-Style Models

Cited by: 14

Authors:

Bharadwaj, Shikhar [1]

Kadam, Tushar [1]

Affiliations:

[1] Indian Inst Sci, Bengaluru, Karnataka, India

Source:

2022 IEEE/ACM 1ST INTERNATIONAL WORKSHOP ON NATURAL LANGUAGE-BASED SOFTWARE ENGINEERING (NLBSE 2022) | 2022

Keywords:

NLP; BERT; text classification;

DOI:

10.1145/3528588.3528663

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent innovations in natural language processing techniques have led to the development of various tools for assisting software developers. This paper reports our proposed solution to the issue report classification task from the NL-Based Software Engineering workshop. We approach the task of classifying issues on GitHub repositories using BERT-style models [1, 2, 6, 8]. We propose a neural architecture for the problem that uses contextual embeddings of the text content in GitHub issues. We also design additional features for the classification task. We perform a thorough ablation analysis of the designed features and benchmark various BERT-style models for generating textual embeddings. Our proposed solution performs better than the competition organizer's method and achieves an F1 score of 0.8653. Our code and trained models are available at https://github.com/Kadam-Tushar/Issue-Classifier.
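
The abstract reports a single F1 score but does not say how it is averaged across issue classes. As a rough illustration only, the sketch below computes per-class and micro-averaged F1 for single-label predictions. The micro-averaging choice, the function name `f1_scores`, and the label names are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per_class_f1, micro_f1) for single-label predictions.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gets a false positive
            fn[truth] += 1          # true class gets a false negative

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class[label] = 2 * tp[label] / denom if denom else 0.0

    # Micro-averaging pools the counts over all classes before computing F1.
    total_tp = sum(tp.values())
    micro_denom = 2 * total_tp + sum(fp.values()) + sum(fn.values())
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return per_class, micro


# Hypothetical labels, chosen only to exercise the function.
y_true = ["bug", "bug", "enhancement", "question", "enhancement"]
y_pred = ["bug", "enhancement", "enhancement", "question", "bug"]
per_class, micro = f1_scores(y_true, y_pred)
print(per_class, micro)
```

When every issue receives exactly one label, micro-averaged F1 equals plain accuracy, because each misclassification counts once as a false positive and once as a false negative.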

Citation

Pages: 40-43

Page count: 4

References (8 in total):

[1] Devlin J, 2019, arXiv, DOI [arXiv:1810.04805, 10.48550/arXiv.1810.04805]

[2] Feng ZY, 2020, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, P1536

[3] Kallis, Rafael; Di Sorbo, Andrea; Canfora, Gerardo; Panichella, Sebastiano. Predicting issue types on GitHub [J]. SCIENCE OF COMPUTER PROGRAMMING, 2021, 205

[4] Kallis, Rafael; Di Sorbo, Andrea; Canfora, Gerardo; Panichella, Sebastiano. Ticket Tagger: Machine Learning Driven Issue Classification [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2019), 2019: 406-409

[5] Kallis Rafael, 2022, P 1 INT WORKSHOP NAT

[6] Liu YH, 2019, arXiv, DOI arXiv:1907.11692

[7] van der Maaten L, 2008, J MACH LEARN RES, V9, P2579

[8] Yang ZL, 2019, ADV NEUR IN, V32
