Multi-Label Requirements Classification with Large Taxonomies

Cited by: 0
Authors
Abdeen, Waleed [1]
Unterkalmsteiner, Michael [1]
Wnuk, Krzysztof [1]
Chirtoglou, Alexandros [2]
Schimanski, Christoph [2]
Goli, Heja [2]
Affiliations
[1] Blekinge Inst Technol, Software Engn, Karlskrona, Sweden
[2] HOCHTIEF ViCon GmbH, Essen, Germany
Source
32ND IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE, RE 2024 | 2024
Keywords
requirements classification; domain-specific taxonomy; large-scale; multi-label; TRACEABILITY;
DOI
10.1109/RE59067.2024.00033
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Context and motivation: Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification. Question/problem: Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies. Principal ideas/results: We associated, together with domain experts from the industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall compared to the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total number of nodes and the number of leaf nodes in the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier. Contribution: We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning.
Pages: 264-274
Number of pages: 11
Related papers
39 in total
[1]   Abdeen Waleed, 2024, Figshare, DOI 10.6084/M9.FIGSHARE.22736690
[2]   Zero-shot learning for requirements classification: An exploratory study [J].
Alhoshan, Waad ;
Ferrari, Alessio ;
Zhao, Liping .
INFORMATION AND SOFTWARE TECHNOLOGY, 2023, 159
[3]   [Anonymous], 2008, Guide to Advanced Empirical Software Engineering, DOI 10.1007/978-1-84800-044-5_8
[4]   Automatic Multi-class Non-Functional Software Requirements Classification Using Neural Networks [J].
Baker, Cody ;
Deng, Lin ;
Chakraborty, Suranjan ;
Dehlinger, Josh .
2019 IEEE 43RD ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 2, 2019, :610-615
[5]   Requirement or Not, That is the Question: A Case from the Railway Industry [J].
Bashir, Sarmad ;
Abbas, Muhammad ;
Saadatmand, Mehrdad ;
Enoiu, Eduard Paul ;
Bohlin, Markus ;
Lindberg, Pernilla .
REQUIREMENTS ENGINEERING: FOUNDATION FOR SOFTWARE QUALITY, REFSQ 2023, 2023, 13975 :105-121
[6]   Binkhonain M., 2019, Expert Syst. Appl., V1, P100001
[7]   Identification of non-functional requirements in textual specifications: A semi-supervised learning approach [J].
Casamayor, Agustin ;
Godoy, Daniela ;
Campo, Marcelo .
INFORMATION AND SOFTWARE TECHNOLOGY, 2010, 52 (04) :436-445
[8]   Gabrilovich E, 2007, 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P1606
[9]   Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all [J].
Garshol, LM .
JOURNAL OF INFORMATION SCIENCE, 2004, 30 (04) :378-391
[10]   Topic recommendation for software repositories using multi-label classification algorithms [J].
Izadi, Maliheh ;
Heydarnoori, Abbas ;
Gousios, Georgios .
EMPIRICAL SOFTWARE ENGINEERING, 2021, 26 (05)