Requirement or Not, That is the Question: A Case from the Railway Industry

Cited by: 10
Authors
Bashir, Sarmad [1, 2]
Abbas, Muhammad [1, 2]
Saadatmand, Mehrdad [1 ]
Enoiu, Eduard Paul [2 ]
Bohlin, Markus [2 ]
Lindberg, Pernilla [3 ]
Affiliations
[1] RISE Res Inst Sweden, Vasteras, Sweden
[2] Malardalen Univ, Vasteras, Sweden
[3] Alstom, Vasteras, Sweden
Source
REQUIREMENTS ENGINEERING: FOUNDATION FOR SOFTWARE QUALITY, REFSQ 2023 | 2023 / Vol. 13975
Keywords
Requirements identification; Requirements classification; Tender documents; NLP
DOI
10.1007/978-3-031-29786-1_8
Chinese Library Classification number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
[Context and Motivation] Requirements in tender documents are often mixed with other supporting information. Identifying requirements in large tender documents could aid the bidding process and help estimate the risk associated with the project. [Question/problem] Manual identification of requirements in large documents is a resource-intensive activity that is prone to human error and limits scalability. This study compares various state-of-the-art approaches for requirements identification in an industrial context. For generalizability, we also present an evaluation on a real-world public dataset. [Principal ideas/results] We formulate the requirement identification problem as a binary text classification problem. Various state-of-the-art classifiers based on traditional machine learning, deep learning, and few-shot learning are evaluated for requirements identification based on accuracy, precision, recall, and F1 score. Results from the evaluation show that the transformer-based BERT classifier performs best, with an average F1 score of 0.82 and 0.87 on the industrial and public datasets, respectively. Our results also confirm that few-shot classifiers can achieve comparable results, with an average F1 score of 0.76, using significantly fewer samples, i.e., only 20% of the data. [Contribution] There is little empirical evidence on the use of large language models and few-shot classifiers for requirements identification. This paper fills this gap by presenting an industrial empirical evaluation of state-of-the-art approaches for requirements identification in large tender documents. We also provide a running tool and a replication package for further experimentation to support future research in this area.
Pages: 105-121
Number of pages: 17