Requirement or Not, That is the Question: A Case from the Railway Industry

Cited by: 10
Authors
Bashir, Sarmad [1, 2]
Abbas, Muhammad [1, 2]
Saadatmand, Mehrdad [1 ]
Enoiu, Eduard Paul [2 ]
Bohlin, Markus [2 ]
Lindberg, Pernilla [3 ]
Affiliations
[1] RISE Res Inst Sweden, Vasteras, Sweden
[2] Malardalen Univ, Vasteras, Sweden
[3] Alstom, Vasteras, Sweden
Source
REQUIREMENTS ENGINEERING: FOUNDATION FOR SOFTWARE QUALITY, REFSQ 2023 | 2023, Vol. 13975
Keywords
Requirements identification; Requirements classification; Tender documents; NLP
DOI
10.1007/978-3-031-29786-1_8
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
[Context and Motivation] Requirements in tender documents are often mixed with other supporting information. Identifying requirements in large tender documents could aid the bidding process and help estimate the risk associated with the project. [Question/problem] Manual identification of requirements in large documents is a resource-intensive activity that is prone to human error and limits scalability. This study compares various state-of-the-art approaches for requirements identification in an industrial context. For generalizability, we also present an evaluation on a real-world public dataset. [Principal ideas/results] We formulate the requirement identification problem as a binary text classification problem. Various state-of-the-art classifiers based on traditional machine learning, deep learning, and few-shot learning are evaluated for requirements identification based on accuracy, precision, recall, and F1 score. Results from the evaluation show that the transformer-based BERT classifier performs the best, with an average F1 score of 0.82 and 0.87 on the industrial and public datasets, respectively. Our results also confirm that few-shot classifiers can achieve comparable results, with an average F1 score of 0.76, using significantly fewer samples, i.e., only 20% of the data. [Contribution] There is little empirical evidence on the use of large language models and few-shot classifiers for requirements identification. This paper fills this gap by presenting an industrial empirical evaluation of the state-of-the-art approaches for requirements identification in large tender documents. We also provide a running tool and a replication package for further experimentation to support future research in this area.
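The abstract frames requirement identification as binary text classification scored by accuracy, precision, recall, and F1. The sketch below shows how those four metrics follow from the confusion counts of a binary classifier. It is an illustration only, not the authors' tool or replication package: the function name `evaluate_binary`, the label convention (1 = requirement, 0 = supporting information), and the toy labels are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Score binary predictions; 1 = requirement, 0 = supporting information.

    Hypothetical helper for illustration; the label convention is assumed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # requirements found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # requirements missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for eight text segments (made up for this example).
toy_true = [1, 1, 1, 0, 0, 1, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 1, 0, 0]
toy_metrics = evaluate_binary(toy_true, toy_pred)
print(toy_metrics)
```

In a tender-document setting, recall indicates how many true requirements were caught and precision how many flagged segments really were requirements; F1 is the harmonic mean of the two, which is why the abstract reports it as the headline figure.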
Pages: 105-121
Number of pages: 17