PET: An Annotated Dataset for Process Extraction from Natural Language Text Tasks

Cited by: 15
Authors
Bellan, Patrizio [1, 2]
van der Aa, Han [3]
Dragoni, Mauro [1]
Ghidini, Chiara [1]
Ponzetto, Simone Paolo [3]
Affiliations
[1] Fondazione Bruno Kessler, Trento, Italy
[2] Free University of Bozen-Bolzano, Bolzano, Italy
[3] University of Mannheim, Mannheim, Germany
Source
BUSINESS PROCESS MANAGEMENT WORKSHOPS, BPM 2022 INTERNATIONAL WORKSHOPS | 2023 / Vol. 460
Keywords
Process extraction from text; Business process management; Information extraction; Natural language processing; Dataset; Gold standard
DOI
10.1007/978-3-031-25383-6_23
Chinese Library Classification
F [Economics]
Subject classification code
02
Abstract
Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, unlike other information extraction tasks, it lacks gold-standard corpora of business process descriptions carefully annotated with all the entities and relationships of interest. This paper presents the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information. Alongside the resource, we provide a variety of baselines to benchmark the difficulty and challenges of business process extraction from text. The PET dataset, annotation guidelines, and INCEpTION schema are freely available via huggingface.co/datasets/patriziobellan/PET.
Pages: 315-321
Page count: 7