Text mining tool for translating terms of contract into technical specifications: Development and application in the railway sector

Cited by: 27
Authors
Fantoni, G. [1]
Coli, E. [2]
Chiarello, F. [1]
Apreda, R. [3]
Dell'Orletta, F. [4]
Pratelli, G. [5]
Affiliations
[1] Univ Pisa, Dept Civil & Ind Engn, Pisa, Italy
[2] Univ Pisa, Dept Informat Engn, Pisa, Italy
[3] Erre Quadro Srl, Florence, Italy
[4] CNR, Inst Linguist Computaz Antonio Zampolli ILC, Rome, Italy
[5] Hitachi Rail SpA, Pistoia, Italy
Keywords
Contract terms; Technical requirements; Tendering; Computational science; Text mining; Natural language processing; DESIGN; MODEL;
DOI
10.1016/j.compind.2020.103357
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Tenders or technical terms contain a large quantity of technical, legal and managerial information mixed in a nested and complex net of relationships. Extracting technical and design information from a document whose aim is both legal and technical, and that is written in several specific jargons, is not a trivial task. The purpose of this research is to detect, extract, split and assign information from the text of a tender automatically. This means being able to understand technical and legal terms and organize them in multiple ways: according to product structure, internal organisational structure, etc. The focus is on providing a handy tool that speeds up and facilitates human analysis and also supports the process of transforming customer requirements into design specifications. The approach chosen to overcome the various issues is to support state-of-the-art Computational Linguistic tools with a wide Knowledge Base. The latter has been constructed both manually and automatically and comprises not only keywords but also concepts, relationships and regular expressions. The methodology was implemented during a project for AnsaldoBreda S.p.A. (now Hitachi Rail Europe). A case study on the tender for a high-speed train is included to show the functioning and output of the entire software system. © 2020 Elsevier B.V. All rights reserved.
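As a rough illustration of the idea of a knowledge base that combines keywords and regular expressions to split a tender text and assign its sentences to categories, the following is a minimal sketch and not the authors' system. The category names, the patterns, the naive sentence splitter and the function names (`split_sentences`, `assign_sentences`) are all hypothetical and chosen only for the example; the tool described in the abstract also relies on computational linguistic analysis, concepts and relationships, which are not modelled here.

```python
import re
from collections import defaultdict

# Hypothetical toy knowledge base: category -> regular expressions.
# The real knowledge base described in the abstract is far richer.
KNOWLEDGE_BASE = {
    "braking": [r"\bbrak(?:e|es|ing)\b", r"\bdeceleration\b"],
    "traction": [r"\btraction\b", r"\d+\s*km/h"],
    "legal": [r"\bpenalt(?:y|ies)\b", r"\bliabilit(?:y|ies)\b"],
}

# Compile once, case-insensitively.
COMPILED = {
    category: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for category, patterns in KNOWLEDGE_BASE.items()
}


def split_sentences(text):
    """Naive splitter: break after '.' or ';' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.;])\s+", text) if s.strip()]


def assign_sentences(text):
    """Map each category to the sentences matching any of its patterns."""
    assigned = defaultdict(list)
    for sentence in split_sentences(text):
        for category, patterns in COMPILED.items():
            if any(p.search(sentence) for p in patterns):
                assigned[category].append(sentence)
    return dict(assigned)


if __name__ == "__main__":
    tender = (
        "The train shall reach 300 km/h. "
        "The brake system shall comply with the standard. "
        "Late delivery incurs a penalty."
    )
    for category, sentences in assign_sentences(tender).items():
        print(category, sentences)
```

A sentence can land in more than one category, which loosely mirrors the abstract's point that the same information may need to be organized in multiple ways (for example by product structure and by internal organisational structure).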
Pages: 17