A platform-based Natural Language processing-driven strategy for digitalising regulatory compliance processes for the built environment

Cited by: 7
Authors
Kruiper, Ruben [1]
Kumar, Bimal [2]
Watson, Richard [3]
Sadeghineko, Farhad [4]
Gray, Alasdair [1]
Konstas, Ioannis [1]
Affiliations
[1] Heriot Watt Univ, Dept Comp Sci, Edinburgh, Scotland
[2] Univ Strathclyde, Dept Architecture, Glasgow City, Scotland
[3] Northumbria Univ, Dept Architecture & Built Environm, Newcastle Upon Tyne, England
[4] Glasgow Caledonian Univ, Sch Comp Engn & Built Environm, Glasgow, Scotland
Keywords
Digital Regulatory Compliance; Natural Language Processing; Semantic Web; Machine Learning; Knowledge Graph; Automated Compliance Checking; RULE CHECKING; SYSTEM; WEB
DOI
10.1016/j.aei.2024.102653
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The digitalisation of the regulatory compliance process has been an active area of research for several decades, and activity in this area has recently increased considerably. In the UK, the tragic Grenfell Tower fire of 2017 has been a major catalyst for this, as the recommendations of the subsequent Hackitt report placed much of the blame on the country's broken regulatory regime. The Hackitt report emphasises the need to overhaul the building regulations, but how to do so remains an open research question. Existing work in this space tends to overlook the processing of actual regulatory documents, or limits its scope to solving a relatively small subtask. This paper presents i-ReC (intelligent Regulatory Compliance), a new comprehensive platform approach to the digitalisation of regulatory compliance that takes into consideration the enormous diversity of all the stakeholders' activities. A historical perspective on research in this area is presented first, identifying the challenges in such an endeavour and the gaps in the state of the art. After enumerating the challenges in implementing a platform-based approach to digitalising the regulatory compliance process, the implementation of some parts of the platform is described. Our research demonstrates that identifying and extracting all relevant requirements from the corpus of several hundred regulatory documents is a key step that underlies the entire process, from authoring to the eventual compliance checking of designs. Issues that need addressing in this endeavour include ambiguous language, inconsistent use of terms, contradicting requirements and the handling of multi-word expressions. The implementation of these tools is driven by Natural Language Processing (NLP), Machine Learning (ML) and Semantic Web technologies.
A semantic search engine was developed and validated against other popular and comparable engines using a corpus of 420 (out of about 800) documents used in the UK for compliance checking of building designs. In every search scenario, our search engine performed better on all objective criteria. Limitations of the approach are discussed, including the challenges around licensing for all the documents in the corpus. Further work includes improving the performance of SPaR.txt (the tool created to identify multi-word expressions) and of the information retrieval engine by enlarging the dataset and providing the model with examples from more diverse formats of regulations. There is also a need to develop and align strategies for collecting a comprehensive set of domain vocabularies to be combined in a Knowledge Graph.
Pages: 14