Automated Classification of Construction Claim Documents Using Text Mining

Cited by: 0
Authors
Malaeb, Zeina [1]
Momenifar, Samaneh [1]
Rehman, Tooba [1]
Biglari, Ava [1]
Mohammed, Yasser [1]
Karim, Mohammad Rezaul [1]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
Source
PROCEEDINGS OF THE CANADIAN SOCIETY FOR CIVIL ENGINEERING ANNUAL CONFERENCE 2023, VOL 5, CSCE 2023 | 2024 / Vol. 499
Keywords
Claims; Classification; Text mining; Management
DOI
10.1007/978-3-031-61503-0_23
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Claims are inevitable on construction projects, and proper claim management is crucial to prevent their escalation into disputes. Claim preparation relies heavily on documentation and involves classifying large amounts of information. This process is typically performed by human experts (contract administrators), which demands substantial time, effort, and cost and is prone to human error. Researchers have concluded that the conventional documentation system is inefficient and must be enhanced. Given the important role of text mining in the construction field, text mining could potentially improve claim preparation. To the authors' best knowledge, no previous research has developed an automated tool to classify correspondence on construction projects for use in claims. Accordingly, this paper aims to improve information organization in claims management by developing a text-mining classification tool that automatically classifies project correspondence as relevant or irrelevant to a claim. Four classification algorithms (KNN, Naive Bayes, SVM, and random forest) are trained and tested with various parameters on a training dataset of 213 documents from one construction project to determine the optimized setting for each classifier. The optimized classifiers are then applied to a testing dataset of 205 documents from a second construction project to analyze their performance and applicability across projects with different characteristics. The results reveal that the random forest classifier achieves high recall (93%) for claim-relevant documents with average accuracy (65%), while SVM achieves acceptable recall (80%) with higher accuracy (79%).
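The train-on-one-project, test-on-another workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a hand-written multinomial Naive Bayes (one of the four algorithm families the abstract names) with add-one smoothing, and every document, label, and function name below is a hypothetical placeholder chosen for illustration. It also shows how recall for the claim-relevant class and overall accuracy, the two metrics reported in the abstract, would be computed.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would also
    # strip punctuation and stop words.
    return text.lower().split()

def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model (counts only)."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def recall_and_accuracy(y_true, y_pred, positive="relevant"):
    """Recall for the positive class and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return recall, correct / len(pairs)

# Hypothetical "project A" correspondence used for training.
train_docs = [
    "delay notice for late site access",
    "additional cost due to design change",
    "extension of time request for delay",
    "weekly safety meeting minutes",
    "site cafeteria menu update",
    "parking permit renewal reminder",
]
train_labels = ["relevant"] * 3 + ["irrelevant"] * 3

# Hypothetical "project B" correspondence used for testing.
test_docs = ["notice of delay and additional cost", "safety meeting reminder"]
test_labels = ["relevant", "irrelevant"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
recall, accuracy = recall_and_accuracy(test_labels, predictions)
print(predictions, recall, accuracy)
```

Recall is tracked separately from accuracy because, in a claim-screening setting, missing a claim-relevant document (a false negative) is a different kind of error from flagging an irrelevant one. That is why the abstract can describe one classifier as having higher recall and another as having higher accuracy.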
Pages: 313-325
Number of pages: 13