Towards Automated Lithology Classification in NATM Tunnel: A Data-Driven Solution for Multi-dimensional Imbalanced Data

Times cited: 0
Authors
Li, Yang [1, 2]
Chen, Jiayao [1, 2, 4]
Fang, Qian [1, 2]
Zhang, Dingli [1, 2]
Huang, Wengui [3]
Affiliations
[1] Beijing Jiaotong Univ, Sch Civil Engn, Beijing 100044, Peoples R China
[2] Beijing Jiaotong Univ, Key Lab Urban Underground Engn, Minist Educ, Beijing 100044, Peoples R China
[3] Teesside Univ, Sch Comp Engn & Digital Technol, Middlesbrough TS1 3BA, England
[4] East China Jiaotong Univ, State Key Lab Performance Monitoring & Protecting, Nanchang, Jiangxi, Peoples R China
Keywords
New Austrian tunneling method; Measurement-while-drilling; Lithology classification; Machine learning; Multi-dimensional imbalanced data; ROCK STRENGTH PARAMETERS; RANDOM FORESTS; PREDICTION; SYSTEM; RECOGNITION; TECHNOLOGY; TESTS; MODEL; INDEX
DOI
10.1007/s00603-024-04287-6
Chinese Library Classification number
P5 [Geology]
Discipline classification code
0709; 081803
Abstract
To characterize the lithology of unexcavated tunnel ground, a correlation database was established from measurement-while-drilling (MWD) information recorded during NATM tunnel excavation, yielding a multi-dimensional imbalanced dataset of 7216 entries. Drilling parameters were aligned with lithology data by integrating borehole imaging and expert interpretation. A hybrid ensemble model combining adaptive synthetic sampling (ADASYN), grid search (GS) hyperparameter optimization, and eXtreme gradient boosting (XGBoost) is proposed for intelligent lithology classification. Several machine learning models were combined with hyperparameter optimization and oversampling algorithms, producing 12 classifiers whose performance was compared by Macro F1. Comprehensive analysis showed that the GS-ADASYN-XGBoost algorithm outperformed the other hybrid models in classifying different lithologies. Water pressure was identified as the key feature influencing lithology classification, followed by water flow. With the oversampling proportion set to 0.2, the ADASYN method effectively optimized the data imbalance ratio and significantly enhanced classifier performance. This improvement was most notable for the least represented lithology category, chlorite, with an increase of 1.27 times compared to no oversampling. The proposed model provides valuable insights for geological interpretation of the tunnel face.

A hybrid GS-ADASYN-XGBoost model is proposed for classifying lithologies. A database with 7216 MWD entries from NATM tunnel excavation is established. Borehole imaging and expert interpretation align drilling parameters with lithology. Multi-dimensional data imbalance is effectively optimized by ADASYN.
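
The abstract compares classifiers by Macro F1, which averages the per-class F1 scores with equal weight so that a rare class such as chlorite counts as much as a common one. The following is a minimal sketch of that metric in plain Python, not the authors' code; the label `majority_rock` and the toy predictions are hypothetical and only illustrate why accuracy alone can hide poor minority-class performance.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy data: 8 majority samples, 2 chlorite samples.
y_true = ["majority_rock"] * 8 + ["chlorite"] * 2
# A classifier that ignores the minority class entirely.
y_pred = ["majority_rock"] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", round(macro_f1(y_true, y_pred), 4))
```

In this toy case the accuracy looks acceptable while Macro F1 is pulled down by the missed minority class, which is the kind of gap an oversampling step such as ADASYN is meant to narrow.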
Pages: 2349-2366
Number of pages: 18