GRADIENT BOOSTED DECISION TREES FOR LITHOLOGY CLASSIFICATION

Cited by: 58
Authors
Dev, Vikrant A. [1 ]
Eden, Mario R. [1 ]
Affiliations
[1] Auburn Univ, Dept Chem Engn, Auburn, AL 36849 USA
Source
PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON FOUNDATIONS OF COMPUTER-AIDED PROCESS DESIGN | 2019 / Vol. 47
Keywords
Gradient Boosted Decision Trees; Lithology Classification; XGBoost; LightGBM; CatBoost;
DOI
10.1016/B978-0-12-818597-1.50019-9
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The classification of underground formation lithology is crucial for petroleum exploration and engineering, as it is the basis of geological research and reservoir parameter calculations. Efforts to automate lithology classification have therefore increased recently, driven by the growing power of inexpensive computing hardware and the availability of open-source machine learning libraries. These developments have opened avenues for the efficient analysis of large volumes of well log data with much higher accuracy. In this regard, recent studies evaluated machine learning methods for classifying formation lithology using data from the Daniudui gas field (DGF) and the Hangjinqi gas field (HGF). Although the machine learning algorithms used in those studies performed well, there is still scope for improvement in predictive ability and scalability. The results obtained from the boosted decision tree learners in those studies were encouraging. Hence, we drew on the state of the art in boosting and implemented algorithms that scale to large datasets. Specifically, we applied XGBoost, LightGBM and CatBoost, which belong to the family of gradient boosted decision trees (GBDTs). After combining the well log data obtained from DGF and HGF, we compared their performance with that of other tree-based machine learning algorithms, namely decision trees (DTs), random forests (RFs), extremely randomized trees (ERTs), AdaBoost and gradient boosting machines (GBMs). We tuned the hyperparameters and then evaluated the resulting models on the test set using the micro average, macro average and weighted average of precision (Pr), recall (Re) and F1-score (F1). Among the algorithms applied, LightGBM achieved the highest values of these metrics. Our work identifies LightGBM and CatBoost as good first-choice algorithms for the supervised classification of lithology from well log data.
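The averaged metrics named in the abstract can be illustrated with a minimal sketch. The record does not describe how the authors computed them, so the code below assumes the common convention: micro averaging pools true positives, false positives and false negatives over all classes, macro averaging takes the unweighted mean of the per-class scores, and weighted averaging takes the mean of the per-class scores weighted by class support. The function name `averaged_prf` and the lithology labels are hypothetical and are not taken from the DGF or HGF data.

```python
from collections import Counter


def averaged_prf(y_true, y_pred):
    """Return micro, macro and weighted averages of (precision, recall, F1).

    Assumes single-label multiclass predictions; a class with no predicted
    or no true samples contributes 0.0 for the undefined ratio.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def prf(tp_, fp_, fn_):
        pr = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        re = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        f1 = 2 * pr * re / (pr + re) if pr + re else 0.0
        return pr, re, f1

    per_class = {c: prf(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)

    # Micro: pool the counts over all classes, then compute the ratios once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of the per-class scores.
    macro = tuple(
        sum(per_class[c][i] for c in labels) / len(labels) for i in range(3)
    )
    # Weighted: mean of the per-class scores, weighted by class support.
    weighted = tuple(
        sum(per_class[c][i] * support[c] for c in labels) / n for i in range(3)
    )
    return {"micro": micro, "macro": macro, "weighted": weighted}


# Hypothetical lithology labels, for illustration only.
y_true = ["sandstone"] * 4 + ["mudstone"] * 2 + ["coal"]
y_pred = ["sandstone", "sandstone", "sandstone", "mudstone",
          "mudstone", "sandstone", "coal"]
scores = averaged_prf(y_true, y_pred)
for name, (pr, re, f1) in scores.items():
    print(f"{name:>8}: Pr={pr:.3f} Re={re:.3f} F1={f1:.3f}")
```

With imbalanced lithology classes the three averages can diverge noticeably, since the macro average treats rare classes the same as common ones while the micro and weighted averages are dominated by the frequent classes.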
Pages: 113-118
Number of pages: 6