Imputation of missing well log data by random forest and its uncertainty analysis

Cited by: 79
Authors
Feng, Runhai [1]
Grana, Dario [2]
Balling, Niels [1]
Affiliations
[1] Aarhus Univ, Dept Geosci, Hoegh Guldbergs Gade 2, DK-8000 Aarhus C, Denmark
[2] Univ Wyoming, Dept Geol & Geophys, 1000 E Univ Ave, Laramie, WY 82071 USA
Keywords
Log imputation; Random forest; Feature importance; Prediction interval; SEISMIC DATA; PREDICTION; CLASSIFICATION; RESERVOIR; VELOCITY
DOI
10.1016/j.cageo.2021.104763
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Well logs are commonly used by geoscientists to infer and extrapolate physical properties of subsurface rocks. However, at some depth intervals, well log values might be missing due to operational issues in the logging process. To overcome this problem, an innovative approach to reconstruct well logs is proposed using machine learning methods. Based on other complete logging features, the missing well log values are predicted by a data-driven machine learning algorithm, namely random forest. A grid-searching scheme is applied to find the combination of hyper-parameters that gives the best cross-validation score. During the training process, the relative importance of different input features is analysed to remove weakly sensitive measurements and prioritize data with strong correlation with the target variables. Principal component analysis is applied to explore the multicollinearity in the input features, such that only a few principal components in the new data vector are used to represent a large fraction of the variance in the original data. To quantify the uncertainty in the predictions, a quantile regression tree is used for determining prediction intervals. Well log data from the Volve Field are used for validation of the prediction obtained by random forest, in which a high correlation coefficient between prediction and reference is achieved. Prediction intervals at different percentiles are estimated, and the predictions are more accurate at depth points where the prediction interval is narrow.
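The imputation and prediction-interval steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data with hypothetical log names (GR, RHOB, NPHI as inputs, DT as the log to impute) and a made-up relation between them, uses only the Python standard library, and omits the grid search, feature-importance analysis and principal component analysis. The interval is taken from percentiles of the training targets pooled over the leaves reached in each tree, which is a simplification of a quantile regression forest (no per-leaf weighting).

```python
import random
import statistics

def best_split(X, y, features, min_leaf):
    """Return (feature, threshold) minimising the summed squared error, or None."""
    n, total, total_sq = len(y), sum(y), sum(v * v for v in y)
    best = None
    for f in features:
        order = sorted(range(n), key=lambda i: X[i][f])
        s = sq = 0.0
        for k in range(1, n):  # k samples go left, n - k go right
            v = y[order[k - 1]]
            s, sq = s + v, sq + v * v
            lo_x, hi_x = X[order[k - 1]][f], X[order[k]][f]
            if k < min_leaf or n - k < min_leaf or lo_x == hi_x:
                continue
            sse = (sq - s * s / k) + (total_sq - sq - (total - s) ** 2 / (n - k))
            if best is None or sse < best[0]:
                best = (sse, f, lo_x)
    return None if best is None else best[1:]

def fit_forest(X, y, n_trees=30, max_depth=5, min_leaf=3, n_feat=2, seed=0):
    """Bagged regression trees; each split sees a random subset of features."""
    rng = random.Random(seed)

    def grow(Xs, ys, depth):
        if depth < max_depth:
            feats = rng.sample(range(len(Xs[0])), n_feat)
            split = best_split(Xs, ys, feats, min_leaf)
            if split is not None:
                f, thr = split
                left = [i for i in range(len(ys)) if Xs[i][f] <= thr]
                right = [i for i in range(len(ys)) if Xs[i][f] > thr]
                return (f, thr,
                        grow([Xs[i] for i in left], [ys[i] for i in left], depth + 1),
                        grow([Xs[i] for i in right], [ys[i] for i in right], depth + 1))
        return list(ys)  # a leaf keeps its training targets

    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]  # bootstrap sample
        forest.append(grow([X[i] for i in idx], [y[i] for i in idx], 0))
    return forest

def leaf_targets(tree, x):
    """Walk one tree down to the leaf that x falls into."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if x[f] <= thr else right
    return tree

def predict(forest, x, lower=5, upper=95):
    """Return (point prediction, lower percentile, upper percentile)."""
    leaves = [leaf_targets(tree, x) for tree in forest]
    point = statistics.fmean([statistics.fmean(leaf) for leaf in leaves])
    pooled = [v for leaf in leaves for v in leaf]
    cuts = statistics.quantiles(pooled, n=100, method="inclusive")
    return point, cuts[lower - 1], cuts[upper - 1]

# Synthetic stand-in for well logs (hypothetical relation, not Volve data).
data_rng = random.Random(42)
X, y = [], []
for _ in range(150):
    gr = data_rng.uniform(20, 120)
    rhob = data_rng.uniform(2.0, 2.7)
    nphi = data_rng.uniform(0.05, 0.35)
    X.append([gr, rhob, nphi])
    y.append(200 - 50 * rhob + 100 * nphi + data_rng.gauss(0, 2))

# Treat the last 30 samples as the interval where DT is missing.
X_train, y_train, X_miss, y_true = X[:120], y[:120], X[120:], y[120:]
forest = fit_forest(X_train, y_train)
results = [predict(forest, x) for x in X_miss]
mae = statistics.fmean([abs(p - t) for (p, _, _), t in zip(results, y_true)])
print(f"MAE on the held-out interval: {mae:.2f}")
```

A narrow gap between the lower and upper percentiles at a depth point indicates that the trees agree on similar training targets there, which is the sense in which the abstract relates narrow intervals to more accurate predictions.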
Pages: 9