Improved well-log classification using semisupervised label propagation and self-training, with comparisons to popular supervised algorithms

Cited by: 19

Authors:

Dunham, Michael W. [1]

Malcolm, Alison [1]

Welford, J. Kim [1]

Affiliations:

[1] Mem Univ Newfoundland, Dept Earth Sci, St John, NF A1B 3X5, Canada

Source:

GEOPHYSICS | 2020 / Vol. 85 / No. 1

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

FACIES CLASSIFICATION; IDENTIFICATION;

DOI:

10.1190/GEO2019-0238.1

Chinese Library Classification (CLC) codes:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708 ; 070902 ;

Abstract:

Machine-learning techniques allow geoscientists to extract meaningful information from data in an automated fashion, and they are also an efficient alternative to traditional manual interpretation methods. Many geophysical problems have an abundance of unlabeled data and a paucity of labeled data, and the lithology classification of wireline data reflects this situation. Training supervised algorithms on small labeled data sets can lead to overtraining, and subsequent predictions for the numerous unlabeled data may be unstable. However, semisupervised algorithms are designed for classification problems with limited amounts of labeled data, and they are theoretically able to achieve better accuracies than supervised algorithms in these situations. We explore this hypothesis by applying two semisupervised techniques, label propagation (LP) and self-training, to a well-log data set and compare their performance to three popular supervised algorithms. LP is an established method, but our self-training method is a unique adaptation of existing implementations. The well-log data were made public through an SEG competition held in 2016. We simulate a semisupervised scenario with these data by assuming that only one of the 10 wells has labels (i.e., core samples), and our objective is to predict the labels for the remaining nine wells. We generate results from these data in two stages. The first stage is applying all the algorithms in question to the data as is (i.e., the global data), and the results from this motivate the second stage, which is applying all algorithms to the data when they are decomposed into two separate data sets. Overall, our findings suggest that LP does not outperform the supervised methods, but our self-training method coupled with LP can outperform the supervised methods by a notable margin if the assumptions of LP are met.


Pages: O1-O15

Page count: 15

