A framework of active learning and semi-supervised learning for lithology identification based on improved naive Bayes

Cited by: 39
Authors
Ren, Quan [1]
Zhang, Hongbing [1]
Zhang, Dailu [1]
Zhao, Xiang [1]
Yan, Lizhi [1]
Rui, Jianwen [2]
Zeng, Fanxin [1]
Zhu, Xinyi [1]
Affiliations
[1] Hohai Univ, Sch Earth Sci & Engn, Nanjing 211100, Peoples R China
[2] Nanjing Vocat, Sch Artificial Intelligence, Coll Informat Technol, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Lithology identification; Active learning; Semi-supervised learning; Logging data; Naive Bayes; PERMEABILITY; PREDICTION; CLASSIFIER; RESERVOIR;
DOI
10.1016/j.eswa.2022.117278
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Lithology identification is the basis of energy exploration and reservoir evaluation, and intelligent, accurate identification of underground lithology is a key issue. Building machine learning lithology identification models from logging data has been an active research direction in recent years. However, logging data are highly nonlinear and exhibit multi-response characteristics, and the training data set contains too few labeled samples. These factors reduce modeling accuracy and may cause over-fitting. Therefore, a framework of active learning and semi-supervised learning for lithology identification based on improved naive Bayes (ALSLINB) is proposed. The contributions are fourfold: (i) A Gaussian mixture model (GMM) fitted with the EM algorithm is used to estimate the probability density of the log data, which fits the probability distribution of the nonlinear multi-response log data. (ii) A framework combining active learning (AL) and semi-supervised learning is proposed to expand the labeled samples in the training data set. (iii) Pseudo-label detection is applied to improve the reliability of the pseudo-labeled samples. (iv) Unlike general deterministic lithology identification methods, the ALSLINB algorithm attaches a probability score to each result, which provides an auxiliary basis for the prediction. Finally, the ALSLINB algorithm is evaluated in extensive experiments on two different data sets and compared with related baseline methods to verify its effectiveness and generalization ability. The results show that the ALSLINB algorithm performs the lithology recognition task well, with high accuracy and robustness, and provides a new direction for intelligent lithology identification.
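Illustrative sketch (not taken from the paper): the abstract describes a naive Bayes classifier whose class-conditional densities are Gaussian mixtures and whose output is a probability score used to decide which samples to label. The Python sketch below shows one way such a posterior could be computed and then used to route samples, assuming the mixture parameters have already been estimated by EM. The lithology names, feature values, mixture parameters, thresholds, and the `route` rule are all hypothetical, and the pseudo-label detection step is not shown.

```python
import math

def gmm_pdf(x, components):
    """Density of a 1-D Gaussian mixture; components = [(weight, mean, std), ...]."""
    return sum(
        w * math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        for w, mu, sd in components
    )

def posterior(sample, priors, densities):
    """Naive Bayes posterior: P(c | x) is proportional to P(c) * prod_j p(x_j | c)."""
    log_scores = {}
    for c, prior in priors.items():
        log_p = math.log(prior)
        for x_j, comps in zip(sample, densities[c]):
            log_p += math.log(gmm_pdf(x_j, comps) + 1e-300)  # guard against log(0)
        log_scores[c] = log_p
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

def route(post, confident=0.95, uncertain=0.60):
    """Hypothetical rule: pseudo-label confident samples, query an expert on uncertain ones."""
    label, score = max(post.items(), key=lambda kv: kv[1])
    if score >= confident:
        return "pseudo_label", label
    if score <= uncertain:
        return "query_expert", None
    return "leave_unlabeled", None

# Hypothetical, hand-picked mixture parameters for two log features
# (for example gamma ray and bulk density); in practice EM would estimate them.
PRIORS = {"sandstone": 0.5, "shale": 0.5}
DENSITIES = {
    "sandstone": [[(0.7, 40.0, 8.0), (0.3, 60.0, 10.0)], [(1.0, 2.35, 0.08)]],
    "shale": [[(0.6, 110.0, 12.0), (0.4, 90.0, 10.0)], [(1.0, 2.55, 0.08)]],
}

clear = posterior((42.0, 2.33), PRIORS, DENSITIES)
ambiguous = posterior((75.0, 2.45), PRIORS, DENSITIES)
print(route(clear), route(ambiguous))
```

In this sketch, a sample whose top posterior is high receives a pseudo-label (the semi-supervised path), while a sample whose top posterior is low is sent for expert labeling (the active learning path). The two thresholds are arbitrary choices for illustration.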
Pages: 14