Unsupervised learning on scientific ocean drilling datasets from the South China Sea

Cited by: 4
Authors
Tse, Kevin C. [1]
Chiu, Hon-Chim [2, 3]
Tsang, Man-Yin [4]
Li, Yiliang [1]
Lam, Edmund Y. [5]
Affiliations
[1] Univ Hong Kong, Dept Earth Sci, Pokfulam, Hong Kong, Peoples R China
[2] Hong Kong Baptist Univ, Dept Geog, Kowloon Tong, Hong Kong, Peoples R China
[3] Hong Kong Baptist Univ, Ctr Geocomputat Studies, Kowloon Tong, Hong Kong, Peoples R China
[4] Univ Toronto, Dept Earth Sci, Toronto, ON M5S 2M8, Canada
[5] Univ Hong Kong, Dept Elect & Elect Engn, Pokfulam, Hong Kong, Peoples R China
Keywords
machine learning; unsupervised learning; ODP; IODP; clustering; self-organizing maps; machine; classification; lithology; Greenland
DOI
10.1007/s11707-018-0704-1
Chinese Library Classification (CLC) number
P [Astronomy, Earth Sciences]
Discipline classification code
07
Abstract
Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean-floor sediment cores recovered by scientific ocean drilling in the South China Sea. Previous studies on similar datasets used supervised learning methods, which make predictions from labeled training samples; unsupervised learning methods, by contrast, require no a priori information and work only from the input data. In this study, popular unsupervised learning methods, including K-means, self-organizing maps, hierarchical clustering, and random forest, were coupled with different distance metrics to form exploratory data clusters. The resulting clusters were externally validated against the lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected clusters showed varying degrees of correspondence with the existing classifications by lithologic unit and geologic time scale. K-means and self-organizing maps corresponded better with lithologic units, while random forest corresponded best with geologic time scales. This study offers a pioneering example of how unsupervised machine learning methods can serve as an automatic processing tool for the growing volume of scientific ocean drilling data.
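The workflow in the abstract (cluster the measurements without labels, then compare the clusters with conventionally assigned lithologic units) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the two features are hypothetical physical properties, the helper names `kmeans` and `purity` are invented for illustration, and purity is only one possible external validation measure, since the abstract does not name the one used.

```python
import random
from collections import Counter


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels


def purity(cluster_labels, reference_labels):
    """Fraction of samples that belong to the majority reference class of their cluster."""
    by_cluster = {}
    for c, r in zip(cluster_labels, reference_labels):
        by_cluster.setdefault(c, Counter())[r] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(cluster_labels)


# Synthetic, made-up measurements: two hypothetical properties per core sample.
# Real data would normally be standardized before clustering; skipped here for brevity.
rng = random.Random(42)
unit_a = [(rng.gauss(1.5, 0.05), rng.gauss(10.0, 1.0)) for _ in range(30)]
unit_b = [(rng.gauss(2.0, 0.05), rng.gauss(40.0, 1.0)) for _ in range(30)]
samples = unit_a + unit_b
units = ["Unit I"] * 30 + ["Unit II"] * 30  # reference labels, used only for validation

clusters = kmeans(samples, k=2)
score = purity(clusters, units)
print(f"purity against reference units: {score:.2f}")
```

The reference labels never enter the clustering step; they are consulted only afterwards, which is what distinguishes this external validation from supervised training.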
Pages: 180-190
Page count: 11