A hybrid and exploratory approach to knowledge discovery in metabolomic data

Cited by: 9
Authors
Grissa, Dhouha [1, 4]
Comte, Blandine [1]
Petera, Melanie [2]
Pujos-Guillot, Estelle [1]
Napoli, Amedeo [3]
Affiliations
[1] Univ Clermont Auvergne, INRA, UNH, Mapping, F-63000 Clermont Ferrand, France
[2] Univ Clermont Auvergne, INRA, UNH, Plateforme Explorat Metab,MetaboHUB Clermont, F-63000 Clermont Ferrand, France
[3] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[4] Univ Copenhagen, Novo Nordisk Fdn, Ctr Prot Res, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Keywords
Hybrid knowledge discovery; Pattern mining; Formal concept analysis; Data and pattern exploration; Metabolomic data; Classification; Visualization; Interpretation; Feature selection
DOI
10.1016/j.dam.2018.11.025
Chinese Library Classification (CLC) number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
In this paper, we propose a hybrid and exploratory knowledge discovery approach for analyzing complex metabolomic data, based on a combination of supervised classifiers, pattern mining, and Formal Concept Analysis (FCA). The approach relies on three main operations: preprocessing, classification, and postprocessing. Classifiers are applied to datasets of the form individuals × features and produce sets of ranked features, which are then analyzed further. Pattern mining and FCA provide a complementary analysis and support for visualization. A practical application of this framework is presented in the context of metabolomic data, where two interrelated problems are considered: discrimination and prediction of class membership. The dataset is characterized by a small set of individuals and a large set of features, in which predictive biomarkers of clinical outcomes should be identified. The problems of combining numerical and symbolic data mining methods, as well as discrimination and prediction, are detailed and discussed. Moreover, it appears that visualization based on FCA can be used both for guiding knowledge discovery and for interpretation by domain analysts. (C) 2019 Elsevier B.V. All rights reserved.
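The postprocessing idea in the abstract, where ranked features from several classifiers are analyzed with FCA, can be illustrated with a minimal sketch. This is not the authors' implementation: the classifier names (`rf`, `svm`, `anova`), the feature names (`m1` to `m4`), and the top-k sets are hypothetical toy data, and the concepts are enumerated by brute force, which is only practical for a handful of attributes.

```python
from itertools import combinations

# Hypothetical toy input (not from the paper): for each made-up classifier,
# the set of features it ranked within its top-k.
top_k = {
    "rf": {"m1", "m2", "m3"},
    "svm": {"m1", "m2"},
    "anova": {"m2", "m4"},
}


def build_context(top_k):
    """Build a binary context: feature -> set of classifiers that selected it."""
    context = {}
    for clf, feats in top_k.items():
        for feat in feats:
            context.setdefault(feat, set()).add(clf)
    return context


def extent(context, attrs):
    """Features selected by every classifier in attrs."""
    return frozenset(f for f, clfs in context.items() if attrs <= clfs)


def intent(context, objs, all_attrs):
    """Classifiers that selected every feature in objs."""
    shared = set(all_attrs)
    for feat in objs:
        shared &= context[feat]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    all_attrs = set().union(*context.values())
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for attrs in combinations(sorted(all_attrs), size):
            ext = extent(context, set(attrs))
            concepts.add((ext, intent(context, ext, all_attrs)))
    return concepts


context = build_context(top_k)
concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

In this toy context, the concept whose intent contains all three classifiers has the single feature `m2` as its extent, which is the kind of consensus candidate a domain analyst might inspect first in the concept lattice.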
Pages: 103-116
Number of pages: 14
Related papers
50 in total
  • [21] A Systematic Mapping Study of Data Preparation in Heart Disease Knowledge Discovery
    Benhar, H.
    Idri, A.
    Fernandez-Aleman, J. L.
    JOURNAL OF MEDICAL SYSTEMS, 2019, 43 (01)
  • [22] Visualization Methods for Exploratory Subgroup Discovery on Time Series Data
    Hudson, Dan
    Wiltshire, Travis J.
    Atzmueller, Martin
    BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II, 2022, 13259 : 34 - 44
  • [23] Conjecturable knowledge discovery: A fuzzy clustering approach
    Huang, Tony Cheng-Kui
    Hsu, Wu-Hsien
    Chen, Yen-Liang
    FUZZY SETS AND SYSTEMS, 2013, 221 : 1 - 23
  • [24] Interactive knowledge discovery and data mining on genomic expression data with numeric formal concept analysis
    Gonzalez-Calabozo, Jose M.
    Valverde-Albacete, Francisco J.
    Pelaez-Moreno, Carmen
    BMC BIOINFORMATICS, 2016, 17
  • [25] A metabolomic data fusion approach to support gliomas grading
    Righi, Valeria
    Cavallini, Nicola
    Valentini, Antonella
    Pinna, Giampietro
    Pavesi, Giacomo
    Rossi, Maria Cecilia
    Puzzolante, Annette
    Mucci, Adele
    Cocchi, Marina
    NMR IN BIOMEDICINE, 2020, 33 (03)
  • [26] Knowledge discovery from imbalanced and noisy data
    Van Hulse, Jason
    Khoshgoftaar, Taghi
    DATA & KNOWLEDGE ENGINEERING, 2009, 68 (12) : 1513 - 1542
  • [27] Interactive knowledge discovery and data mining on genomic expression data with numeric formal concept analysis
    Jose M González-Calabozo
    Francisco J Valverde-Albacete
    Carmen Peláez-Moreno
    BMC Bioinformatics, 17
  • [28] An experiment in knowledge discovery using data dependencies
    McErlean, F
    Bell, DA
    KYBERNETES, 1997, 26 (8-9) : 908 - +
  • [29] A distributed evolutionary classifier for knowledge discovery in data mining
    Tan, KC
    Yu, Q
    Lee, TH
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2005, 35 (02) : 131 - 142
  • [30] A Hybrid Approach to Feature Ranking for Microarray Data Classification
    Popovic, Dusan
    Sifrim, Alejandro
    Moschopoulos, Charalampos
    Moreau, Yves
    De Moor, Bart
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PT II, 2013, 384 : 241 - 248