A hybrid and exploratory approach to knowledge discovery in metabolomic data

Cited by: 9
Authors
Grissa, Dhouha [1, 4]
Comte, Blandine [1]
Petera, Melanie [2]
Pujos-Guillot, Estelle [1]
Napoli, Amedeo [3]
Affiliations
[1] Univ Clermont Auvergne, INRA, UNH, Mapping, F-63000 Clermont Ferrand, France
[2] Univ Clermont Auvergne, INRA, UNH, Plateforme Explorat Metab,MetaboHUB Clermont, F-63000 Clermont Ferrand, France
[3] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[4] Univ Copenhagen, Novo Nordisk Fdn, Ctr Prot Res, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Keywords
Hybrid knowledge discovery; Pattern mining; Formal concept analysis; Data and pattern exploration; Metabolomic data; Classification; Visualization; Interpretation; Feature selection
DOI
10.1016/j.dam.2018.11.025
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this paper, we propose a hybrid and exploratory knowledge discovery approach for analyzing complex metabolomic data, based on a combination of supervised classifiers, pattern mining, and Formal Concept Analysis (FCA). The approach relies on three main operations: preprocessing, classification, and postprocessing. Classifiers are applied to datasets of the form individuals × features and produce sets of ranked features, which are then analyzed further. Pattern mining and FCA are used to provide a complementary analysis and support for visualization. A practical application of this framework is presented in the context of metabolomic data, where two interrelated problems are considered: discrimination and prediction of class membership. The dataset is characterized by a small set of individuals and a large set of features, in which predictive biomarkers of clinical outcomes should be identified. The problems of combining numerical and symbolic data mining methods, as well as discrimination and prediction, are detailed and discussed. Moreover, it appears that visualization based on FCA can be used both for guiding knowledge discovery and for interpretation by domain analysts. © 2019 Elsevier B.V. All rights reserved.
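The abstract gives no implementation details, so the following is only a minimal sketch of how ranked features from several classifiers could be turned into a binary context and explored with FCA. The classifier names (`rf`, `svm`, `anova`), the feature names (`m1` to `m4`), the cutoff `k`, and both helper functions are hypothetical and are not taken from the paper. The concept enumeration is a brute-force closure over attribute subsets, suitable only for very small contexts.

```python
from itertools import combinations


def top_k_context(rankings, k):
    """Map each feature to the set of classifiers that rank it in their top k."""
    context = {}
    for clf, ranked in rankings.items():
        for feat in ranked[:k]:
            context.setdefault(feat, set()).add(clf)
    return context


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of attributes."""
    attributes = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(attributes, r):
            # Extent: all objects (features) that have every attribute in the subset.
            extent = frozenset(
                obj for obj, attrs in context.items() if set(subset) <= attrs
            )
            # Intent: all attributes shared by every object in the extent.
            if extent:
                intent = frozenset(set.intersection(*(context[o] for o in extent)))
            else:
                intent = frozenset(attributes)
            concepts.add((extent, intent))
    return concepts


# Hypothetical rankings: each classifier orders the features from most to least important.
rankings = {
    "rf": ["m1", "m2", "m3", "m4"],
    "svm": ["m2", "m1", "m4", "m3"],
    "anova": ["m3", "m2", "m1", "m4"],
}

context = top_k_context(rankings, k=2)
concepts = formal_concepts(context)

for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy context, the concept whose intent contains all three classifiers groups the features that every classifier places in its top 2, which is the kind of agreement a concept lattice makes visible to an analyst.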
Pages: 103-116
Page count: 14
Related papers
50 in total
  • [41] A New Parallel Memetic Algorithm to Knowledge Discovery in Data Mining
    Oualid, Dahmri
    Baba-Ali, Ahmed Riadh
    SWARM INTELLIGENCE BASED OPTIMIZATION, ICSIBO 2016, 2016, 10103 : 87 - 101
  • [42] Modeling of Distributed visual Knowledge Discovery from Data Process
    Ellouzi, Hamdi
    ben Ayed, Mounir
    Ltifi, Hela
    2017 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (IEEE ISKE), 2017,
  • [43] Applications of rough sets theory in data preprocessing for knowledge discovery
    Coaquira, Frida
    Acuna, Edgar
    WCECS 2007: WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE, 2007, : 707 - +
  • [44] Neurofuzzy and EUFABES as tools for knowledge discovery in visual field data
    Zahlmann, G
    Scherf, M
    Wegner, A
    PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOL 20, PTS 1-6: BIOMEDICAL ENGINEERING TOWARDS THE YEAR 2000 AND BEYOND, 1998, 20 : 1360 - 1362
  • [45] Knowledge discovery and variable scale evaluation for long series data
    Zhai, Yanwei
    Lv, Zheng
    Zhao, Jun
    Wang, Wei
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (04) : 3157 - 3180
  • [46] The center for causal discovery of biomedical knowledge from big data
    Cooper, Gregory F.
    Bahar, Ivet
    Becich, Michael J.
    Benos, Panayiotis V.
    Berg, Jeremy
    Espino, Jeremy U.
    Glymour, Clark
    Jacobson, Rebecca Crowley
    Kienholz, Michelle
    Lee, Adrian V.
    Lu, Xinghua
    Scheines, Richard
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2015, 22 (06) : 1132 - 1136
  • [47] An overview of interactive visual data mining techniques for knowledge discovery
    Stahl, Frederic
    Gabrys, Bogdan
    Gaber, Mohamed Medhat
    Berendsen, Monika
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (04) : 239 - 256
  • [48] Visualization and Visual Knowledge Discovery from Big Uncertain Data
    Leung, Carson K.
    Madill, Evan W. R.
    Pazdor, Adam
    2022 26TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION (IV), 2022, : 330 - 335
  • [49] Multivariate classification analysis of metabolomic data for candidate biomarker discovery in type 2 diabetes mellitus
    Yang Qiu
    Dilip Rajagopalan
    Susan C. Connor
    Doris Damian
    Lei Zhu
    Amir Handzel
    Guanghui Hu
    Arshad Amanullah
    Steve Bao
    Nathaniel Woody
    David MacLean
    Kwan Lee
    Dana Vanderwall
    Terence Ryan
    Metabolomics, 2008, 4
  • [50] Exploratory functional data analysis
    Qu, Zhuo
    Dai, Wenlin
    Euan, Carolina
    Sun, Ying
    Genton, Marc G.
    TEST, 2024,