A hybrid and exploratory approach to knowledge discovery in metabolomic data

Cited by: 9
Authors
Grissa, Dhouha [1, 4]
Comte, Blandine [1]
Petera, Melanie [2]
Pujos-Guillot, Estelle [1]
Napoli, Amedeo [3]
Affiliations
[1] Univ Clermont Auvergne, INRA, UNH, Mapping, F-63000 Clermont Ferrand, France
[2] Univ Clermont Auvergne, INRA, UNH, Plateforme Explorat Metab,MetaboHUB Clermont, F-63000 Clermont Ferrand, France
[3] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[4] Univ Copenhagen, Novo Nordisk Fdn, Ctr Prot Res, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Keywords
Hybrid knowledge discovery; Pattern mining; Formal concept analysis; Data and pattern exploration; Metabolomic data; Classification; Visualization; Interpretation; Feature selection
DOI
10.1016/j.dam.2018.11.025
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this paper, we propose a hybrid and exploratory knowledge discovery approach for analyzing complex metabolomic data, based on a combination of supervised classifiers, pattern mining, and Formal Concept Analysis (FCA). The approach relies on three main operations: preprocessing, classification, and postprocessing. Classifiers are applied to datasets of the form individuals × features and produce sets of ranked features, which are then analyzed further. Pattern mining and FCA provide a complementary analysis and support for visualization. A practical application of this framework is presented in the context of metabolomic data, where two interrelated problems are considered: discrimination and prediction of class membership. The dataset is characterized by a small set of individuals and a large set of features, in which predictive biomarkers of clinical outcomes should be identified. The problems of combining numerical and symbolic data mining methods, as well as discrimination and prediction, are detailed and discussed. Moreover, it appears that visualization based on FCA can be used both for guiding knowledge discovery and for interpretation by domain analysts. © 2019 Elsevier B.V. All rights reserved.
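As a rough illustration of how ranked features from several classifiers might be handed to FCA, the following minimal sketch builds a binary context (features × classifiers, where a feature "has" a classifier if it appears in that classifier's top k) and enumerates its formal concepts. This is one possible reading of the abstract, not the authors' implementation: the classifier names, feature names, rankings, the cutoff `k`, and the helper functions `build_context` and `formal_concepts` are all hypothetical.

```python
from itertools import combinations


def build_context(rankings, k):
    """Map each feature to the set of classifiers ranking it in their top k.

    `rankings` is an assumed input: classifier name -> features, best first.
    """
    context = {}
    for clf, ranked in rankings.items():
        for feat in ranked[:k]:
            context.setdefault(feat, set()).add(clf)
    return context


def formal_concepts(context, attributes):
    """Enumerate (extent, intent) pairs by closing every attribute subset.

    Brute force, exponential in the number of attributes; acceptable only
    for a handful of classifiers.
    """
    attrs = sorted(attributes)
    concepts = set()
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            # Extent: features selected by every classifier in the subset.
            extent = frozenset(
                f for f, clfs in context.items() if set(subset) <= clfs
            )
            # Intent: classifiers shared by every feature in the extent.
            if extent:
                intent = frozenset(
                    set.intersection(*(context[f] for f in extent))
                )
            else:
                intent = frozenset(attrs)
            concepts.add((extent, intent))
    return concepts


# Hypothetical rankings from three hypothetical classifiers.
rankings = {
    "rf": ["m1", "m2", "m3", "m4"],
    "svm": ["m2", "m1", "m5", "m3"],
    "anova": ["m2", "m6", "m1", "m4"],
}
context = build_context(rankings, k=3)
concepts = formal_concepts(context, rankings.keys())

# Features on which all classifiers agree form the extent of the concept
# whose intent is the full classifier set.
consensus = next(e for e, i in concepts if i == frozenset(rankings))
```

In this toy setting, `consensus` holds the features that every classifier places in its top k, which is the kind of candidate set a domain analyst might inspect first in the concept lattice.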
Pages: 103-116
Page count: 14