Learning to classify organic and conventional wheat - a machine learning driven approach using the MeltDB 2.0 metabolomics analysis platform

Cited by: 21
Authors
Kessler, Nikolas [1, 2]
Bonte, Anja [3]
Albaum, Stefan P. [2]
Maeder, Paul [4]
Messmer, Monika [5]
Goesmann, Alexander [6]
Niehaus, Karsten [7]
Langenkaemper, Georg [3]
Nattkemper, Tim W. [1]
Affiliations
[1] Bielefeld Univ, Fac Technol, Biodata Min Grp, Bielefeld, Germany
[2] Bielefeld Univ, Ctr Biotechnol, Bioinformat Resource Facil, Bielefeld, Germany
[3] Max Rubner Inst, Dept Safety & Qual Cereals, Detmold, Germany
[4] Res Inst Organ Agr FiBL, Dept Soil Sci, Frick, Switzerland
[5] Res Inst Organ Agr FiBL, Dept Crop Sci, Frick, Switzerland
[6] Justus Liebig Univ Giessen, Bioinformat & Syst Biol, Giessen, Germany
[7] Bielefeld Univ, Fac Biol, Ctr Biotechnol, Dept Proteome & Metabolome Res, Bielefeld, Germany
Source
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY | 2015 / Volume 3
Funding
European Union Seventh Framework Programme;
Keywords
metabolome informatics; statistics; metabolomics; computational metabolomics; organic farming; food authentication; machine learning; AUTHENTICATION;
DOI
10.3389/fbioe.2015.00035
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
We present results of our machine learning approach to classifying GC-MS data originating from wheat grains of different farming systems. The aim is to investigate the potential of learning algorithms to classify GC-MS data as originating from either conventionally or organically grown samples, taking different cultivars into account. The motivation for our work is twofold: the increased demand for organic food in post-industrialized societies and the necessity to prove organic food authenticity. Our data set comprises up to 11 wheat cultivars that were cultivated in both farming systems, organic and conventional, over 3 years. More than 300 GC-MS measurements were recorded and subsequently processed and analyzed in the MeltDB 2.0 metabolomics analysis platform, which is briefly outlined in this paper. We further describe how unsupervised (t-SNE, PCA) and supervised (SVM) methods can be applied for sample visualization and classification. Our results show that the year has the strongest and the wheat cultivar the second-strongest influence on the metabolic composition of a sample. We also show that, for a given year and cultivar, organic and conventional cultivation can be distinguished by machine-learning algorithms.
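The workflow summarized in the abstract (an unsupervised projection to see which factor dominates, then supervised classification within a fixed year and cultivar) can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the data are synthetic, the effect sizes are invented so that year outweighs cultivar, which outweighs farming system, the first principal component is found by plain power iteration, and a nearest-centroid classifier stands in for the SVM. t-SNE is not covered. All function names are hypothetical.

```python
import random
from collections import defaultdict


def make_samples(seed=0, n_features=6, replicates=4):
    """Build synthetic 'metabolite intensity' vectors (all effect sizes invented)."""
    rng = random.Random(seed)
    samples = []
    for year in range(3):
        for cultivar in range(4):
            for system in ("organic", "conventional"):
                for _ in range(replicates):
                    vec = [rng.gauss(0.0, 0.1) for _ in range(n_features)]
                    vec[0] += 5.0 * year        # assumed strongest effect: year
                    vec[1] += 2.0 * cultivar    # assumed second strongest: cultivar
                    if system == "organic":
                        # assumed weakest effect; the affected feature varies by group
                        vec[2 + (year + cultivar) % 3] += 1.0
                    samples.append(((year, cultivar), system, vec))
    return samples


def first_principal_component(vectors, iterations=200):
    """Dominant eigenvector of the covariance matrix, found by power iteration."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    w = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return w


def centroid(vectors):
    """Feature-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_out_accuracy(samples):
    """Nearest-centroid prediction of farming system within each (year, cultivar) group."""
    groups = defaultdict(list)
    for key, system, vec in samples:
        groups[key].append((system, vec))
    correct = total = 0
    for members in groups.values():
        for i, (true_system, vec) in enumerate(members):
            rest = members[:i] + members[i + 1:]  # hold out sample i
            centroids = {
                s: centroid([v for lab, v in rest if lab == s])
                for s in {lab for lab, _ in rest}
            }
            predicted = min(centroids, key=lambda s: sq_dist(vec, centroids[s]))
            correct += predicted == true_system
            total += 1
    return correct / total


samples = make_samples()
pc1 = first_principal_component([vec for _, _, vec in samples])
dominant_feature = max(range(len(pc1)), key=lambda j: abs(pc1[j]))
accuracy = leave_one_out_accuracy(samples)
print("feature with largest PC1 loading:", dominant_feature)
print("within-group leave-one-out accuracy:", round(accuracy, 3))
```

In this toy setup the first principal component loads mainly on the feature carrying the invented year effect, which mirrors the ordering of influences reported in the abstract only because the synthetic data were built that way. Grouping by (year, cultivar) before classifying removes the two larger effects, so the small farming-system shift becomes the main difference between the class centroids.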
Pages: 10