Controlling the false discovery rate for feature selection in high-resolution NMR spectra

Cited by: 18
Authors
Kim, Seoung Bum [1]
Chen, Victoria C. P. [1]
Park, Youngja [2]
Ziegler, Thomas R. [2]
Jones, Dean P. [2]
Affiliations
[1] Department of Industrial and Manufacturing Systems Engineering, University of Texas at Arlington, Arlington, TX
[2] Clinical Biomarker Laboratory, Center for Clinical and Molecular Nutrition, Department of Medicine, Emory University, Atlanta, GA
Source
Statistical Analysis and Data Mining | 2008 / Vol. 1 / No. 2
Keywords
False discovery rate; Feature selection; Metabolomics; Nuclear magnetic resonance; Orthogonal signal correction
DOI
10.1002/sam.10005
Abstract
Successful implementation of feature selection in nuclear magnetic resonance (NMR) spectra not only improves classification ability, but also simplifies the entire modeling process and, thus, reduces computational and analytical efforts. Principal component analysis (PCA) and partial least squares (PLS) have been widely used for feature selection in NMR spectra. However, extracting meaningful metabolite features from the reduced dimensions obtained through PCA or PLS is complicated because these reduced dimensions are linear combinations of a large number of the original features. In this paper, we propose a multiple testing procedure controlling false discovery rate (FDR) as an efficient method for feature selection in NMR spectra. The procedure clearly compensates for the limitation of PCA and PLS and identifies individual metabolite features necessary for classification. In addition, we present orthogonal signal correction to improve classification and visualization by removing unnecessary variations in NMR spectra. Our experimental results with real NMR spectra showed that classification models constructed with the features selected by our proposed procedure yielded smaller misclassification rates than those with all features. © 2008 Wiley Periodicals, Inc.
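The abstract does not say which FDR-controlling procedure the authors use. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up rule, applied to one p-value per spectral feature. The function name, the p-values, and the level `alpha = 0.05` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Return indices of features selected by the Benjamini-Hochberg step-up rule.

    Assumption: each p-value comes from a per-feature two-group test; the
    paper's actual test statistic is not given in the abstract.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Order feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Select every feature ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for ten spectral features (illustrative only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p, alpha=0.05)
print(selected)
```

The returned indices refer to individual original features, so each one can be inspected as a candidate metabolite signal directly, unlike a PCA or PLS component that mixes many features.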
Pages: 57-66
Page count: 9
Related papers
50 in total
  • [31] Controlling the False Discovery Rate of the Association/Causality Structure Learned with the PC Algorithm
    Li, Junning
    Wang, Z. Jane
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 475 - 514
  • [32] Stepwise normal theory multiple test procedures controlling the false discovery rate
    Troendle, JF
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2000, 84 (1-2) : 139 - 158
  • [33] A False Discovery Rate approach to optimal volatility forecasting model selection
    Hassanniakalager, Arman
    Baker, Paul L.
    Platanakis, Emmanouil
    INTERNATIONAL JOURNAL OF FORECASTING, 2024, 40 (03) : 881 - 902
  • [34] False Discovery Rate Control in Cancer Biomarker Selection Using Knockoffs
    Shen, Arlina
    Fu, Han
    He, Kevin
    Jiang, Hui
    CANCERS, 2019, 11 (06):
  • [35] Estimating a positive false discovery rate for variable selection in pharmacogenetic studies
    Li, Lang
    Hui, Sin
    Pennello, Gene
    Desta, Zeruesenay
    Todd, Skaar
    Nguyen, Anne
    Flockhart, David
    JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2007, 17 (05) : 883 - 902
  • [36] High-resolution NMR spectra in inhomogeneous and unstable fields via the three-pulse method
    Peng, Ling
    Zheng, Zhenyao
    Huang, Yuqing
    Zhang, Zhenmin
    Cai, Shuhui
    Chen, Zhong
    MOLECULAR PHYSICS, 2010, 108 (14) : 1869 - 1875
  • [37] Intermolecular single-quantum coherence sequences for high-resolution NMR spectra in inhomogeneous fields
    Huang, Yuqing
    Cai, Shuhui
    Chen, Xi
    Chen, Zhong
    JOURNAL OF MAGNETIC RESONANCE, 2010, 203 (01) : 100 - 107
  • [38] High-resolution NMR spectroscopy in inhomogeneous fields
    Chen, Zhong
    Cai, Shuhui
    Huang, Yuqing
    Lin, Yulan
    PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY, 2015, 90-91 : 1 - 31
  • [39] A roadmap to high-resolution standard microcoil MAS NMR spectroscopy for metabolomics
    Wong, Alan
    NMR IN BIOMEDICINE, 2023, 36 (04)
  • [40] A robust false discovery rate controlling procedure using the empirical likelihood with a fast algorithm
    Park, Hoyoung
    Park, Junyong
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2024, 94 (05) : 1097 - 1120