Benefits of dimension reduction in penalized regression methods for high-dimensional grouped data: a case study in low sample size

Cited by: 16
Authors
Ajana, Soufiane [1]
Acar, Niyazi [2]
Bretillon, Lionel [2]
Hejblum, Boris P. [3, 4]
Jacqmin-Gadda, Helene [5]
Delcourt, Cecile [1]
Berdeaux, Olivier [2]
Bouton, Sylvain [6]
Bron, Alain [2, 7]
Buaud, Benjamin [8]
Cabaret, Stephanie [2]
Cougnard-Gregorie, Audrey [1]
Creuzot-Garcher, Catherine [2, 7]
Delyfer, Marie-Noelle [1, 9]
Feart-Couret, Catherine [1]
Febvret, Valerie [2]
Gregoire, Stephane [2]
He, Zhiguo [10]
Korobelnik, Jean-Francois [1, 9]
Martine, Lucy [2]
Merle, Benedicte [1]
Vaysse, Carole [8]
Affiliations
[1] Univ Bordeaux, INSERM, Bordeaux Populat Hlth Res Ctr, Team LEHA, UMR 1219, F-33000 Bordeaux, France
[2] Univ Bourgogne Franche Comte, AgroSup Dijon, Ctr Sci Gout & Alimentat, CNRS, INRA, Dijon, France
[3] Univ Bordeaux, INSERM, Bordeaux Populat Hlth Res Ctr 1219, ISPED, Inria SISTM, F-33000 Bordeaux, France
[4] Hop Henri Mondor, VRI, Creteil, France
[5] Univ Bordeaux, Team Biostat, UMR 1219, INSERM, Bordeaux Populat Hlth Res Ctr, F-33000 Bordeaux, France
[6] Lab Thea, Clermont Ferrand, France
[7] Univ Hosp, Dept Ophthalmol, Dijon, France
[8] Equipe Nutr Metab & Sante, ITERG, Bordeaux, France
[9] CHU Bordeaux, Serv Ophtalmol, F-33000 Bordeaux, France
[10] Univ Jean Monnet, Fac Med, EA2521, Lab Biol Imaging & Engn Corneal Grafts, St Etienne, France
Keywords
LEAST-SQUARES REGRESSION; VARIABLE SELECTION; CROSS-VALIDATION; FATTY-ACID; REGULARIZATION; IDENTIFICATION; RETINA; ERROR; GENE; PART
DOI
10.1093/bioinformatics/btz135
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: In some prediction analyses, predictors have a natural grouping structure, and selecting predictors while accounting for this additional information could be more effective for predicting the outcome accurately. Moreover, in a high-dimension, low-sample-size framework, obtaining a good predictive model becomes very challenging. The objective of this work was to investigate the benefits of dimension reduction in penalized regression methods, in terms of prediction performance and variable selection consistency, in high-dimension, low-sample-size data. Using two real datasets, we compared the performances of lasso, elastic net, group lasso, sparse group lasso, sparse partial least squares (PLS), group PLS and sparse group PLS.

Results: Considering dimension reduction in penalized regression methods improved the prediction accuracy. The sparse group PLS reached the lowest prediction error while consistently selecting a few predictors from a single group.
Pages: 3628-3634
Page count: 7