Evaluation of Sparse Proximal Multi-Task Learning for Genome-Wide Prediction

Cited by: 2

Authors:

Fan, Yuhua [1]

Launonen, Ilkka [1]

Sillanpaa, Mikko J. [1]

Waldmann, Patrik [1]

Affiliations:

[1] Univ Oulu, Res Unit Math Sci, Oulu 90014, Finland

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Multi-task learning; genome-wide prediction; regularization; proximal algorithm; sparsity; Bayesian optimization; RACHFORD SPLITTING METHOD; VARIABLE SELECTION; ALGORITHMS; LASSO; REGULARIZATION; SHRINKAGE; MODELS;

DOI:

10.1109/ACCESS.2024.3386093

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Multi-task learning (MTL) is a learning paradigm that aims to leverage information shared across related tasks to improve the generalization of models. Motivated by the success of proximal optimization algorithms and single-task learning regression models, sparse proximal multi-task learning (SPMTL) for genome-wide prediction (GWP) should be explored. This study investigates proximal gradient descent splitting algorithms with five non-smooth sparsity-inducing norm regularizers, including the novel $L_{2,1/2}$ norm for GWP. Additionally, two popular methods based on Markov chain Monte Carlo (MCMC) are examined. To improve computational efficiency, a parallel Bayesian optimization strategy is employed for hyperparameter tuning. Evaluation is conducted on three real-world genomic datasets from mice, pigs, and wheat, associated with two, five, and four traits, respectively. Performance is assessed using the mean squared error (MSE) and the correlation coefficient between predicted and observed trait values in the test sets. Experimental results reveal that the $L_{2,1/2}$ regularizer consistently achieves the best out-of-sample prediction across all datasets, demonstrating the effectiveness of SPMTL in leveraging shared information for improved GWP accuracy. Furthermore, the influence of different regularizers on sparsity and other properties of the SPMTL model is also explored.
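The kind of procedure the abstract describes can be illustrated with a minimal sketch, given here under stated assumptions and not as the authors' implementation. It uses the standard $L_{2,1}$ norm (row-wise group soft-thresholding) in place of the paper's $L_{2,1/2}$ norm, because its proximal operator has a simple, well-known closed form. The function names `prox_l21`, `proximal_gradient_mtl`, and `evaluate`, the squared-error loss scaling, and the fixed step size are all hypothetical choices made for illustration.

```python
import numpy as np


def prox_l21(W, tau):
    """Proximal operator of tau * sum_j ||W[j, :]||_2 (row-wise group soft-thresholding)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Rows whose norm is below tau are set to zero; the rest are shrunk towards zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W


def proximal_gradient_mtl(X, Y, lam, n_iter=500):
    """Minimise 0.5/n * ||Y - XW||_F^2 + lam * ||W||_{2,1} by proximal gradient descent.

    X: (n, p) marker matrix, Y: (n, T) trait matrix, W: (p, T) marker effects.
    """
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n                # gradient of the smooth loss
        W = prox_l21(W - step * grad, step * lam)   # proximal (shrinkage) step
    return W


def evaluate(Y_true, Y_pred):
    """Per-trait MSE and Pearson correlation between observed and predicted values."""
    mse = np.mean((Y_true - Y_pred) ** 2, axis=0)
    r = np.array([
        np.corrcoef(Y_true[:, t], Y_pred[:, t])[0, 1]
        for t in range(Y_true.shape[1])
    ])
    return mse, r
```

In this sketch, a marker whose row of `W` is zeroed by `prox_l21` is dropped for all traits at once, which is how a row-wise norm shares information across tasks. Calling `evaluate(Y_test, X_test @ W_hat)` on held-out data would give the two per-trait metrics named in the abstract; `lam` is the hyperparameter that would be tuned.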

Pages: 51665-51675

Page count: 11
