Evaluation of Sparse Proximal Multi-Task Learning for Genome-Wide Prediction

Cited by: 2
Authors
Fan, Yuhua [1 ]
Launonen, Ilkka [1 ]
Sillanpaa, Mikko J. [1 ]
Waldmann, Patrik [1 ]
Affiliation
[1] Univ Oulu, Res Unit Math Sci, Oulu 90014, Finland
Keywords
Multi-task learning; genome-wide prediction; regularization; proximal algorithm; sparsity; Bayesian optimization; RACHFORD SPLITTING METHOD; VARIABLE SELECTION; ALGORITHMS; LASSO; REGULARIZATION; SHRINKAGE; MODELS
DOI
10.1109/ACCESS.2024.3386093
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multi-task learning (MTL) is a learning paradigm that aims to leverage information shared across related tasks to improve the generalization of models. Motivated by the success of proximal optimization algorithms and single-task learning regression models, sparse proximal multi-task learning (SPMTL) for genome-wide prediction (GWP) should be explored. This study investigates proximal gradient descent splitting algorithms with five non-smooth sparsity-inducing norm regularizers, including the novel $L_{2,1/2}$ norm for GWP. Additionally, two popular methods based on Markov chain Monte Carlo (MCMC) are examined. To improve computational efficiency, a parallel Bayesian optimization strategy is employed for hyperparameter tuning. Evaluation is conducted on three real-world genomic datasets from mice, pigs, and wheat, associated with two, five, and four traits, respectively. Performance is assessed using the mean squared error (MSE) and the correlation coefficient between predicted and observed trait values in test sets. Experimental results reveal that the $L_{2,1/2}$ regularizer consistently achieves the best out-of-sample prediction across all datasets, demonstrating the effectiveness of SPMTL in leveraging shared information for improved GWP accuracy. The influence of different regularizers on sparsity and other properties of the SPMTL model is also explored.
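The proximal gradient scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a squared-error loss, a marker matrix X (individuals by markers), a trait matrix Y (individuals by traits), and it substitutes the well-known L2,1 norm, whose proximal operator is row-wise group soft-thresholding, for the paper's L2,1/2 norm. The function names prox_l21, spmtl_l21, and mse_and_corr are hypothetical and chosen only for illustration.

```python
import numpy as np

def prox_l21(W, tau):
    """Proximal operator of tau * sum_j ||W[j, :]||_2 (row-wise group soft-thresholding)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Shrink each row toward zero; rows with norm <= tau become exactly zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

def spmtl_l21(X, Y, lam, n_iter=500):
    """Proximal gradient descent on 0.5 * ||Y - X W||_F^2 + lam * ||W||_{2,1}.

    Assumption: L2,1 is used here as a stand-in regularizer, not the L2,1/2 norm.
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    # Step size 1/L, where L is the squared spectral norm of X
    # (the Lipschitz constant of the gradient of the smooth term).
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)                   # gradient of the smooth loss
        W = prox_l21(W - step * grad, step * lam)  # gradient step, then prox step
    return W

def mse_and_corr(Y_true, Y_pred):
    """Per-trait mean squared error and Pearson correlation."""
    mse = np.mean((Y_true - Y_pred) ** 2, axis=0)
    corr = np.array([np.corrcoef(Y_true[:, t], Y_pred[:, t])[0, 1]
                     for t in range(Y_true.shape[1])])
    return mse, corr
```

Each row of W holds one marker's effects on all traits, so a row-wise penalty removes a marker from every trait at once; this is one way shared information across traits can enter the model. In practice lam would be tuned on held-out data, for example with the Bayesian optimization strategy the abstract mentions.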
Pages: 51665-51675
Page count: 11