Bayesian penalized cumulative logit model for high-dimensional data with an ordinal response

Cited by: 5
Authors
Zhang, Yiran [1]
Archer, Kellie J. [1]
Affiliation
[1] Ohio State Univ, Coll Publ Hlth, Columbus, OH 43210 USA
Keywords
Bayesian; gene expression; genomics; LASSO; proportional odds; VARIABLE SELECTION; HEPATOCELLULAR-CARCINOMA; CALDESMON; ALBUMIN; REGRESSION; ASSOCIATION; EXPRESSION; CIRRHOSIS; SHRINKAGE
DOI
10.1002/sim.8851
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Many previous studies have identified associations between gene expression, measured using high-throughput genomic platforms, and quantitative or dichotomous traits. However, we note that health outcome and disease status measurements frequently appear on an ordinal scale, that is, the outcome is categorical but has inherent ordering. Identification of important genes may be useful for developing novel diagnostic and prognostic tools to predict or classify stage of disease. Gene expression data are usually high-dimensional, meaning that the number of genes is much larger than the sample size or number of patients. Herein we describe some existing frequentist methods for modeling an ordinal response in a high-dimensional predictor space. Following Tibshirani (1996), who described the LASSO estimate as the Bayesian posterior mode when the regression coefficients have independent Laplace priors, we propose a new approach for high-dimensional data with an ordinal response that is rooted in the Bayesian paradigm. We show that our proposed Bayesian approach outperforms existing frequentist methods through simulation studies. We then compare the performance of frequentist and Bayesian approaches using a study evaluating progression to hepatocellular carcinoma in hepatitis C infected patients.
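The link the abstract draws between the LASSO and a Bayesian posterior mode can be illustrated with a short sketch. It is not the authors' implementation. It assumes the parameterization P(Y <= j | x) = logistic(alpha_j - x'beta), flat priors on the thresholds alpha, and independent Laplace priors on the coefficients beta, so that the negative log posterior equals the negative log-likelihood plus an L1 penalty, up to an additive constant. The function names and the penalty weight `lam` are hypothetical.

```python
import numpy as np

def category_probs(X, alpha, beta):
    """Class probabilities under a cumulative logit (proportional odds) model.

    Assumed parameterization: P(Y <= j | x) = 1 / (1 + exp(-(alpha_j - x @ beta))),
    with strictly increasing thresholds alpha of length K - 1.
    """
    eta = alpha[None, :] - (X @ beta)[:, None]      # shape (n, K-1)
    cum = 1.0 / (1.0 + np.exp(-eta))                # P(Y <= j) for j = 1..K-1
    n = X.shape[0]
    # Pad with P(Y <= 0) = 0 and P(Y <= K) = 1, then difference to get P(Y = j).
    cum = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])
    return np.diff(cum, axis=1)                     # shape (n, K)

def neg_log_posterior(X, y, alpha, beta, lam):
    """Negative log posterior up to a constant (assumption: Laplace priors on beta).

    y holds integer class labels 0..K-1. The Laplace prior contributes
    lam * sum(|beta_k|), which is the LASSO penalty.
    """
    probs = category_probs(X, alpha, beta)
    nll = -np.sum(np.log(probs[np.arange(len(y)), y]))
    return nll + lam * np.sum(np.abs(beta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))                     # toy data: 5 samples, 3 predictors
    y = np.array([0, 1, 2, 1, 0])                   # three ordered classes
    alpha = np.array([-1.0, 0.5])
    beta = np.array([0.3, -0.2, 0.0])
    print(neg_log_posterior(X, y, alpha, beta, lam=2.0))
```

Minimizing this objective over alpha and beta gives the posterior mode under the stated assumptions, which coincides with an L1-penalized cumulative logit fit. A fully Bayesian analysis would sample from the posterior rather than only locate its mode.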
Pages: 1453-1481
Page count: 29