An enriched approach to combining high-dimensional genomic and low-dimensional phenotypic data

Times cited: 0
Authors
Cabrera, Javier [1]
Emir, Birol [2]
Cheng, Ge [1]
Duan, Yajie [1]
Alemayehu, Demissie [2]
Cherkas, Yauheniya [3]
Affiliations
[1] Rutgers State Univ, Dept Stat, Piscataway, NJ 08854 USA
[2] Pfizer Inc, Pfizer Res & Dev, Stat Res & Data Sci Ctr, New York, NY USA
[3] Janssen R&D, Stat & Decis Sci, Lower Gwynedd Township, PA USA
Keywords
Model selection; penalized regression; dimension reduction; precision medicine; SELECTION;
DOI
10.1080/10543406.2024.2330203
Chinese Library Classification number
R9 [Pharmacy];
Discipline classification code
1007;
Abstract
We describe an approach for combining and analyzing high-dimensional genomic and low-dimensional phenotypic data. The approach applies a scheme of weights to the variables rather than to the observations and thereby permits incorporation of the information provided by the low-dimensional data source. It can also be incorporated into commonly used downstream techniques, such as random forest or penalized regression. Finally, simulated lupus studies involving genetic and clinical data illustrate the overall idea and show that the proposed enriched penalized method can select significant genetic variables while keeping several important clinical variables in the final model.
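The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one way variable-level weights can enter a penalized regression, not the authors' method. It assumes a lasso objective with per-variable penalty weights, where a weight of zero leaves a variable unpenalized (used here for the clinical variables) and a weight of one applies the full penalty (used here for the genomic variables). The function names, the simulated data, and the choice of weights are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, penalty_weights, lam, n_iter=200):
    """Coordinate descent for
    (1 / 2n) * ||y - X b||^2 + lam * sum_j penalty_weights[j] * |b_j|.
    A weight of 0 means the variable is never shrunk (assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual equals y while all coefficients are zero
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_scale[j] * beta[j]
            new_beta = soft_threshold(rho, lam * penalty_weights[j]) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical simulated data: 2 clinical variables followed by 20 genomic ones.
rng = random.Random(0)
n, n_clinical, n_genes = 100, 2, 20
X = [[rng.gauss(0, 1) for _ in range(n_clinical + n_genes)] for _ in range(n)]
# Only the first clinical variable and the first gene drive the outcome.
y = [1.5 * row[0] + 2.0 * row[n_clinical] + rng.gauss(0, 0.5) for row in X]

# Clinical variables are unpenalized; genomic variables get the full penalty.
weights = [0.0] * n_clinical + [1.0] * n_genes
beta = weighted_lasso(X, y, weights, lam=0.4)

selected_genes = [
    j - n_clinical
    for j in range(n_clinical, n_clinical + n_genes)
    if beta[j] != 0.0
]
print("clinical coefficients:", [round(b, 3) for b in beta[:n_clinical]])
print("selected gene indices:", selected_genes)
```

In this sketch the zero weights guarantee that the clinical variables are never shrunk out of the model, while the genomic variables must clear the penalty to be selected. Weights between zero and one, for instance derived from a per-gene test statistic, would fit the same interface.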
Pages: 1026-1032
Page count: 7