High-Dimensional Overdispersed Generalized Factor Model With Application to Single-Cell Sequencing Data Analysis
Cited by: 0
Authors:
Nie, Jinyu [1,2]
Qin, Zhilong [3]
Liu, Wei [4]
Affiliations:
[1] Southwestern Univ Finance & Econ, Ctr Stat Res, Chengdu, Peoples R China
[2] Southwestern Univ Finance & Econ, Sch Stat, Chengdu, Peoples R China
[3] Southwestern Univ Finance & Econ, Inst Western China Econ Res, Chengdu, Peoples R China
[4] Sichuan Univ, Sch Math, Chengdu, Peoples R China
Existing high-dimensional linear factor models do not account for variables of different types, while high-dimensional nonlinear factor models often overlook the overdispersion present in mixed-type data. Yet overdispersion is prevalent in practice, particularly in fields such as biomedicine and genomics. To address this need, we propose an overdispersed generalized factor model (OverGFM) for high-dimensional nonlinear factor analysis of overdispersed mixed-type data. Our approach adds an error term to capture the overdispersion that the factors alone cannot explain. This introduces significant computational challenges, because the nonlinear model then involves two high-dimensional latent random matrices. To overcome these challenges, we propose a novel variational EM algorithm that integrates Laplace and Taylor approximations. The algorithm provides explicit iterative solutions for the complex variational parameters and is proven to possess excellent convergence properties. We also develop a criterion based on the singular value ratio to determine the optimal number of factors, and numerical results demonstrate its effectiveness. Through comprehensive simulation studies, we show that OverGFM outperforms state-of-the-art methods in both estimation accuracy and computational efficiency. We further demonstrate the practical merit of the method by applying it to two genomics datasets. To facilitate its use, we have integrated the implementation of OverGFM into the R package GFM.
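The abstract mentions a singular value ratio criterion for choosing the number of factors but does not give its exact form. As a rough illustration only, the sketch below applies a generic version of the idea to a plain data matrix: it picks the index at which the ratio of consecutive singular values is largest. The function name `select_num_factors`, the cap `q_max`, the column centering, and the simulated Gaussian data are all assumptions made for this example; this is not the OverGFM procedure or the API of the GFM package.

```python
import numpy as np


def select_num_factors(X, q_max=10):
    """Hypothetical helper: choose q maximizing s_q / s_{q+1}.

    X is an (n, p) data matrix; s_1 >= s_2 >= ... are the singular
    values of the column-centered matrix. This is a generic singular
    value ratio rule, assumed here for illustration.
    """
    Xc = X - X.mean(axis=0, keepdims=True)       # remove column means
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    q_max = min(q_max, len(s) - 1)               # need s_{q+1} for each ratio
    ratios = s[:q_max] / s[1:q_max + 1]          # ratios[k-1] = s_k / s_{k+1}
    return int(np.argmax(ratios)) + 1, ratios


# Simulated example (assumed setup): 3 latent factors plus small noise.
rng = np.random.default_rng(0)
n, p, q_true = 200, 50, 3
H = rng.normal(size=(n, q_true))                 # latent factors
B = rng.normal(size=(p, q_true))                 # loadings
X = H @ B.T + 0.1 * rng.normal(size=(n, p))      # low-rank signal + noise

q_hat, ratios = select_num_factors(X)
print("estimated number of factors:", q_hat)
```

In this toy setting the gap between the third and fourth singular values dominates every other ratio, so the rule recovers the simulated number of factors. With overdispersed mixed-type data the criterion would presumably be applied to quantities estimated by the model rather than to the raw matrix, which is beyond what this sketch covers.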