Optimal subsampling for double generalized linear models with heterogeneous massive data

Citations: 0
Authors
Xiong, Zhengyu [1, 2]
Jin, Haoyu [1, 2]
Wu, Liucang [1, 2]
Yang, Lanjun [1, 2]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Sci, Kunming, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Ctr Appl Stat, Kunming, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Heterogeneous massive data; double generalized linear models; optimality criterion; optimal subsampling; asymptotic properties; quasi-likelihood
DOI
10.1080/03610926.2025.2467199
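Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract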
With the development of information technology, massive data under heterogeneous characteristics are generated in the economic, financial, and other fields. Traditional statistical models and existing statistical methods are often inadequate for handling dispersion modeling problems with heterogeneous massive data. In this article, the optimal subsampling of double generalized linear models is studied in heterogeneous massive data environments. Under certain conditions, the optimal subsampling probabilities of the double generalized linear models with heterogeneous data are derived based on the A-optimality criterion and L-optimality criterion, respectively. Furthermore, a two-step algorithm based on uniform sampling is developed, and the asymptotic properties of the subsample estimator from this algorithm are discussed. The results of numerical simulations and a real example show that the algorithm can improve estimation accuracy and decrease computational costs to some extent.
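The two-step idea described in the abstract (a uniform pilot subsample, then a second subsample drawn with data-dependent probabilities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not give the A-optimal or L-optimal probabilities for double generalized linear models, so the sketch swaps in a plain linear mean model with heteroscedastic noise and uses probabilities proportional to |residual| times the covariate norm, a form familiar from L-optimality-style subsampling in simpler regression settings. The function names, the mixing parameter `alpha`, and the simulated data are all hypothetical.

```python
import numpy as np

def simulate(n, beta, rng):
    # Hypothetical heterogeneous data: noise scale depends on a covariate.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, len(beta) - 1))])
    sd = np.exp(0.5 * X[:, 1])
    y = X @ beta + sd * rng.normal(size=n)
    return X, y

def weighted_ls(X, y, w):
    # Weighted least squares via row scaling by sqrt(weights).
    s = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return beta

def two_step_subsample(X, y, r0, r, rng, alpha=0.1):
    n = X.shape[0]
    # Step 1: uniform pilot subsample and pilot estimate.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_ls(X[idx0], y[idx0], np.ones(r0))
    # Step 2: approximate sampling probabilities from the pilot fit.
    # Assumption: pi_i proportional to |residual_i| * ||x_i||, mixed with
    # the uniform distribution so that inverse weights stay bounded.
    score = np.abs(y - X @ beta0) * np.linalg.norm(X, axis=1)
    pi = (1 - alpha) * score / score.sum() + alpha / n
    idx1 = rng.choice(n, size=r, replace=True, p=pi)
    # Combine both subsamples with inverse-probability weights
    # (uniform draws have probability 1/n, hence weight n).
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    return weighted_ls(X[idx], y[idx], w)

rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
X, y = simulate(50_000, beta_true, rng)
beta_hat = two_step_subsample(X, y, r0=500, r=2_000, rng=rng)
print(beta_hat)
```

Only `r0 + r` rows enter the final fit, which is where the computational saving comes from; the inverse-probability weights are what keep the subsample estimator targeting the full-data estimate. A version faithful to the paper would also model the dispersion and use the probabilities derived there.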
Pages: 35