Estimating Multilevel Models on Data Streams

Authors
L. Ippel
M. C. Kaptein
J. K. Vermunt
Affiliations
[1] Maastricht University, Institute of Data Science
[2] Tilburg University
Source
Psychometrika | 2019 / Vol. 84
Keywords
Data streams; expectation maximization algorithm; multilevel models; machine (online) learning; SEMA; nested data
Abstract
Social scientists are often faced with data that have a nested structure: pupils are nested within schools, employees are nested within companies, or repeated measurements are nested within individuals. Nested data are typically analyzed using multilevel models. However, when data sets are extremely large or when new data continuously augment the data set, estimating multilevel models can be challenging: the current algorithms used to fit multilevel models repeatedly revisit all data points and end up consuming much time and computer memory. This is especially troublesome when predictions are needed in real time and observations keep streaming in. We address this problem by introducing the Streaming Expectation Maximization Approximation (SEMA) algorithm for fitting multilevel models online (or "row-by-row"). In an extensive simulation study, we demonstrate the performance of SEMA compared to traditional methods of fitting multilevel models. Next, SEMA is used to analyze an empirical data stream. The accuracy of SEMA is competitive with current state-of-the-art methods while being orders of magnitude faster.
Pages: 41-64
Page count: 23