Statistical Modelling for Big and Little Data

Cited by: 0
Author
Henderson, Robin [1]
Affiliation
[1] Newcastle Univ, Newcastle Upon Tyne, Tyne & Wear, England
Source
DEVELOPMENTS IN STATISTICAL MODELLING, IWSM 2024 | 2024
Keywords
Data science; Extrapolation; Inference; Smoothing; Two cultures
DOI
10.1007/978-3-031-65723-8_38
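Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract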
While the difference between "Data Science" and "Statistics" disciplines is, at best, blurred, many people associate machine learning methods and big data with the former, and modelling and inference for small samples (little data) with the latter. We present a big data application where no sophisticated method at all is needed, a small data application where a partial modelling approach seems useful, and a big-and-little data application where we can borrow strength from limited information in a large sample, to improve estimation based on more detailed data in a small sample.
Pages: 246-254
Number of pages: 9