A Hierarchical Model for Aggregated Functional Data

Cited by: 8
Authors
Dias, Ronaldo [1]
Garcia, Nancy L. [1]
Schmidt, Alexandra M. [2]
Affiliations
[1] Univ Estadual Campinas, Dept Stat IMECC, BR-10083859 Campinas, SP, Brazil
[2] Univ Fed Rio de Janeiro, Dept Stat IM, BR-21941909 Rio De Janeiro, RJ, Brazil
Keywords
Bayes' theorem; B-splines; Covariance function; Gaussian process; Curves
DOI
10.1080/00401706.2013.765316
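Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract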
In many areas of science, one aims to estimate latent subpopulation mean curves based only on observations of aggregated population curves. By aggregated curves we mean linear combinations of functional data that cannot be observed individually. We assume that several aggregated curves with linearly independent coefficients are available, and that each aggregated curve is an independent partial realization of a Gaussian process whose mean is modeled through a weighted linear combination of the disaggregated curves. The mean of the Gaussian process is modeled using B-spline basis expansion methods. We propose a semiparametric, valid covariance function that is modeled as the product of a nonparametric variance function and a correlation function. The variance function is described as the square of a function that is expanded using B-spline basis functions. This results in a nonstationary covariance function that includes constant variance models as special cases. Inference is performed following the Bayesian paradigm, which allows expert opinion, when available, to be accounted for. Moreover, it naturally provides the uncertainty associated with the parameter estimates and fitted values. We analyze artificial datasets and discuss how to choose among the different covariance models. We focus on two real examples: a calibration problem for NIR spectroscopy data and an analysis of the distribution of energy among different types of consumers. In the latter example, our proposed covariance function captures interesting features of the data. Further analysis of different artificial datasets, as well as computer code and data, are available as supplementary material online.
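The model structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a clamped cubic B-spline basis on [0, 1] with equally spaced knots, an exponential correlation function with decay parameter `phi`, and made-up weights `W` and coefficients `theta` and `beta`; the abstract does not specify any of these choices. The sketch builds the covariance as sd(s) * sd(t) * corr(s, t), where sd is a B-spline expansion and the variance function is its square, and then simulates aggregated curves whose means are weighted sums of two latent subpopulation curves.

```python
import numpy as np

def bspline_basis(t, n_basis, degree=3):
    """Clamped B-spline basis on [0, 1] via the Cox-de Boor recursion.

    Returns an array of shape (len(t), n_basis). Knot placement
    (equally spaced) is an assumption made for illustration.
    """
    t = np.asarray(t, dtype=float)
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)
    knots = np.concatenate([np.zeros(degree), interior, np.ones(degree)])
    n_int = len(knots) - 1
    # Degree-0 basis: indicator of each half-open knot interval.
    B = np.zeros((len(t), n_int))
    for j in range(n_int):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Assign the right endpoint to the last non-empty interval.
    B[t == knots[-1], n_basis - 1] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((len(t), n_int - d))
        for j in range(n_int - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (t - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - t) / right * B[:, j + 1]
        B = B_new
    return B

def semiparametric_cov(t, beta, phi, degree=3):
    """Covariance sd(s) * sd(t) * corr(s, t); the variance is sd(t)**2.

    The exponential correlation exp(-phi * |s - t|) is an assumed choice.
    """
    t = np.asarray(t, dtype=float)
    sd = bspline_basis(t, len(beta), degree) @ np.asarray(beta, dtype=float)
    corr = np.exp(-phi * np.abs(t[:, None] - t[None, :]))
    return np.outer(sd, sd) * corr

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 40)

# Two latent subpopulation mean curves, each a B-spline expansion.
theta = rng.normal(size=(6, 2))            # hypothetical coefficients
alpha = bspline_basis(t, 6) @ theta        # shape (40, 2)

# Hypothetical aggregation weights, one row per aggregated curve.
W = np.array([[10.0, 5.0], [3.0, 12.0], [8.0, 8.0]])
mean = alpha @ W.T                         # shape (40, 3)

beta = np.array([0.5, 1.0, 2.0, 1.5, 0.8, 0.4])  # hypothetical coefficients
Sigma = semiparametric_cov(t, beta, phi=5.0)

# Each aggregated curve is an independent Gaussian process realization.
noise = rng.multivariate_normal(np.zeros(len(t)), Sigma, size=W.shape[0]).T
Y = mean + noise                           # shape (40, 3)
```

Because the covariance has the form D R D, with D a diagonal matrix of sd values and R a valid correlation matrix, it is positive semidefinite by construction. Setting all entries of `beta` to the same constant recovers a constant-variance model, since the B-spline basis functions sum to one.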
Pages: 321-334
Number of pages: 14