Time series modeling of histogram-valued data: The daily histogram time series of S&P500 intradaily returns

Cited by: 27
Authors
Gonzalez-Rivera, Gloria [1]
Arroyo, Javier [2]
Affiliations
[1] Univ Calif Riverside, Dept Econ, Riverside, CA 92521 USA
[2] Univ Complutense Madrid, Dept Comp Sci & Artificial Intelligence, E-28040 Madrid, Spain
Keywords
Symbolic data; Interval-valued data; Histogram-valued data; Autocorrelation; Intradaily returns
DOI
10.1016/j.ijforecast.2011.02.007
Chinese Library Classification code
F [Economics]
Subject classification code
02
Abstract
Histogram time series (HTS) and interval time series (ITS) are examples of symbolic data sets. Though there have been methodological developments in a cross-sectional environment, they have been scarce in a time series setting. Arroyo, Gonzalez-Rivera, and Mate (2011) analyze various forecasting methods for HTS and ITS, adapting smoothing filters and nonparametric algorithms such as the k-NN. Though these methods are very flexible, they may not be the true underlying data generating process (DGP). We present the first step in the search for a DGP by focusing on the autocorrelation functions (ACFs) of HTS and ITS. We analyze the ACF of the daily histogram of 5-minute intradaily returns to the S&P500 index in 2007 and 2008. There are clusters of high/low activity that generate a strong, positive, and persistent autocorrelation, pointing towards some autoregressive process for HTS. Though smoothing and k-NN may not be the true DGPs, we find that they are very good approximations because they are able to capture almost all of the original autocorrelation. However, there seems to be some structure left in the data that will require new modelling techniques. As a byproduct, we also analyze the [90,100%] quantile interval. By using all of the information contained in the histogram, we find that there are advantages in the estimation and prediction of a specific interval. (C) 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
Pages: 20-33
Number of pages: 14
Related papers
12 in total
[1]  
[Anonymous], 2006, Symbolic Data Analysis: Conceptual Statistics and Data Mining
[2]  
Arroyo J., 2009, DESCRIPTIVE DISTANCE
[3]  
Arroyo J., 2011, HDB EMPIRICAL EC FIN, P247
[4]   Forecasting histogram time series with k-nearest neighbours methods [J].
Arroyo, Javier ;
Mate, Carlos .
INTERNATIONAL JOURNAL OF FORECASTING, 2009, 25 (01) :192-207
[5]   RANGEFINDER BOX PLOTS - A NOTE [J].
BECKETTI, S ;
GOULD, W .
AMERICAN STATISTICIAN, 1987, 41 (02) :149-149
[6]  
Bertrand P, 2000, ST CLASS DAT ANAL, P106
[7]  
Diday E., 2008, SYMBOLIC DATA SODAS
[8]  
Gonzalez Abril L., 2004, INTELL ARTIF, V8, P111, DOI [10.4114/ia.v8i23.798, DOI 10.4114/IA.V8I23.798]
[9]   Interval Time Series Analysis with an Application to the Sterling-Dollar Exchange Rate [J].
Han, Ai ;
Hong, Yongmiao ;
Lai, K. K. ;
Wang, Shouyang .
JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2008, 21 (04) :558-573
[10]   A new Wasserstein based distance for the hierarchical clustering of histogram symbolic data [J].
Irpino, Antonio ;
Verde, Rosanna .
DATA SCIENCE AND CLASSIFICATION, 2006, :185-+