Linear regression of interval-valued data based on complete information in hypercubes

Cited by: 0
Authors
Huiwen Wang
Rong Guan
Junjie Wu
Affiliation
[1] Beihang University, Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operations, School of Economics and Management
Source
Journal of Systems Science and Systems Engineering | 2012 / Vol. 21
Keywords
Interval-valued data; linear regression; complete information method (CIM); hypercubes
DOI
Not available
Abstract
Recent years have witnessed increasing interest in interval-valued data analysis. As one of its core topics, linear regression attracts particular attention. It models the relationship between one or more explanatory variables and a response variable by fitting a linear equation to the interval-valued observations. Despite well-known methods such as CM, CRM and CCRM proposed in the literature, further study is still needed to build a regression model that can capture the complete information in interval-valued observations. To this end, this paper proposes the novel Complete Information Method (CIM) for linear regression modeling. By dividing hypercubes into informative grid data, CIM defines the inner product of interval-valued variables and transforms regression modeling into the computation of a set of inner products. Experiments on both synthetic and real-world data sets demonstrate the merits of CIM in modeling interval-valued data and in avoiding the mathematical incoherence introduced by CM and CRM.
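
The grid idea in the abstract can be illustrated with a small sketch. This is not the authors' CIM implementation: it assumes, purely for illustration, that each observation's hypercube is filled with a uniform k-point grid per dimension and that ordinary least squares is then solved through the normal equations, which involve only inner products. The function names `grid_points` and `fit_grid_ols` and the parameter `k` are hypothetical.

```python
import itertools
import numpy as np

def grid_points(lower, upper, k=5):
    """Return the k**d grid points that fill the hypercube [lower, upper]."""
    axes = [np.linspace(lo, hi, k) for lo, hi in zip(lower, upper)]
    return np.array(list(itertools.product(*axes)))

def fit_grid_ols(lowers, uppers, k=5):
    """Pool the grid points of every hypercube and fit OLS on them.

    lowers, uppers: arrays of shape (n, d) holding interval bounds;
    the last column is treated as the response (an assumption of this sketch).
    Returns [intercept, b_1, ..., b_{d-1}].
    """
    pts = np.vstack([grid_points(lo, hi, k) for lo, hi in zip(lowers, uppers)])
    X = np.column_stack([np.ones(len(pts)), pts[:, :-1]])
    y = pts[:, -1]
    # The normal equations need only the inner products X'X and X'y.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Usage: three interval observations (x, y) given by lower and upper bounds.
lowers = np.array([[0.0, 0.5], [1.0, 2.5], [2.0, 4.5]])
uppers = np.array([[0.4, 1.5], [1.4, 3.5], [2.4, 5.5]])
beta = fit_grid_ols(lowers, uppers, k=4)
```

When every interval collapses to a single point, the sketch reduces to ordinary least squares on those points, which is a convenient sanity check.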
Pages: 422-442
Page count: 20