Identifying key variables and interactions in statistical models of building energy consumption using regularization

Cited by: 74
Author
Hsu, David
Affiliation
[1] 210 S 34th Street, Philadelphia, 19104, PA
Keywords
Energy consumption; Buildings; Variable selection; Statistical models; MULTIVARIATE-ANALYSIS; REGRESSION SHRINKAGE; SIMULATION PROGRAMS; RESIDENTIAL SECTOR; STOCK MODELS; SELECTION; LASSO; UNCERTAINTY; PERFORMANCE; CALIBRATION;
DOI
10.1016/j.energy.2015.02.008
Chinese Library Classification (CLC)
O414.1 [Thermodynamics]
Discipline classification code
Abstract
Statistical models can only be as good as the data put into them. Data about energy consumption continues to grow, particularly its non-technical aspects, but these variables are often interpreted differently among disciplines, datasets, and contexts. Selecting key variables and interactions is therefore an important step in achieving more accurate predictions, better interpretation, and identification of key subgroups for further analysis. This paper therefore makes two main contributions to the modeling and analysis of energy consumption of buildings. First, it introduces regularization, also known as penalized regression, for principled selection of variables and interactions. Second, this approach is demonstrated by application to a comprehensive dataset of energy consumption for commercial office and multifamily buildings in New York City. Using cross-validation, this paper finds that a newly developed method, hierarchical group-lasso regularization, significantly outperforms ridge, lasso, elastic net and ordinary least squares approaches in terms of prediction accuracy; develops a parsimonious model for large New York City buildings; and identifies several interactions between technical and non-technical parameters for further analysis, policy development and targeting. This method is generalizable to other local contexts, and is likely to be useful for the modeling of other sectors of energy consumption as well. © 2015 The Author. Published by Elsevier Ltd.
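As a rough illustration of how penalized regression can select variables and interactions, the following minimal sketch fits a plain lasso by cyclic coordinate descent on a tiny made-up dataset. It is not the paper's method: the hierarchical group-lasso described in the abstract additionally ties each interaction to its main effects, which this sketch does not do. The function names, the toy data, and the penalty value `lam=0.5` are all assumptions chosen for illustration.

```python
# Plain lasso via cyclic coordinate descent (illustrative only;
# this is NOT the hierarchical group-lasso used in the paper).

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything in [-gamma, gamma] becomes 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # inner product of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Hypothetical toy data: a 2x2 factorial design in two coded variables.
levels = [(1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0)]
X = [[a, b, a * b] for a, b in levels]  # columns: x1, x2, interaction x1*x2
y = [2.0 * a + 0.1 * b + 1.0 * a * b for a, b in levels]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)
```

Because the three columns of this toy design are orthogonal, each lasso coefficient is simply the soft-thresholded least-squares coefficient: the weak `x2` effect (0.1) is set exactly to zero, while the `x1` effect and the interaction shrink from 2.0 and 1.0 to about 1.5 and 0.5. Note that the plain lasso keeps the interaction even though `x2` was dropped; a hierarchical method would constrain that choice.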
Pages: 144-155
Number of pages: 12