Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times

Cited by: 32
Authors
Mora-Garcia, Raul-Tomas [1]
Cespedes-Lopez, Maria-Francisca [1]
Perez-Sanchez, V. Raul [1]
Affiliations
[1] Univ Alicante, Bldg Sci & Urbanism Dept, San Vicente Del Raspeig 03690, Spain
Keywords
machine learning; mass appraisal; real estate market; partial dependence plots; COVID-19; random forest; valuation; markets; policy
DOI
10.3390/land11112100
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Machine learning algorithms are used in many real-life applications and in research. As a consequence of digital technology, large structured and georeferenced datasets are now more widely available, facilitating the use of these algorithms to analyze and identify patterns, as well as to make predictions that help users in decision making. This research aims to identify the best machine learning algorithms to predict house prices, and to quantify the impact of the COVID-19 pandemic on house prices in a Spanish city. The methodology covers data preparation, feature engineering, model training and hyperparameter optimization, model evaluation and selection, and finally model interpretation. Ensemble learning algorithms based on boosting (Gradient Boosting Regressor, Extreme Gradient Boosting, and Light Gradient Boosting Machine) and bagging (random forest and extra-trees regressor) are used and compared with a linear regression model. A case study is developed with georeferenced microdata of the real estate market in Alicante (Spain), before and after the declaration of the COVID-19 pandemic, together with information from complementary sources such as the cadastre, socio-demographic and economic indicators, and satellite images. The results show that machine learning algorithms perform better than traditional linear models because they are better adapted to the nonlinearities of complex data such as real estate market data. The bagging-based algorithms (random forest and extra-trees regressor) show overfitting problems, whereas the boosting-based algorithms perform better and overfit less. This research contributes to the literature on the Spanish real estate market by being one of the first studies to use machine learning and microdata to explore the impact of the COVID-19 pandemic on house prices.
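The abstract's overfitting comparison can be illustrated with a minimal sketch. It assumes overfitting is judged by the gap between training and test R²; the record does not state which metric the authors used, so this is one common choice, not their method. The function names and the toy prices below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def overfit_gap(train_true, train_pred, test_true, test_pred):
    """Train R^2 minus test R^2; a large positive gap suggests overfitting."""
    return r2_score(train_true, train_pred) - r2_score(test_true, test_pred)


# Toy values (hypothetical, in thousands of euros): the model reproduces
# the training prices exactly but is less accurate on held-out prices.
train_true = [100, 150, 200, 250]
train_pred = [100, 150, 200, 250]
test_true = [120, 180, 220, 260]
test_pred = [100, 200, 200, 250]

gap = overfit_gap(train_true, train_pred, test_true, test_pred)
print(f"train R2 = {r2_score(train_true, train_pred):.3f}")
print(f"test  R2 = {r2_score(test_true, test_pred):.3f}")
print(f"gap      = {gap:.3f}")
```

Computing the same gap for each fitted model (linear, bagging, boosting) on a shared train/test split gives a simple way to rank them by both accuracy and generalization.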
Pages: 32