A Novel Hybrid House Price Prediction Model

Cited by: 7
Authors
Akyuz, Sureyya Ozogur [1]
Erdogan, Birsen Eygi [2]
Yildiz, Ozlem [3]
Atas, Pinar Karadayi [4]
Affiliations
[1] Bahcesehir Univ, Fac Engn & Nat Sci, Dept Math, Istanbul, Turkey
[2] Marmara Univ, Fac Sci, Dept Stat, Istanbul, Turkey
[3] Bahcesehir Univ, Inst Sci, Big Data Analyt Program, Istanbul, Turkey
[4] Arel Univ, Fac Engn & Architecture, Dept Comp Engn, Istanbul, Turkey
Keywords
Housing pricing; Support vector regression; K-means clustering; K-NN classification; Determinants; Regression
DOI
10.1007/s10614-022-10298-8
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
The real estate sector is changing rapidly as housing demand increases, and new luxury housing projects appear every day. The reliability of housing market investments depends largely on accurate pricing. The aim of this study is to introduce a dynamic pricing procedure that estimates house prices using the most important characteristics of a house. For this purpose, a hybrid algorithm combining linear regression, clustering analysis, nearest neighbor classification and Support Vector Regression (SVR) is proposed. The hybrid algorithm uses the output of one method as the input of another to deal with the heteroscedastic nature of housing data. In other words, it creates housing clusters from the available data set, classifies houses whose cluster is unknown, and makes price predictions with a separate prediction model for each class. Housing data collected through manual web scraping for the Kadikoy district of Istanbul were used for training and validation of the proposed algorithm. In addition, the algorithm was validated on the Kaggle house dataset, which covers a wide range of features. The results of the hybrid algorithm were compared with those of multiple linear regression, Lasso, ridge regression, SVR, AdaBoost, decision tree, random forest and XGBoost regression. Experimental results show that the proposed hybrid model is superior in terms of Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and adjusted R-squared on both the Kadikoy and Kaggle housing datasets.
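The cluster, classify, then predict idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: the toy data, the single feature (floor area), clustering the training houses on price, the initial cluster centers, k = 3 for the nearest neighbor vote, and the use of an ordinary least squares line per cluster as a stand-in for SVR. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean


def kmeans_1d(values, centers, max_iter=50):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Move each center to the mean of its members.
        updated = []
        for c, old in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else old)
        if updated == centers:
            break
        centers = updated
    return labels


def knn_label(x, train_x, train_labels, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(train_x, train_labels), key=lambda p: abs(p[0] - x))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (stand-in for SVR)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def fit_hybrid(areas, prices, initial_centers):
    """Cluster the training houses (assumption: on price), then fit one model per cluster."""
    labels = kmeans_1d(prices, initial_centers)
    models = {}
    for c in set(labels):
        xs = [a for a, lab in zip(areas, labels) if lab == c]
        ys = [p for p, lab in zip(prices, labels) if lab == c]
        models[c] = fit_line(xs, ys)
    return labels, models


def predict_hybrid(area, areas, labels, models, k=3):
    """Assign a new house to a cluster by k-NN on area, then use that cluster's model."""
    cluster = knn_label(area, areas, labels, k)
    slope, intercept = models[cluster]
    return cluster, intercept + slope * area


# Toy data (invented for illustration): floor area and price for two market segments.
areas = [50.0, 60.0, 70.0, 80.0, 150.0, 160.0, 170.0, 180.0]
prices = [120.0, 140.0, 160.0, 180.0, 850.0, 900.0, 950.0, 1000.0]

labels, models = fit_hybrid(areas, prices, [min(prices), max(prices)])
print(predict_hybrid(65.0, areas, labels, models))
print(predict_hybrid(165.0, areas, labels, models))
```

Fitting a separate model per cluster lets each segment have its own price level and slope, which is one simple way to cope with error variance that differs across segments. The k-NN step is needed because a new house has no known price, so its cluster has to be inferred from its features alone.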
Pages: 1215-1232
Number of pages: 18