Analyzing driving factors of land values in urban scale based on big data and non-linear machine learning techniques

Cited by: 85
Authors
Ma, Jun [1]
Cheng, Jack C. P. [1]
Jiang, Feifeng [2]
Chen, Weiwei [1]
Zhang, Jingcheng [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Civil & Environm Engn, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Architecture & Civil Engn, Hong Kong, Peoples R China
[3] Hong Kong Univ Sci & Technol, Sch Engn, Hong Kong, Peoples R China
Keywords
Big data; Land values per square foot; Machine learning; Place of interest; Recursive feature elimination; ENERGY USE INTENSITY; RANDOM FOREST; PRICE; CREDITS; SELECTION; IMPACT; CITY;
DOI
10.1016/j.landusepol.2020.104537
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Land value plays a vital role in the real estate market and is a critical reference for urban planners who reallocate land resources and introduce policies. Studying the factors that influence land value helps explain its spatial-temporal variation and supports the design of effective control policies. A number of scholars have therefore studied the spatial and temporal relationships between land value and its possible influential factors from both macro and micro perspectives. However, most existing studies rely on models that assume linearity and suffer from multicollinearity. A limited feature set and the lack of a feature selection procedure are two other common limitations. To address these gaps, this paper adopts non-linear machine learning (ML) methods to investigate the factors influencing land value per square foot, based on "big data" for New York City. More than one thousand potential factors are considered, covering land attributes, points of interest, demographics, housing, and economic, educational, and social characteristics. These factors are then filtered using a feature selection method named Recursive Feature Elimination (RFE). Six ML algorithms are evaluated and compared: Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Multiple Linear Regression (MLR), Linear Support Vector Regression (SVR), Multilayer Perceptron (MLP) Regression, and K-Nearest Neighbor (KNN) Regression. The best-performing model, with an R-squared value of 0.933, is then used to calculate feature importance. Several important features are identified, including the number of newsstands and the percentage of vacant housing.
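As an illustration of the Recursive Feature Elimination step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a plain least-squares model as a stand-in estimator (the paper evaluates non-linear models), synthetic data in place of the New York City dataset, and hypothetical helper names (fit_linear, rfe). Features are ranked by absolute coefficient, which is only meaningful here because all synthetic features share the same scale.

```python
import random


def fit_linear(X, y):
    """Ordinary least squares without intercept (stand-in estimator).

    Solves the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination.
    """
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, drop the weakest feature, repeat."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        weights = fit_linear(X_sub, y)
        # The feature with the smallest absolute weight is treated as weakest.
        weakest = min(range(len(remaining)), key=lambda k: abs(weights[k]))
        del remaining[weakest]
    return remaining


# Synthetic data (an assumption): only columns 0 and 1 drive the target.
rng = random.Random(0)
n_samples, n_features = 200, 6
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.1) for row in X]

selected = rfe(X, y, n_keep=2)
print("Retained feature indices:", selected)
```

In this toy setting the loop should retain the two informative columns. With a tree-based estimator, the ranking would typically use impurity-based or permutation importance instead of coefficients; scikit-learn provides a library version of the procedure as sklearn.feature_selection.RFE.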
Pages: 13
Related papers
50 in total
  • [1] A Comparison of Linear and Non-Linear Machine Learning Techniques (PCA and SOM) for Characterizing Urban Nutrient Runoff
    Gorgoglione, Angela
    Castro, Alberto
    Iacobellis, Vito
    Gioia, Andrea
    SUSTAINABILITY, 2021, 13 (04) : 1 - 19
  • [2] Auto Associative Extreme Learning Machine based Non-Linear Principal Component Regression for Big Data Applications
    Tejasviram, V.
    Solanki, H.
    Ravi, V.
    Kamaruddin, Sk.
    2015 TENTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION MANAGEMENT (ICDIM), 2015, : 27 - 32
  • [3] Editorial: Advancements in land cover classification and machine learning techniques for Urban areas using remote sensing big data
    Al-Najjar, Husam
    Kalantar, Bahareh
    Abdul Halin, Alfian
    FRONTIERS IN ENVIRONMENTAL SCIENCE, 2025, 13
  • [4] Identification of high impact factors of air quality on a national scale using big data and machine learning techniques
    Ma, Jun
    Ding, Yuexiong
    Cheng, Jack C. P.
    Jiang, Feifeng
    Tan, Yi
    Gan, Vincent J. L.
    Wan, Zhiwei
    JOURNAL OF CLEANER PRODUCTION, 2020, 244 (244)
  • [5] SVD enabled data augmentation for machine learning based surrogate modeling of non-linear structures
    Parida, Siddharth S.
    Bose, Supratik
    Butcher, Megan
    Apostolakis, Georgios
    Shekhar, Prashant
    ENGINEERING STRUCTURES, 2023, 280
  • [6] Understanding Factors Affecting Tourist Distribution in Urban National Parks Based on Big Data and Machine Learning
    Ye, Yang
    Qiu, Hongfei
    Jia, Yiru
    JOURNAL OF URBAN PLANNING AND DEVELOPMENT, 2024, 150 (03)
  • [7] Non-Linear Distance Based Large Scale Data Classifications
    Al-Behadili, Husam
    Grumpe, Arne
    Dopp, Christian
    Woehler, Christian
    PROCEEDINGS OF 2015 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (IEEE PIC), 2015, : 613 - 617
  • [8] Analyzing and visualizing morphological features using machine learning techniques and non-big data: A case study of macaque mandibles
    Morita, Takashi
    Ito, Tsuyoshi
    Koda, Hiroki
    Wakamori, Hikaru
    Nishimura, Takeshi
    AMERICAN JOURNAL OF BIOLOGICAL ANTHROPOLOGY, 2022, 178 (01) : 44 - 53
  • [9] Data-Driven Fault Diagnosis Techniques: Non-Linear Directional Residual vs. Machine-Learning-Based Methods
    Cartocci, Nicholas
    Napolitano, Marcello R.
    Crocetti, Francesco
    Costante, Gabriele
    Valigi, Paolo
    Fravolini, Mario L.
    SENSORS, 2022, 22 (07)
  • [10] Modeling urban scale human mobility through big data analysis and machine learning
    Liu, Yapan
    Dong, Bing
    BUILDING SIMULATION, 2024, 17 (01) : 3 - 21