A random forest classifier with cost-sensitive learning to extract urban landmarks from an imbalanced dataset

Cited by: 15
Authors
Kang, Mengjun [1]
Liu, Yue [1]
Wang, Mengqi [1]
Li, Lin [1]
Weng, Min [1]
Affiliations
[1] Wuhan Univ, Sch Resource & Environm Sci, Wuhan, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Urban landmark; salience; random forest; class imbalance; cost-sensitive ensemble; ENVIRONMENT; SALIENCE; SMOTE;
DOI
10.1080/13658816.2021.1977814
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Urban landmarks play an important role as spatial references in spatial cognition, navigation, map design and urban planning. However, the current landmark extraction methods do not consider the imbalance between the landmark and non-landmark samples in a dataset, so the extraction results are biased toward the class with the majority of sample data, resulting in poor classification performance for the class with the fewest sample data. This study introduces a random forest (RF) classifier combined with cost-sensitive learning to extract urban landmarks automatically from a basic spatial database. First, the optimal feature set is determined according to the importance of features. Next, a cost-sensitive RF algorithm is applied to extract landmarks, which determines the misclassification cost according to the class distribution, and each decision tree is weighted by the classification results. The method has good performance, with a recall and area under the ROC curve (AUC) greater than 90%, and the model is also applicable to small sample sets, which can reduce the cost of manual labor.
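The two ideas named in the abstract, a misclassification cost derived from the class distribution and a vote in which each tree carries a weight, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the inverse-frequency cost rule, the function names `class_costs` and `weighted_vote`, and the example tree weights are hypothetical and are not taken from the paper.

```python
from collections import Counter


def class_costs(labels):
    """Assumed cost rule: cost of misclassifying a class is inversely
    proportional to its frequency, n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_vote(tree_predictions, tree_weights):
    """Combine one prediction per tree; each tree's vote counts with its weight."""
    scores = {}
    for pred, weight in zip(tree_predictions, tree_weights):
        scores[pred] = scores.get(pred, 0.0) + weight
    # Return the class with the largest accumulated weight.
    return max(scores, key=scores.get)


# Imbalanced toy dataset: 10 landmarks versus 90 non-landmarks.
labels = ["landmark"] * 10 + ["non-landmark"] * 90
costs = class_costs(labels)
# landmark cost = 100 / (2 * 10) = 5.0; non-landmark cost = 100 / (2 * 90)

# Three hypothetical trees; the weights stand in for per-tree quality scores.
vote = weighted_vote(["landmark", "non-landmark", "non-landmark"], [0.9, 0.3, 0.4])
# landmark accumulates 0.9, non-landmark accumulates 0.7, so landmark wins
```

Under this assumed rule, errors on the rare landmark class cost nine times more than errors on the majority class, and a single well-performing tree can outvote two weaker ones.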
Pages: 496-513
Number of pages: 18