Feature selection methods for characterizing and classifying adaptive Sustainable Flood Retention Basins

Cited by: 26
Authors
Yang, Qinli [1 ]
Shao, Junming [2 ]
Scholz, Miklas [1 ,3 ]
Plant, Claudia [4 ,5 ]
Affiliations
[1] Univ Edinburgh, Inst Infrastruct & Environm, Sch Engn, Edinburgh EH9 3JL, Midlothian, Scotland
[2] Univ Munich, Inst Comp Sci, D-80937 Munich, Germany
[3] Univ Salford, Sch Comp Sci & Engn, Civil Engn Res Grp, Salford M5 4WT, Lancs, England
[4] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA
[5] Tech Univ Munich, Klinikum Rechts Isar, D-8000 Munich, Germany
Keywords
Sustainable flood risk management; Flood control; Classification; Information gain; Mutual information; Relief
DOI
10.1016/j.watres.2010.10.006
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The European Union's Flood Directive 2007/60/EC requires member states to produce flood risk maps for all river basins and coastal areas at risk of flooding by 2013. As a result, flood risk assessments have become an urgent challenge requiring a range of rapid and effective tools and approaches. The Sustainable Flood Retention Basin (SFRB) concept has evolved to provide a rapid assessment technique for impoundments, which have a pre-defined or potential role in flood defense and diffuse pollution control. A previous version of the SFRB survey method, developed by the co-author Scholz in 2006, recommends gathering over 40 variables to characterize an SFRB. Collecting all these variables is relatively time-consuming and, more importantly, these variables are often correlated with each other. Therefore, the objective is to explore the correlation among these variables and find the most important variables to represent an SFRB. Three feature selection techniques (Information Gain, Mutual Information and Relief) were applied to the SFRB data set to identify the importance of the variables in terms of classification accuracy. Four benchmark classifiers (Support Vector Machine, K-Nearest Neighbours, C4.5 Decision Tree and Naive Bayes) were subsequently used to verify the effectiveness of the classification with the selected variables and automatically identify the optimal number of variables. Experimental results indicate that our proposed approach provides a simple, rapid and effective framework for variable selection and SFRB classification. Only nine important variables are sufficient to accurately classify SFRB. Finally, six typical cases were studied to verify the performance of the identified nine variables on different SFRB types. The findings provide a rapid scientific tool for SFRB assessment in practice. Moreover, the generic value of this tool also allows for its wide application in other areas. © 2010 Elsevier Ltd. All rights reserved.
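The ranking step mentioned in the abstract can be illustrated with a minimal sketch of Information Gain on discrete features. This is not the authors' implementation: the function names, the variable names (`dam_height`, `wetness`, `noise`) and the four-row data set are hypothetical and chosen only to show how a gain score orders variables by how much they reduce class entropy.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction in `labels` obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, NOT taken from the SFRB survey.
names = ["dam_height", "wetness", "noise"]
rows = [
    ("high", "wet", "a"),
    ("high", "dry", "b"),
    ("low", "wet", "a"),
    ("low", "dry", "b"),
]
labels = ["type1", "type1", "type2", "type2"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy case only `dam_height` separates the two classes, so it ranks first. In a workflow like the one the abstract describes, the top-ranked variables would then be passed to a classifier, and the number of retained variables would be chosen by comparing classification accuracy.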
Pages: 993-1004
Page count: 12
Related papers
33 in total
  • [1] AHA DW, 1991, MACH LEARN, V6, P37, DOI 10.1007/BF00153759
  • [2] Almuallim H., 1991, AAAI-91. Proceedings Ninth National Conference on Artificial Intelligence, P547
  • [3] [Anonymous], P 9 INT WORKSH MACH
  • [4] [Anonymous], 1979, CLASSIFICATION WETLA
  • [5] [Anonymous], 2014, C4.5: programs for machine learning
  • [6] Feature selection and land cover classification of a MODIS-like data set for a semiarid environment
    Borak, JS
    Strahler, AH
    [J]. INTERNATIONAL JOURNAL OF REMOTE SENSING, 1999, 20 (05) : 919 - 938
  • [7] Chen YW, 2006, STUD FUZZ SOFT COMP, V207, P315
  • [8] Dash M., 1997, Intelligent Data Analysis, V1
  • [9] Normalized Mutual Information Feature Selection
    Estevez, Pablo A.
    Tesmer, Michel
    Perez, Claudio A.
    Zurada, Jacek A.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2009, 20 (02) : 189 - 201
  • [10] European Commission, 2007, OFFICIAL J EUROPEAN