A general model for fuzzy decision tree and fuzzy random forest

Cited by: 12
Authors
Zheng, Hui [1, 2, 9, 10]
He, Jing [2, 3]
Zhang, Yanchun [4, 5]
Huang, Guangyan [6]
Zhang, Zhenjiang [7]
Liu, Qing [8]
Affiliations
[1] Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing, Peoples R China
[2] Swinburne Univ Technol, Sch Software & Elect Engn, Melbourne, Vic, Australia
[3] Nanjing Univ Finance & Econ, Inst Informat Technol, Nanjing, Jiangsu, Peoples R China
[4] Victoria Univ, Coll Engn & Sci, Ctr Appl Informat, Melbourne, Vic, Australia
[5] Fudan Univ, Shanghai Key Res Lab Data Sci, Shanghai, Peoples R China
[6] Deakin Univ, Sch Informat Technol, Fac Sci Engn & Built Environm, Melbourne, Vic, Australia
[7] Beijing Jiaotong Univ, Inst Software Engn, Beijing, Peoples R China
[8] CSIRO, Data61, Software & Computat Syst Program, Clayton, Australia
[9] Nanjing Univ Posts & Telecommun, Coll Comp, Nanjing, Jiangsu, Peoples R China
[10] Nanjing Univ Posts & Telecommun, Jiangsu High Technol Res Key Lab Wireless Sensor, Nanjing, Jiangsu, Peoples R China
Funding
Australian Research Council; National Natural Science Foundation of China
Keywords
fuzzy decision tree; fuzzy random forest; membership function; risk classification and prediction; big data; ID3
DOI
10.1111/coin.12195
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies risk classification and prediction, an essential research direction that aims to identify and predict risks in various applications. To identify and predict risks, numerous researchers build models that discover hidden information about a label (positive credit or negative credit). Fuzzy logic is robust in dealing with ambiguous data and thus benefits classification and prediction. However, the optimal way to apply fuzzy logic depends on the characteristics of the data and the objectives, and finding it is extraordinarily difficult. This paper therefore proposes a general membership function model for fuzzy sets (GMFMFS) in the fuzzy decision tree and extends it to the fuzzy random forest method. The proposed methods can be applied to identify and predict credit risks with almost optimal fuzzy sets. In addition, we analyze the feasibility of GMFMFS and prove that the GMFMFS-based linear membership function can be extended to a nonlinear membership function without a significant increase in computational complexity. The GMFMFS-based fuzzy decision tree is tested on a real US credit dataset, the SUSY dataset from UCI, and synthetic big-data datasets. The experimental results further demonstrate the effectiveness and potential of the GMFMFS-based fuzzy decision tree with both linear and nonlinear membership functions.
Pages: 310-335
Number of pages: 26