Rough set-based approaches for discretization: a compact review

Cited by: 99
Authors
Ali, Rahman [1]
Siddiqi, Muhammad Hameed [1]
Lee, Sungyoung [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Ubiquitous Comp Lab, Yongin 446701, Gyeonggi Do, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Rough set theory (RST); Rough set discretization; Data reduction; Real values; Knowledge discovery; Categorization; Taxonomy; ALGORITHM;
DOI
10.1007/s10462-014-9426-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The extraction of knowledge from a huge volume of data using rough set methods requires the transformation of continuous value attributes to discrete intervals. This paper presents a systematic study of the rough set-based discretization (RSBD) techniques found in the literature and categorizes them into a taxonomy. In the literature, no review is solely based on RSBD. Only a few rough set discretizers have been studied, while many new developments have been overlooked and need to be highlighted. Therefore, this study presents a formal taxonomy that provides a useful roadmap for new researchers in the area of RSBD. The review also elaborates the process of RSBD with the help of a case study. The study of the existing literature focuses on the techniques adopted in each article, the comparison of these with other similar approaches, the number of discrete intervals they produce as output, their effects on classification and the application of these techniques in a domain. The techniques adopted in each article have been considered as the foundation for the taxonomy. Moreover, a detailed analysis of the existing discretization techniques has been conducted while keeping the concept of RSBD applications in mind. The findings are summarized and presented in this paper.
Pages: 235-263
Page count: 29