Conditional entropy for incomplete decision systems and its application in data mining

Cited by: 54
Authors
Dai, Jianhua [1]
Xu, Qing [1]
Wang, Wentao [1]
Tian, Haowei [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
rough set theory; incomplete decision systems; conditional entropy; Shannon entropy; attribute reduction; ROUGH SET APPROACH; KNOWLEDGE GRANULATION; MEASURING UNCERTAINTY; INFORMATION ENTROPY; REDUCTION; RULES; DATABASES; SELECTION;
DOI
10.1080/03081079.2012.685471
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Rough set theory is a useful mathematical tool for dealing with vague and uncertain information. Shannon's entropy and its variants have been applied to measure uncertainty in rough set theory from the viewpoint of information theory. However, few studies have addressed information-theoretic measures of attribute importance in incomplete decision systems (IDSs) that take into account the relation between the decision attribute and the condition attributes. In this paper, we introduce the concept of conditional entropy, together with entropy and joint entropy, in IDSs. Using the new conditional entropy, we propose a measure of attribute importance. Based on this measure, a heuristic attribute reduction algorithm is presented. Experiments on real-life data sets show the effectiveness of the algorithm. The attribute importance measure and the attribute reduction algorithm can be used in data mining or machine learning for handling incomplete data.
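The abstract does not state the paper's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the tolerance relation commonly used for incomplete systems (two objects are tolerant on an attribute set if, on every attribute, their values agree or at least one is missing) and a per-object conditional entropy that reduces to Shannon's conditional entropy when no values are missing. The names `MISSING`, `tolerance_class`, `conditional_entropy`, and `greedy_select` are hypothetical, and the paper's own definition and algorithm may differ.

```python
import math

MISSING = None  # assumed marker for a missing attribute value


def tolerance_class(rows, i, attrs):
    """Indices of objects tolerant with object i on attrs (assumed relation):
    on every attribute the values agree or at least one of them is missing."""
    x = rows[i]
    return {
        j
        for j, y in enumerate(rows)
        if all(x[a] is MISSING or y[a] is MISSING or x[a] == y[a] for a in attrs)
    }


def conditional_entropy(rows, decisions, attrs):
    """Assumed measure: -(1/|U|) * sum_i log2(|T(u_i) & D(u_i)| / |T(u_i)|),
    where T(u_i) is the tolerance class and D(u_i) the decision class of u_i."""
    n = len(rows)
    h = 0.0
    for i in range(n):
        t = tolerance_class(rows, i, attrs)
        same_decision = {j for j in t if decisions[j] == decisions[i]}
        h -= math.log2(len(same_decision) / len(t))
    return h / n


def greedy_select(rows, decisions, all_attrs, tol=1e-12):
    """Heuristic forward selection: repeatedly add the attribute that gives the
    lowest conditional entropy until the full attribute set's value is reached."""
    target = conditional_entropy(rows, decisions, all_attrs)
    chosen, remaining = [], list(all_attrs)
    while remaining and conditional_entropy(rows, decisions, chosen) > target + tol:
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, decisions, chosen + [a]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy incomplete decision table (illustrative data, not from the paper).
rows = [
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": MISSING, "c": 1},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 1, "c": MISSING},
]
decisions = ["yes", "yes", "no", "no"]
selected = greedy_select(rows, decisions, ["a", "b", "c"])
```

In this toy table, attribute `a` alone separates the two decision classes, so the selection stops after picking it. The drop in conditional entropy obtained by adding an attribute plays the role of an attribute importance score in this sketch.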
Pages: 713-728
Page count: 16