Outlier detection for set-valued data based on rough set theory and granular computing

Cited by: 6
Authors
Lin, Hai [1]
Li, Zhaowen [2]
Affiliations
[1] Guangxi Univ, Coll Math & Informat Sci, Nanning, Guangxi, Peoples R China
[2] Yulin Normal Univ, Key Lab Complex Syst Optimizat & Big Data Proc, Dept Guangxi Educ, Yulin, Guangxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
RST; GrC; SVIS; outlier detection; outlier factor; INFORMATION GRANULATION; ATTRIBUTE REDUCTION; FUZZY; ALGORITHMS;
DOI
10.1080/03081079.2022.2132491
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Outlier detection is widely used in industrial practice, for example in public security and fraud detection, and methods have been proposed from various perspectives and against different backgrounds. However, most outlier detection methods consider categorical or numerical data. There is little research on outlier detection for set-valued data, even though a set-valued information system (SVIS) is a suitable way of handling missing values in data sets. This paper investigates outlier detection for set-valued data based on rough set theory (RST) and granular computing (GrC). First, the similarity between two information values in an SVIS is introduced, and a variable parameter that controls the similarity is given. Then, tolerance relations on the object set are defined, and based on these tolerance relations, theta-lower and theta-upper approximations in an SVIS are put forward. Next, the outlier factor in an SVIS is presented and applied to various data sets. Finally, an outlier detection method for set-valued data based on RST and GrC is proposed, and the corresponding algorithms are designed. In numerical experiments on UCI data sets, the designed algorithm is compared with six other detection algorithms. The experimental results show that the designed algorithm is arguably the best choice in the context of an SVIS. For a comprehensive comparison, we use two criteria, the AUC value and the F1 measure, to show the superiority of the designed algorithm.
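The first steps named in the abstract (a parameterized similarity, a tolerance relation, and theta-lower and theta-upper approximations) can be illustrated with a minimal sketch. The abstract does not state the paper's similarity measure, so Jaccard similarity is used here purely as an assumed stand-in; the toy table, the attribute names, and the function names `jaccard`, `tolerance_class`, and `approximations` are all hypothetical. The outlier factor is not sketched because the abstract does not define it.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy SVIS: object -> attribute -> set of possible values.
Table = Dict[str, Dict[str, FrozenSet[str]]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Assumed similarity between two set values (not necessarily the paper's)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tolerance_class(table: Table, x: str, attrs: List[str], theta: float) -> Set[str]:
    """Objects whose similarity to x is at least theta on every attribute.

    The induced relation is reflexive and symmetric, i.e. a tolerance relation.
    """
    return {
        y
        for y in table
        if all(jaccard(table[x][a], table[y][a]) >= theta for a in attrs)
    }


def approximations(
    table: Table, target: Set[str], attrs: List[str], theta: float
) -> Tuple[Set[str], Set[str]]:
    """Standard rough-set lower and upper approximations under the theta-tolerance classes."""
    lower = {x for x in table if tolerance_class(table, x, attrs, theta) <= target}
    upper = {x for x in table if tolerance_class(table, x, attrs, theta) & target}
    return lower, upper


# Illustrative data only; not taken from the paper or from UCI.
table: Table = {
    "x1": {"lang": frozenset({"en", "fr"}), "os": frozenset({"linux"})},
    "x2": {"lang": frozenset({"en", "fr"}), "os": frozenset({"linux"})},
    "x3": {"lang": frozenset({"en"}), "os": frozenset({"linux", "mac"})},
    "x4": {"lang": frozenset({"de"}), "os": frozenset({"win"})},
}
attrs = ["lang", "os"]
target = {"x1", "x2", "x4"}

lower_05, upper_05 = approximations(table, target, attrs, theta=0.5)
lower_06, upper_06 = approximations(table, target, attrs, theta=0.6)
```

In this toy example, raising theta from 0.5 to 0.6 separates x3 from x1 and x2, so the tolerance classes shrink and the lower and upper approximations of the target set coincide. This is the sense in which a variable parameter controls the granularity of the similarity.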
Pages: 385 - 413
Page count: 29
Related papers
50 in total
  • [41] An Knowledge Reduction Algorithms in Data Mining Based on Rough Set Theory
    Liu Tieying
    Jia Ru
    Ye Jianchun
    ADVANCES IN MANAGEMENT OF TECHNOLOGY, PT 1, 2009, : 562 - +
  • [42] A Dimensionality Reduction Based On Rough Set Theory for Complex Massive Data
    Dai Zhe
    Liu Jianhui
    2015 8TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP), 2015, : 1520 - 1524
  • [43] Attribute Reduction for Massive Data Based on Rough Set Theory and MapReduce
    Yang, Yong
    Chen, Zhengrong
    Liang, Zhu
    Wang, Guoyin
    ROUGH SET AND KNOWLEDGE TECHNOLOGY (RSKT), 2010, 6401 : 672 - 678
  • [44] Outlier detection for incomplete real-valued data based on inner boundary
    Zhao, Zhengwei
    Yang, Genteng
    Li, Zhaowen
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 3023 - 3041
  • [45] Atoms of monotone set-valued measures and integrals
    Wu, Jian-Rong
    Kai, Xue-Wen
    Li, Jiao-Jiao
    FUZZY SETS AND SYSTEMS, 2016, 304 : 131 - 139
  • [46] An Outlier Fuzzy Detection Method Using Fuzzy Set Theory
    Jin, Lizhong
    Chen, Junjie
    Zhang, Xiaobo
    IEEE ACCESS, 2019, 7 : 59321 - 59332
  • [47] Set-based granular computing: A lattice model
    Qian, Yuhua
    Zhang, Hu
    Li, Feijiang
    Hu, Qinghua
    Liang, Jiye
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2014, 55 (03) : 834 - 852
  • [48] New differentiability concepts for set-valued functions and applications to set differential equations
    Khastan, A.
    Rodriguez-Lopez, R.
    Shahidi, M.
    INFORMATION SCIENCES, 2021, 575 : 355 - 378
  • [49] On Selection of Representative Object Set for Attribute Reduction in Set-valued Information Systems
    Thi Thu Hien Phung
    2013 THIRD WORLD CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGIES (WICT), 2013, : 268 - 273
  • [50] Feature selection for dynamic interval-valued ordered data based on fuzzy dominance neighborhood rough set
    Sang, Binbin
    Chen, Hongmei
    Yang, Lei
    Li, Tianrui
    Xu, Weihua
    Luo, Chuan
    KNOWLEDGE-BASED SYSTEMS, 2021, 227