Selective Feature Bagging of one-class classifiers for novelty detection in high-dimensional data

Cited by: 4
Authors
Wang, Biao [1]
Wang, Wenjing [2]
Meng, Guanglei [1]
Meng, Tiankuo [1]
Song, Bin [1]
Wang, Yingnan [1]
Guo, Yuming [1]
Qiao, Zhihua [1]
Mao, Zhizhong [3]
Affiliations
[1] Shenyang Aerosp Univ, Sch Automat, Shenyang 110136, Peoples R China
[2] Liaoning Vocat Coll Ecol Engn, Sch Elect Engn, Shenyang 110122, Peoples R China
[3] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Peoples R China
Keywords
Novelty detection; Feature bagging; Ensemble learning; Subspace analysis; ENSEMBLE; CLASSIFICATION; ALGORITHMS
DOI
10.1016/j.engappai.2023.105825
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Novelty detection in high-dimensional data is challenging because of the masking effect of irrelevant attributes. A common solution is to discover a feature subspace whose attributes are relevant to novelties. Because novelties are highly uncertain in practical applications, ensemble models that combine results from multiple subspaces have proved more effective than single models. Following the theory of the bias-variance tradeoff, existing ensembles are usually built for variance reduction. However, it is argued that combining poor detectors deteriorates the performance of an ensemble. To this end, this paper proposes an ensemble detector that reduces variance and bias simultaneously. The ensemble is referred to as Selective Feature Bagging (SFB), since it is built on Feature Bagging (FB). To improve accuracy without sacrificing the diversity of the base detectors in FB, we resort to dynamic classifier selection, which has proved effective in classification. During the ensemble generation phase, base detectors are produced and categorized into groups distinguished by the dimensionality of the subspace used for training; this design maintains diversity. During the generalization phase, the most competent base detector from each group is dynamically selected and used to make a decision on the test pattern; this design enhances accuracy. We verify the effectiveness of SFB on 15 data sets from the KEEL repository. Experimental results show that SFB statistically outperforms FB. In addition, SFB outperforms several state-of-the-art methods.
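
The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base detector (`CentroidDetector`), the competence measure (mean score on the k nearest training patterns), the 95% threshold, and the final averaging across groups are all assumptions chosen for brevity, and every name below is hypothetical.

```python
import numpy as np


class CentroidDetector:
    """Toy one-class detector (an assumption): scaled distance to the training mean."""

    def __init__(self, features):
        self.features = np.asarray(features)

    def fit(self, X):
        Z = X[:, self.features]
        self.mean_ = Z.mean(axis=0)
        self.std_ = Z.std(axis=0) + 1e-12
        # Threshold at the 95th percentile of training distances (assumed rate).
        self.threshold_ = np.quantile(self._distance(X), 0.95) + 1e-12
        return self

    def _distance(self, X):
        Z = (X[:, self.features] - self.mean_) / self.std_
        return np.sqrt((Z ** 2).mean(axis=1))

    def score(self, X):
        # Scores above 1 lie farther out than 95% of the training patterns.
        return self._distance(X) / self.threshold_


def generate_groups(X, dims, per_group, rng):
    """Generation phase: one group of detectors per subspace dimensionality."""
    groups = {}
    for d in dims:
        groups[d] = [
            CentroidDetector(rng.choice(X.shape[1], size=d, replace=False)).fit(X)
            for _ in range(per_group)
        ]
    return groups


def sfb_score(x, X_train, groups, k=10):
    """Test time: select one detector per group for x, then average their scores."""
    # Region of competence: the k training patterns nearest to x (assumption).
    nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    region = X_train[nearest]
    selected = []
    for detectors in groups.values():
        # Assumed competence: lowest mean score on the normal neighbours.
        best = min(detectors, key=lambda det: det.score(region).mean())
        selected.append(best.score(x[None, :])[0])
    return float(np.mean(selected))


rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 10))  # synthetic "normal" data
groups = generate_groups(X_train, dims=[3, 5, 7], per_group=5, rng=rng)
normal_score = sfb_score(np.zeros(10), X_train, groups)
novel_score = sfb_score(np.full(10, 6.0), X_train, groups)
```

Grouping by subspace dimensionality keeps detectors of different sizes in the final vote, while the per-pattern selection discards the weaker detectors within each group. A real implementation would substitute a proper one-class classifier and whatever competence measure the paper defines.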
Pages: 12
Related papers
39 in total
  • [1] Aggarwal C. C., 2001, SIGMOD Record, V30, P37, DOI 10.1145/376284.375668
  • [2] Aggarwal C. C., 2015, ACM SIGKDD EXPLORATI, V17, P24, DOI 10.1145/2830544.2830549
  • [3] KEEL: a software tool to assess evolutionary algorithms for data mining problems
    Alcala-Fdez, J.
    Sanchez, L.
    Garcia, S.
    del Jesus, M. J.
    Ventura, S.
    Garrell, J. M.
    Otero, J.
    Romero, C.
    Bacardit, J.
    Rivas, V. M.
    Fernandez, J. C.
    Herrera, F.
    [J]. SOFT COMPUTING, 2009, 13 (03) : 307 - 318
  • [4] Graph-based relevancy-redundancy gene selection method for cancer diagnosis
    Azadifar, Saeid
    Rostami, Mehrdad
    Berahmand, Kamal
    Moradi, Parham
    Oussalah, Mourad
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 147
  • [5] A multi-modal unsupervised fault detection system based on power signals and thermal imaging via deep AutoEncoder neural network
    Cordoni, Francesco
    Bacchiega, Gianluca
    Bondani, Giulio
    Radu, Robert
    Muradore, Riccardo
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 110
  • [6] Robust Randomized Autoencoder and Correntropy Criterion-Based One-Class Classification
    Cui, Xiaonan
    Cao, Jiuwen
    Wang, Tianlei
    Lai, Xiaoping
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (04) : 1517 - 1521
  • [7] An Industrial Strength Novelty Detection Framework for Autonomous Equipment Monitoring and Diagnostics
    Filev, Dimitar P.
    Chinnam, Ratna Babu
    Tseng, Finn
    Baruah, Pundarikaksha
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2010, 6 (04) : 767 - 779
  • [8] Gao J, 2006, IEEE DATA MINING, P212
  • [9] Multi-class classification via heterogeneous ensemble of one-class classifiers
    Kang, Seokho
    Cho, Sungzoon
    Kang, Pilsung
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2015, 43 : 35 - 43
  • [10] HiCS: High Contrast Subspaces for Density-Based Outlier Ranking
    Keller, Fabian
    Mueller, Emmanuel
    Boehm, Klemens
    [J]. 2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012 : 1037 - 1048