KNNENS: A k-Nearest Neighbor Ensemble-Based Method for Incremental Learning Under Data Stream With Emerging New Classes

Cited by: 21
Authors
Zhang, Jianjun [1 ]
Wang, Ting [2 ]
Ng, Wing W. Y. [1 ]
Pedrycz, Witold [3 ]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangdong Prov Key Lab Computat Intelligence, Guangzhou 510006, Peoples R China
[2] South China Univ Technol, Guangzhou Peoples Hosp 1, Dept Radiol, Guangzhou 510006, Guangdong, Peoples R China
[3] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2R3, Canada
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Data models; Training; Computational modeling; Adaptation models; Task analysis; Predictive models; Learning systems; Classification; data stream; ensemble methods; incremental learning; streaming emerging new class;
DOI
10.1109/TNNLS.2022.3149991
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this brief, we investigate the problem of incremental learning under a data stream with emerging new classes (SENC). Existing approaches in the literature encounter the following problems: 1) yielding a high false positive rate for the new class; 2) having a long prediction time; and 3) requiring access to true labels for all instances, which is unrealistic and unacceptable in real-life streaming tasks. Therefore, we propose the k-Nearest Neighbor ENSemble-based method (KNNENS) to handle these problems. KNNENS detects the new class effectively and maintains high classification performance for known classes. It is also efficient in terms of run time and does not require true labels of new-class instances for model updates, which is desirable in real-life streaming classification tasks. Experimental results show that KNNENS achieves the best performance on four benchmark datasets and three real-world data streams in terms of accuracy and F1-measure, and has a relatively fast run time compared to four reference methods. Code is available at https://github.com/Ntriver/KNNENS.
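To illustrate the general idea behind kNN-ensemble approaches to new-class detection, the sketch below builds several kNN base models on bootstrap samples of an initial labeled chunk, calibrates a per-member distance threshold from the training data, and flags an incoming instance as a potential new class when a majority of members find it unusually far from known data. This is not the authors' KNNENS algorithm (see the linked repository for that); the class name KNNEnsembleSketch, the bootstrap construction, the quantile-based threshold, and all parameter values are illustrative assumptions.

# Minimal, illustrative sketch of a kNN-ensemble scheme for detecting an
# emerging new class in a data stream. This is NOT the authors' KNNENS
# implementation (see https://github.com/Ntriver/KNNENS); the ensemble
# construction, the quantile-based novelty threshold, and every parameter
# value below are simplifying assumptions made for illustration.
from collections import Counter

import numpy as np
from sklearn.neighbors import NearestNeighbors


class KNNEnsembleSketch:
    def __init__(self, n_members=5, k=5, quantile=0.95, seed=0):
        self.n_members = n_members   # number of kNN base models
        self.k = k                   # neighbors consulted by each base model
        self.quantile = quantile     # training-distance quantile used as novelty threshold
        self.rng = np.random.default_rng(seed)
        self.members = []            # list of (NearestNeighbors, labels, threshold)

    def fit(self, X, y):
        """Build the ensemble from an initial labeled chunk of known classes."""
        n = len(X)
        for _ in range(self.n_members):
            idx = self.rng.choice(n, size=n, replace=True)   # bootstrap sample
            Xb, yb = X[idx], y[idx]
            nn = NearestNeighbors(n_neighbors=self.k).fit(Xb)
            # Distances of training points to their own k neighbors calibrate
            # what "close to known data" means for this member.
            dists, _ = nn.kneighbors(Xb)
            threshold = np.quantile(dists.mean(axis=1), self.quantile)
            self.members.append((nn, yb, threshold))
        return self

    def predict(self, x):
        """Return a known-class label, or 'new' if most members find x far from known data."""
        votes, novel = [], 0
        for nn, yb, threshold in self.members:
            dists, idx = nn.kneighbors(x.reshape(1, -1))
            if dists.mean() > threshold:
                novel += 1
            votes.append(Counter(yb[idx[0]]).most_common(1)[0][0])
        if novel > len(self.members) / 2:
            return "new"
        return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_known = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
    y_known = np.array([0] * 200 + [1] * 200)
    model = KNNEnsembleSketch().fit(X_known, y_known)
    print(model.predict(np.array([0.1, -0.2])))   # expected: 0 (known class)
    print(model.predict(np.array([20.0, 20.0])))  # expected: 'new' (far from known data)

In a full SENC pipeline, instances flagged as novel would typically be buffered and later used to extend the model without requiring their true labels; that update step is omitted here because it is specific to KNNENS.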
Pages: 9520 - 9527
Page count: 8