Unsupervised concept drift detection for multi-label data streams

Cited by: 26
Authors
Gulcan, Ege Berkay [1]
Can, Fazli [1]
Affiliations
[1] Bilkent Univ, Comp Engn Dept, Bilkent Informat Retrieval Grp, Ankara, Turkey
Keywords
Big data; Multi-label data stream; Multi-label classification; Concept drift; Drift detection;
DOI
10.1007/s10462-022-10232-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many real-world applications adopt multi-label data streams as the need for algorithms to deal with rapidly changing data increases. Changes in data distribution, also known as concept drift, cause existing classification models to rapidly lose their effectiveness. To assist the classifiers, we propose a novel algorithm called Label Dependency Drift Detector (LD3), an unsupervised concept drift detector for multi-label data streams that uses the label dependencies within the data. Our study exploits the dynamic temporal dependencies between labels using a label influence ranking method, which leverages a data fusion algorithm and uses the produced ranking to detect concept drift. LD3 is the first unsupervised concept drift detection algorithm in the multi-label classification problem area. In this study, we perform an extensive evaluation of LD3 by comparing it with 14 prevalent supervised concept drift detection algorithms, which we adapt to the problem area, using 15 datasets and a baseline classifier. The results show that LD3 provides between 16.9% and 56% better predictive performance than comparable detectors on both real-world and synthetic data streams.
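The abstract names the ingredients of the approach (per-label dependency rankings, a data fusion step, and a comparison of the fused rankings over time) but not their details. The Python sketch below illustrates that general idea only and is not the authors' LD3 implementation. It assumes co-occurrence counts as the dependency measure, reciprocal rank fusion as the fusion step, and a simple top-of-ranking comparison as the drift test. The names rrf_fuse, label_rankings, and drift_suspected, and the parameters k and top, are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(item) = sum of 1 / (k + rank) over rankings."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by item for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def label_rankings(window, n_labels):
    """For each label, rank the other labels by co-occurrence count (assumed measure)."""
    co = [[0] * n_labels for _ in range(n_labels)]
    for label_vector in window:
        active = [j for j, value in enumerate(label_vector) if value]
        for a in active:
            for b in active:
                if a != b:
                    co[a][b] += 1
    rankings = []
    for i in range(n_labels):
        others = [j for j in range(n_labels) if j != i]
        rankings.append(sorted(others, key=lambda j: (-co[i][j], j)))
    return rankings


def drift_suspected(old_window, new_window, n_labels, top=2):
    """Flag drift when the top of the fused label ranking changes (assumed test)."""
    old_rank = rrf_fuse(label_rankings(old_window, n_labels))
    new_rank = rrf_fuse(label_rankings(new_window, n_labels))
    return old_rank[:top] != new_rank[:top]


# Labels 0 and 1 co-occur in the old window; labels 2 and 3 in the new one.
old_window = [[1, 1, 0, 0]] * 5
new_window = [[0, 0, 1, 1]] * 5
print(drift_suspected(old_window, old_window, n_labels=4))
print(drift_suspected(old_window, new_window, n_labels=4))
```

In an unsupervised setting the label vectors in each window would be a classifier's predictions rather than ground-truth labels, which is why no true labels appear in the sketch.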
Pages: 2401-2434
Number of pages: 34