Semi-Supervised Learning with Concept Drift using Particle Dynamics applied to Network Intrusion Detection Data

Cited by: 12

Authors:

Breve, Fabricio [1]

Zhao, Liang [2]

Affiliations:

[1] Sao Paulo State Univ UNESP, Inst Geosci & Exact Sci IGCE, Rio Claro, Brazil

[2] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, Brazil

Source:

2013 1ST BRICS COUNTRIES CONGRESS ON COMPUTATIONAL INTELLIGENCE AND 11TH BRAZILIAN CONGRESS ON COMPUTATIONAL INTELLIGENCE (BRICS-CCI & CBIC) | 2013

Funding:

São Paulo Research Foundation (FAPESP), Brazil;

Keywords:

UNLABELED DATA; CLASSIFICATION; CLASSIFIERS;

DOI:

10.1109/BRICS-CCI-CBIC.2013.63

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Concept drift, which refers to non-stationary learning problems over time, has increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of data labeling is usually expensive and/or time consuming when compared to the acquisition of unlabeled data, so usually only a small fraction of the incoming data can be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open and challenging task in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extended that approach to handle data streams and concept drift. The result is a passive algorithm which uses a single classifier approach, naturally adapted to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 Data of network intrusion, showing its effectiveness.

Citation

Pages: 335-340

Page count: 6

47 references in total

