Survey of Online Learning Algorithms for Streaming Data Classification

Authors
Zhai T.-T. [1, 2]
Gao Y. [2 ]
Zhu J.-W. [1 ]
Affiliations
[1] School of Information Engineering, Yangzhou University, Yangzhou
[2] State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2020 / Vol. 31 / No. 04
Funding
National Natural Science Foundation of China
Keywords
Concept drifting; Curse of dimensionality; Evolving data stream classification; Online learning; Sparse online learning; Streaming data classification;
DOI
10.13328/j.cnki.jos.005916
Abstract
The objective of streaming data classification is to incrementally learn, from continuously arriving streaming data, a decision function that maps input variables to a label variable, so that test data arriving at any time can be classified accurately. The online learning paradigm, as an incremental machine learning technology, is an effective tool for classifying streaming data. This paper summarizes, from the perspective of online learning, recent developments in algorithms for streaming data classification. Specifically, the basic framework and the performance evaluation methodology of online learning are introduced first. Then, the latest developments are reviewed in three areas: online learning algorithms for general streaming data, algorithms that alleviate the "curse of dimensionality" in high-dimensional streaming data, and algorithms that address "concept drifting" in evolving streaming data. Finally, future challenges and promising research directions for the classification of high-dimensional and evolving streaming data are discussed. © Copyright 2020, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
Pages: 912-931
Page count: 19