Data stream analysis: Foundations, major tasks and tools

Cited by: 62
Authors
Bahri, Maroua [1]
Bifet, Albert [1, 2]
Gama, Joao [3]
Gomes, Heitor Murilo [2]
Maniu, Silviu [4]
Affiliations
[1] Telecom Paris, IP Paris, LTCI, Palaiseau, France
[2] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
[3] Univ Porto, INESC TEC, Porto, Portugal
[4] Univ Paris Saclay, LRI, Orsay, France
Keywords
CLASSIFICATION; ALGORITHMS; ITEMSETS; NETWORK; TREES;
DOI
10.1002/widm.1405
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
The significant growth of interconnected Internet-of-Things (IoT) devices, the use of social networks, along with the evolution of technology in different domains, lead to a rise in the volume of data generated continuously from multiple systems. Valuable information can be derived from these evolving data streams by applying machine learning. In practice, several critical issues emerge when extracting useful knowledge from these potentially infinite data, mainly because of their evolving nature and high arrival rate, which implies an inability to store them entirely. In this work, we provide a comprehensive survey that discusses the research constraints and the current state-of-the-art in this vibrant framework. Moreover, we present an updated overview of the latest contributions proposed in different stream mining tasks, particularly classification, regression, clustering, and frequent patterns. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Fundamental Concepts of Data and Knowledge > Motivation and Emergence of Data Mining
Pages: 17
Related papers
109 entries in total
  • [1] Abdulsalam H, 2007, INT DATABASE ENG APP, P225
  • [2] Aggarwal C.C., 2007, DATA STREAMS MODELS
  • [3] Aggarwal C.C., 2007, Data streams: models and algorithms, P169, DOI 10.1007/978-0-387-47534-9_9
  • [4] Aggarwal CC, 2003, P 29 INT C VER LARG, P81, DOI 10.1016/B978-012722442-8/50016-1
  • [5] On Density-Based Data Streams Clustering Algorithms: A Survey
    Amini, Amineh
    Teh, Ying Wah
    Saboohi, Hadi
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 29 (01) : 116 - 141
  • [6] [Anonymous], 2001, ADAP COMP MACH LEARN
  • [7] [Anonymous], 2006, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), DOI 10.1145/1150402.1150491
  • [8] [Anonymous], 2012, P 29 INT COF INT C M
  • [9] Babcock Brian, 2002, PODS, P1, DOI 10.1145/543613.543615
  • [10] Bahri M., 2020, THESIS I POLYTECHNIQ