Data stream analysis: Foundations, major tasks and tools

Cited by: 74
Authors
Bahri, Maroua [1]
Bifet, Albert [1, 2]
Gama, Joao [3]
Gomes, Heitor Murilo [2]
Maniu, Silviu [4]
Affiliations
[1] Telecom Paris, IP Paris, LTCI, Palaiseau, France
[2] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
[3] Univ Porto, INESC TEC, Porto, Portugal
[4] Univ Paris Saclay, LRI, Orsay, France
Keywords
CLASSIFICATION; ALGORITHMS; ITEMSETS; NETWORK; TREES
DOI
10.1002/widm.1405
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The significant growth of interconnected Internet-of-Things (IoT) devices, the use of social networks, along with the evolution of technology in different domains, lead to a rise in the volume of data generated continuously from multiple systems. Valuable information can be derived from these evolving data streams by applying machine learning. In practice, several critical issues emerge when extracting useful knowledge from these potentially infinite data, mainly because of their evolving nature and high arrival rate, which implies an inability to store them entirely. In this work, we provide a comprehensive survey that discusses the research constraints and the current state-of-the-art in this vibrant framework. Moreover, we present an updated overview of the latest contributions proposed in different stream mining tasks, particularly classification, regression, clustering, and frequent patterns. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Fundamental Concepts of Data and Knowledge > Motivation and Emergence of Data Mining
Pages: 17