Incremental Learning for Large Scale Classification Systems

Cited by: 4
Authors
Alexopoulos, Athanasios [1]
Kanavos, Andreas [1, 2]
Giotopoulos, Konstantinos [2]
Mohasseb, Alaa [3]
Bader-El-Den, Mohamed [3]
Tsakalidis, Athanasios [1]
Affiliations
[1] Univ Patras, Comp Engn & Informat Dept, Patras, Greece
[2] Technol Educ Inst Western Greece, Patras, Greece
[3] Univ Portsmouth, Sch Comp, Portsmouth, Hants, England
Source
ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018 | 2018 / Vol. 520
Keywords
Apache Spark; Apache MLlib; Big data; Classification; Computing performance; DataFrame; Spark SQL; SPARK;
DOI
10.1007/978-3-319-92016-0_11
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the defining characteristics of our time is the growth of data volumes. We collect data from practically everywhere: smartphones, smart devices, social media, and the health care system are only a small portion of the sources of big data. This growth poses two main difficulties: storing the data and processing it. For the former, new technologies enable us to store large amounts of data quickly and reliably. For the latter, new application frameworks have been developed. In this paper, we perform classification analysis using Apache Spark on one real dataset. The classification algorithms we use are multiclass, and we examine the effect of dataset size and input features on the classification results.
Pages: 112-122
Page count: 11