Distributed Big Data Mining Platform for Smart Grid

Cited by: 0
Authors
Wang, Zhixiang [1 ,2 ]
Wu, Bin [1 ,2 ]
Bai, Demeng [3 ]
Qin, Jiafeng [3 ]
Affiliations
[1] Key Lab Intelligent Telecommun Software & Multime, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[3] Shandong Power Supply Co State Grid, Elect Power Research Inst, Jinan, Shandong, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018
Funding
National Key Research and Development Program;
Keywords
Parallel; Data Mining; Components; Spark; Workflow;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid development of information technology and the internet, data across all industries has grown explosively, making it difficult to analyze big data and mine useful information from it. Traditional analysis systems face performance and scalability bottlenecks in big data processing. Researching and developing novel, efficient big data analysis and mining platforms has therefore become a focus for organizations. With the development of the smart grid, power data, which has characteristics specific to the power industry, requires more targeted and efficient data mining and analysis. In this paper, to address the shortcomings of existing work, we propose a distributed big data mining platform built on distributed system infrastructure such as Hadoop and Spark. The platform implements a variety of fast, highly parallel mining algorithms with Spark and TensorFlow, covering machine learning, statistical analysis, deep learning, and more. Using OSGi technology to build a loosely coupled component model, the platform improves the reusability of algorithm components. It also introduces a workflow engine and a user-friendly GUI, which reduce the complexity of user operations and support user-defined data mining tasks. To suit the characteristics of smart grid big data, the platform develops and improves dozens of algorithm components for data processing and analysis. In addition, the design of an extensible algorithm library and component library greatly improves the platform's scalability and its ability to process smart grid data. Our platform has already been deployed at a State Grid company, where it meets the demands of various smart grid data analysis businesses.
Pages: 2345-2354
Page count: 10