Machine Learning with Distributed Data Management and Process Architecture

Cited by: 0
Authors
Baysal, Engin [1]
Bayilmis, Cuneyt [2]
Affiliations
[1] Istanbul Gedik Univ, Gedik Vocat Sch, Istanbul, Turkey
[2] Sakarya Univ, Comp & Informat Engn, Sakarya, Turkey
Source
2019 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK) | 2019
Keywords
big data; big data analytics; machine learning; Apache Spark; PySpark; logistic regression; YARN
DOI
10.1109/ubmk.2019.8907073
Chinese Library Classification
TP301 [Theory and Methods]
Discipline Classification Code
081202
Abstract
As technology becomes more embedded in daily life, the data it produces has become almost impossible to manage and process, which creates a need for new approaches to storage and analysis. Both the growth in data size and the increasing variety of data have necessitated the development of new methods. In this study, distributed data management and analysis tools developed for data that cannot be processed by traditional means were used. A machine learning application was developed using the Logistic Regression classification algorithm. The application was implemented on a data set obtained from sensors, using PySpark libraries on a Spark cluster created with the Google Cloud service. The working environment, managed by YARN, was observed during the implementation of the application.
Pages: 53-57
Page count: 5