A Comparison of Data Mining Tools and Classification Algorithms: Content Producers on the Video Sharing Platform

Cited by: 0
Authors
Atagun, Ercan [1]
Argun, Irem Duzdar [2]
Affiliations
[1] Duzce Univ, Dept Comp Engn, Duzce, Turkey
[2] Duzce Univ, Ind Engn Dept, Duzce, Turkey
Source
ARTIFICIAL INTELLIGENCE AND APPLIED MATHEMATICS IN ENGINEERING PROBLEMS | 2020 / Vol. 43
Keywords
Data mining; Classification; YouTube
DOI
10.1007/978-3-030-36178-5_42
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the development of internet technologies, the use of video sharing sites has increased. Video sharing sites allow users to watch videos uploaded by others. In addition, users can create an account and upload their own videos. These platforms stand out as places where individuals are both producers and consumers. This study uses data about YouTube, a video sharing site. The data set describes content producers, which are called channels on YouTube. The data set, with 5000 samples on YouTube channels, was taken from Kaggle. The data were classified with the Naive Bayes and Random Forest algorithms in four data mining tools: Weka, RapidMiner, Knime and Orange. Data mining software asks the user for parameters in the data preprocessing and data mining steps in order to obtain better results from the algorithms. Some of these parameters are common to several tools, but not every tool includes them. Each tool lets the user manage some parameters, while others cannot be managed. In the study, changing the parameter values changed the accuracy obtained, and to different degrees. A data mining software model is proposed that emphasizes the extent to which parameters should be managed in the study and the extent to which their management should be left to the data mining software developer.
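The kind of experiment the abstract describes, varying one parameter of a classifier and comparing the resulting accuracy, can be sketched as follows. This is a minimal illustration only, not the authors' procedure and not the implementation used by Weka, RapidMiner, Knime or Orange. It assumes a hand-written Gaussian Naive Bayes, synthetic two-feature data in place of the Kaggle data set, and a hypothetical smoothing parameter named `var_smoothing` that is added to every estimated variance.

```python
import math
import random


def make_data(n, seed):
    """Synthetic two-class samples (a stand-in, NOT the Kaggle YouTube data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted along both features relative to class 0.
        x = [rng.gauss(2.0 * label, 1.0), rng.gauss(1.0 * label, 1.0)]
        rows.append((x, label))
    return rows


def fit_gaussian_nb(rows, var_smoothing):
    """Estimate per-class prior, feature means and smoothed variances."""
    model = {}
    for label in sorted({y for _, y in rows}):
        xs = [x for x, y in rows if y == label]
        n_features = len(xs[0])
        means = [sum(x[j] for x in xs) / len(xs) for j in range(n_features)]
        variances = [
            sum((x[j] - means[j]) ** 2 for x in xs) / len(xs) + var_smoothing
            for j in range(n_features)
        ]
        model[label] = (len(xs) / len(rows), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for xj, m, v in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


train = make_data(400, seed=0)
test = make_data(200, seed=1)

# Same algorithm, same data, different value of one user-managed parameter.
results = {
    s: accuracy(fit_gaussian_nb(train, s), test) for s in (1e-9, 1.0, 100.0)
}
for s, acc in results.items():
    print(f"var_smoothing={s:g}: accuracy={acc:.3f}")
```

Whether and how much the accuracy moves depends on the data and on the parameter. A tool that fixes such a value internally removes it from the user's control, which is the kind of difference between tools that the abstract refers to.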
Pages: 526-538
Page count: 13