Analysis of Suitable Approaches for Data Mining Algorithms

Cited by: 0
Authors
Ansari, Nazneen [1 ]
Singh, Anjali B. [1 ]
Trivedi, Bina D. [1 ]
Nandankar, Priti B. [1 ]
Affiliations
[1] St Francis Inst Technol, Informat Technol, Mumbai, Maharashtra, India
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020) | 2020
Keywords
classification; regression; clustering; accuracy; data mining; TP; TN; FP; FN; confusion matrix;
DOI
10.1109/iciccs48265.2020.9120892
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data mining is the process of discovering knowledge by examining large volumes of information from multiple perspectives and summarizing it into useful information; it has become an important component in numerous fields and is used to recognize hidden patterns in large datasets. This paper uses three data mining techniques: classification, clustering, and regression. It helps users identify the algorithm best suited to their dataset, along with each algorithm's advantages and disadvantages. It reports the accuracy of the best four algorithms for classification and regression and, for clustering, the number of clusters produced by different clustering algorithms. The performance of the algorithms depends on dataset size: as the size of the dataset increases, performance also increases. This reduces the work users would otherwise spend finding the best algorithm with tools such as Weka, Orange, and TPOT. The listed advantages and disadvantages can help users identify the single best algorithm for further analysis. The system also helps the user identify the target column and various dispensable columns in the dataset.
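The keywords name accuracy together with TP, TN, FP, and FN, the four cells of a binary confusion matrix. As a minimal sketch of how those quantities relate (not the authors' implementation; the function names and the sample labels are hypothetical and chosen only for illustration), accuracy is the share of correct predictions, (TP + TN) / (TP + TN + FP + FN):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary task (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions divided by all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Illustrative labels only; these are not data from the paper.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn, accuracy(tp, tn, fp, fn))
```

Comparing several classifiers on one dataset, as the abstract describes, would amount to computing this figure per algorithm on the same held-out labels.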
Pages: 916-921
Page count: 6