BEANS CLASSIFICATION USING DECISION TREE AND RANDOM FOREST WITH RANDOMIZED SEARCH HYPERPARAMETER TUNING

Cited by: 0
Authors
Koeshardianto, Meidya [1]
Permana, Kurniawan Eka [1]
Kartika, Dhian Satria Yudha [2]
Setiawan, Wahyudi [3]
Affiliations
[1] Univ Trunojoyo Madura, Dept Informat, Bangkalan 69192, Jawa Timur, Indonesia
[2] Univ Pembangunan Nas Vet, Dept Informat Syst, Surabaya 60294, Jawa Timur, Indonesia
[3] Univ Trunojoyo Madura, Dept Informat Syst, Bangkalan 69192, Jawa Timur, Indonesia
Keywords
beans; classification; random forest; randomized search; hyperparameter tuning
DOI
10.28919/cmbn/8225
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dry beans are a high-protein food. Because they keep for a long time, they can be used in processed food products for emergency conditions such as famine, natural disasters, and war. Identifying bean types manually requires considerable time and effort, so a computerized classification system is needed. In this study, we classified beans using the public dataset from Koklu, which consists of 13,611 rows with sixteen features and seven classes. The classes are imbalanced, so the dataset was balanced using random oversampling. Classification was performed with Decision Tree and Random Forest. In addition, hyperparameter tuning with randomized search was applied to the number of trees, with candidate values 50, 75, 150, 200, and 300. The test results show that the Random Forest's accuracy, precision, recall, and F1-score each reach 0.9658. The best number of trees is 300.
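The random oversampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `random_oversample`, the toy rows, and the class labels "A", "B", and "C" are hypothetical, and the sketch assumes that "random oversampling" means duplicating randomly chosen minority-class rows until every class matches the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the class's own rows.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: made-up two-feature rows, not taken from the bean dataset.
rows = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.2, 2.2), (5.0, 6.0), (9.0, 9.5)]
labels = ["A", "A", "A", "A", "B", "C"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, each of the three toy classes has four rows, and every added row is an exact copy of an existing row from the same class. The abstract does not say whether oversampling was applied before or after the train/test split; a common practice is to oversample only the training portion so that duplicated rows do not appear in the test set.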
Pages: 14
Related papers
23 in total
[1]  
Andriushchenko Maksym, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12368), P484, DOI 10.1007/978-3-030-58592-1_29
[2]   A Comparative Assessment of Random Forest and k-Nearest Neighbor Classifiers for Gully Erosion Susceptibility Mapping [J].
Avand, Mohammadtaghi ;
Janizadeh, Saeid ;
Naghibi, Seyed Amir ;
Pourghasemi, Hamid Reza ;
Bozchaloei, Saeid Khosrobeigi ;
Blaschke, Thomas .
WATER, 2019, 11 (10)
[3]  
Charbuty B, 2021, Journal of Applied Science and Technology Trends, V2, P20, DOI 10.38094/jastt20165
[4]  
Dellosa R.M., 2023, INT J BIOSCI, V23, P81, DOI 10.12692/ijb/23.1.81-92
[5]  
Dolbeault M, 2024, Arxiv, DOI arXiv:2306.07435
[6]  
Ekafitri R., 2014, Pangan, V23, P134
[7]  
Haghighi S., 2018, J OPEN SOURCE SOFTW, V3, P729, DOI 10.21105/JOSS.00729
[8]   Missing data imputation in clinical trials using recurrent neural network facilitated by clustering and oversampling [J].
Haliduola, Halimu N. ;
Bretz, Frank ;
Mansmann, Ulrich .
BIOMETRICAL JOURNAL, 2022, 64 (05) :863-882
[9]   Multiclass classification of dry beans using computer vision and machine learning techniques [J].
Koklu, Murat ;
Ozkan, Ilker Ali .
COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2020, 174
[10]  
Komorowski M., 2016, Secondary Analysis of Electronic Health Records, P351, DOI 10.1007/978-3-319-43742-2_24