Big data classification of learning behaviour based on data reduction and ensemble learning

Cited by: 1
Authors
Wang, Taotao [1]
Wu, Xiaoxuan [2]
Affiliations
[1] Jiangxi Univ Technol, Dept Informat Engn Coll, Ganzhou 330098, Jiangxi, Peoples R China
[2] Guangxi Vocat Coll Water Resources & Elect Power, Dept Gen Educ, Nanning 530023, Peoples R China
Keywords
data reduction; ensemble learning; rough set theory; big data of learning behaviour; big data classification
DOI
10.1504/IJCEELL.2023.132418
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
To address the low classification accuracy, long classification time, and high missing ratio of traditional methods, a method for classifying big data of learning behaviour based on data reduction and ensemble learning is proposed. The learning-behaviour data are first cleaned and transformed and their attributes are discretised; a data reduction algorithm then reduces the attribute set. An ensemble learning method linearly combines several weak classifiers, and the resulting ensemble classifier is trained according to the Choquet integral. The trained classifier is then used to classify the reduced learning-behaviour data. The experimental results show that when the volume of learning-behaviour data reaches 5,000 GB, the proposed method achieves an average classification accuracy of 92%, a classification time of 29 s, and a classification failure rate of 0.32%.
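The abstract names the Choquet integral as the basis for the ensemble classifier but gives no details of the fuzzy measure or the weak classifiers. The following is a minimal sketch of how a discrete Choquet integral can fuse the scores of several weak classifiers, not the authors' implementation. The classifier names (`tree`, `knn`, `nb`), the weights, and the helper functions are all hypothetical and chosen only for illustration.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` with respect to a fuzzy measure.

    scores  : dict mapping a (hypothetical) classifier name to a score in [0, 1]
    measure : callable taking a frozenset of names and returning a value in
              [0, 1]; assumed monotone, with measure(all names) == 1
    """
    # Sort classifier names by ascending score: x_(1) <= ... <= x_(n).
    order = sorted(scores, key=scores.get)
    total = 0.0
    previous = 0.0
    for i, name in enumerate(order):
        # Coalition of all classifiers whose score is at least scores[name].
        coalition = frozenset(order[i:])
        total += (scores[name] - previous) * measure(coalition)
        previous = scores[name]
    return total


def additive_measure(weights):
    """Additive measure: the Choquet integral reduces to a weighted mean."""
    def mu(subset):
        return sum(weights[name] for name in subset)
    return mu


def unanimity_measure(all_names):
    """Measure that is 1 only for the full set: the integral becomes the minimum."""
    full = frozenset(all_names)
    def mu(subset):
        return 1.0 if subset == full else 0.0
    return mu


def classify(per_class_scores, measure):
    """Fuse per-class scores and return the label with the highest fused score."""
    fused = {label: choquet_integral(s, measure)
             for label, s in per_class_scores.items()}
    return max(fused, key=fused.get), fused


# Illustrative values only (assumptions, not data from the paper).
weights = {"tree": 0.5, "knn": 0.3, "nb": 0.2}
scores = {"tree": 0.9, "knn": 0.6, "nb": 0.3}
weighted = choquet_integral(scores, additive_measure(weights))
strictest = choquet_integral(scores, unanimity_measure(scores))
```

With an additive measure the integral coincides with an ordinary weighted (linear) combination, which matches the linear combination mentioned in the abstract; a non-additive measure lets the fusion account for interaction between classifiers.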
Pages: 496-510
Number of pages: 16