A classification and extraction method of attribute hybrid big data based on Naive Bayes algorithm

Cited by: 1
Authors
Li, Liantian [1]
Yang, Ling [1]
Affiliations
[1] Yangjiang Polytech, Dept Informat Engn, Yangjiang 529500, Guangdong, Peoples R China
Keywords
Naive Bayes; big data; classification model; attribute mixing
DOI
10.3233/JCM-226802
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Existing techniques for identifying network text information struggle to accurately extract and classify text that spreads and updates quickly. To address this problem, the study combines the Naive Bayes algorithm with a two-dimensional feature information gain weighting method: feature weighting is used to optimize Naive Bayes, and a new feature operation method calculates the information gain between the document and data-category dimensions, which improves classification performance. The classification models are then compared and analyzed on real Chinese and English databases. The results show that the classification accuracy of the IGDC-DWNB model on the Sogou, 20 Newsgroups, Fudan, and Reuters-21578 databases is 0.89, 0.89, 0.93, and 0.88, respectively, higher than that of the other classification models in the same environment. The designed model therefore offers higher classification accuracy, stronger overall performance, and greater reliability and robustness in practical applications, and can provide a new direction for big data classification technology.
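The general idea of weighting Naive Bayes features by information gain can be illustrated with a minimal sketch. This is not the IGDC-DWNB model from the paper, whose two-dimensional weighting formula is not given in this record. It assumes a simpler scheme in which each term's log-likelihood is scaled by its information gain with respect to the class label. The names `information_gain` and `IGWeightedNB` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'."""
    base = entropy(list(Counter(labels).values()))
    present = Counter(l for d, l in zip(docs, labels) if term in d)
    absent = Counter(l for d, l in zip(docs, labels) if term not in d)
    conditional = 0.0
    for part in (present, absent):
        size = sum(part.values())
        if size:
            conditional += (size / len(docs)) * entropy(list(part.values()))
    return base - conditional


class IGWeightedNB:
    """Multinomial Naive Bayes with per-term information gain weights.

    Assumption: weights multiply the per-term log-likelihoods; the paper's
    actual weighting scheme may differ.
    """

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = sorted({t for d in docs for t in d})
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        self.term_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.totals = {c: sum(self.term_counts[c].values()) for c in self.classes}
        self.weight = {t: information_gain(docs, labels, t) for t in self.vocab}
        return self

    def predict(self, doc):
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for term in doc:
                if term not in self.weight:
                    continue  # terms unseen in training are ignored
                # Laplace-smoothed likelihood, scaled by the term's weight
                p = (self.term_counts[c][term] + 1) / (self.totals[c] + len(self.vocab))
                score += self.weight[term] * math.log(p)
            scores[c] = score
        return max(scores, key=scores.get)


# Toy corpus (hypothetical data, tokenized documents)
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "stock"],
]
labels = ["sport", "sport", "finance", "finance"]

model = IGWeightedNB().fit(docs, labels)
print(model.predict(["goal", "team"]))
print(model.predict(["stock", "price"]))
```

In this sketch, terms that split the classes cleanly (such as `goal` or `stock` in the toy corpus) receive a weight near 1 and dominate the score, while terms that carry little class information contribute less.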
Pages: 1955-1970
Page count: 16