Research on power marketing data mining and clustering techniques based on Bert and k-means

Cited by: 2
Authors
Wang, Hongwei [1]
Yin, Peng [1]
Duan, Zhitian [1]
Li, Yu [1]
Affiliations
[1] State Grid Tianjin Power Co, Mkt Dept, Tianjin, Peoples R China
Source
PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON POWER ELECTRONICS AND ARTIFICIAL INTELLIGENCE, PEAI 2024 | 2024
Keywords
Electricity marketing; Data mining; Text clustering; Natural language processing; BERT;
DOI
10.1145/3674225.3674360
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering analysis is an important branch of data mining, and its application in the power industry can improve the market competitiveness of power enterprises. This paper proposes KBert (BERT+K-Means), a machine recognition algorithm for clustering power marketing texts by specific type. The algorithm first converts the power marketing text into a high-dimensional text matrix. It then iteratively optimizes key weight parameters in the Chinese BERT model to obtain a global semantic vector. Finally, to address the limitations of the traditional BERT model, the K-Means algorithm is introduced as an improvement. The results show that the proposed KBert model overcomes the problems of long-range dependencies in power marketing texts and imbalanced sample categories. Its F1 score is higher than those of the traditional BERT and Attention+BiLSTM models, enabling fast, high-accuracy clustering and recognition of multiple types of power marketing information.
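The abstract does not spell out how the BERT output feeds into K-Means, so the following is only a minimal sketch of the clustering stage under one plausible reading: each text has already been reduced to a fixed-length semantic vector, and those vectors are grouped with Lloyd's K-Means. The function name `kmeans`, the farthest-point initialization, and the toy 3-dimensional `embeddings` are all illustrative assumptions, not details taken from the paper; real BERT sentence vectors would typically have hundreds of dimensions.

```python
import math


def kmeans(vectors, k, max_iters=50):
    """Lloyd's K-Means over lists of floats; returns (labels, centroids).

    Assumption: deterministic farthest-point initialization, chosen here
    only so the sketch gives the same result on every run.
    """
    # Start from the first vector, then repeatedly add the point that is
    # farthest from its nearest already-chosen centroid.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(math.dist(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical stand-ins for sentence vectors produced by a Chinese BERT model.
embeddings = [
    [0.9, 0.1, 0.0],
    [1.0, 0.2, 0.1],
    [0.8, 0.0, 0.1],
    [0.0, 0.9, 1.0],
    [0.1, 1.0, 0.9],
]
labels, centroids = kmeans(embeddings, k=2)
```

In this toy input the first three vectors and the last two form two well-separated groups, so `labels` places them in different clusters. The number of clusters `k` has to be chosen in advance; the record does not say how the authors selected it.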
Pages: 747-751
Number of pages: 5