Xiaomi Brand Appraisal Research Based on Zhihu by Text Mining Technology

Cited by: 1
Authors
Xu, Aiting [1]
Wang, Fangyan [1]
Ying, Pingting [1]
Affiliation
[1] Zhejiang Gongshang Univ, Coll Stat & Math, Hangzhou, Zhejiang, Peoples R China
Source
ICBDC 2019: PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON BIG DATA AND COMPUTING | 2019
Keywords
Xiaomi; text mining; LDA topic model; Gibbs sampling
DOI
10.1145/3335484.3335515
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
As the largest knowledge-sharing social platform on the Chinese Internet, Zhihu has gradually become an important resource both for merchants seeking to improve publicity and optimize products and for the public seeking to understand a brand's image. The "Xiaomi Technology" topic remains popular on Zhihu. In this context, this paper takes the "essence" posts of the "Xiaomi Technology" topic on Zhihu as its research object. We first collect and preprocess the data. Then, extracting features from the word segmentation results, we build a corpus and construct an LDA topic model for text mining. By calculating and comparing the perplexity for different numbers of topics, we select 20 as the number of topics. Based on the results, we analyze the document-topic and topic-term relationships to form a topic description of the text. The analysis shows that Xiaomi products receive great attention from consumers and are often compared with other brands in the same industry; that Xiaomi product launches attract much attention and have a direct impact on product sales; and that Xiaomi is widely recognized as one of the representatives of China's future technology.
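The pipeline described in the abstract (tokenised corpus, LDA fitted by Gibbs sampling, number of topics chosen by comparing perplexity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the toy English corpus, the candidate topic counts, the hyperparameters `alpha` and `beta`, and the function names `lda_gibbs` and `perplexity` are all hypothetical. The perplexity here is computed on the training documents for brevity; a held-out split is the more usual choice.

```python
import math
import random


def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on already-tokenised documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * n_words for _ in range(n_topics)]
    n_k = [0] * n_topics
    z = []  # current topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)  # random initial topic
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove this token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalised full conditional for each candidate topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1
    # Smoothed point estimates of document-topic and topic-term distributions.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][v] + beta) / (n_k[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


def perplexity(docs, theta, phi, vocab):
    """exp of the negative mean per-token log-likelihood under theta and phi."""
    word_id = {w: i for i, w in enumerate(vocab)}
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            v = word_id[w]
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


# Hypothetical toy corpus standing in for segmented Zhihu answers.
docs = [
    "phone camera battery price".split(),
    "phone screen battery price camera".split(),
    "launch event sales launch price".split(),
    "sales launch event phone".split(),
    "technology future china brand".split(),
    "brand technology future company china".split(),
]

candidates = [2, 3, 4]  # candidate topic counts (the paper reports choosing 20)
scores = {}
for n_topics in candidates:
    theta, phi, vocab = lda_gibbs(docs, n_topics)
    scores[n_topics] = perplexity(docs, theta, phi, vocab)
best_k = min(scores, key=scores.get)  # lowest perplexity wins
print(scores, best_k)
```

In this sketch, `theta` plays the role of the document-topic relationship and `phi` the topic-term relationship mentioned in the abstract; sorting a row of `phi` in descending order would give the top terms used to describe a topic.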
Pages: 221-225
Number of pages: 5