Random forest based online topic detection using topic graph cluster

Cited by: 0
Authors
Chen, Qian [1, 2]
Gui, Zhiguo [1, 3, 4]
Guo, Xin [2 ]
Xiang, Yang [5 ]
Affiliations
[1] School of Information and Communication Engineering, North University of China, Taiyuan, Shanxi, China
[2] School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi, China
[3] Key Laboratory of Instrumentation Science and Dynamic Measurement, North University of China, Taiyuan, Shanxi, China
[4] National Key Laboratory for Electronic Measurement Technology, Taiyuan, Shanxi, China
[5] School of Electronics and Information Engineering, Tongji University, Shanghai, China
Source
Metallurgical and Mining Industry | 2015 / Vol. 7 / No. 9
Keywords
Decision trees; Semantics
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
We propose RF-OTD, an online topic detection approach that applies random forest to topic graph clusters, modeling each topic as a graph of terms and the edges among them. The topic graph structure can substantially enhance the semantic information hidden in the corpus and thus avoids the shortcoming of the bag-of-words representation. Random forest is used to simplify the online topic detection process considerably, yielding low time and space complexity. The approach can link two different corpora and investigate the linkage among their topics. Furthermore, RF-OTD can detect outliers and novelty in a text stream by computing the proximity of two topic graphs. Experimental results show that, compared with a baseline topic detection algorithm, the approach achieves better computational efficiency, consistency, and semantic interpretability.
Pages: 68-75