Random Walks on Adjacency Graphs for Mining Lexical Relations from Big Text Data

Cited by: 0
Authors
Jiang, Shan [1]
Zhai, ChengXiang [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Source
2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2014
Funding
National Science Foundation (USA);
Keywords
NETWORKS;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Lexical relations, or semantic relations of words, are useful knowledge fundamental to all applications since they help to capture inherent semantic variations of vocabulary in human languages. Discovering such knowledge in a robust way from arbitrary text data is a significant challenge in big text data mining. In this paper, we propose a novel general probabilistic approach based on random walks on word adjacency graphs to systematically mine two fundamental and complementary lexical relations, i.e., paradigmatic and syntagmatic relations between words from arbitrary text data. We show that representing text data as an adjacency graph opens up many opportunities to define interesting random walks for mining lexical relation patterns, and propose specific random walk algorithms for mining paradigmatic and syntagmatic relations. Evaluation results on multiple corpora show that the proposed random walk-based algorithms can discover meaningful paradigmatic and syntagmatic relations of words from text data.
Pages: 549-554
Page count: 6