Matching Reviews to Database Objects Based on Labeled Latent Dirichlet Allocation Model

Cited by: 1
Authors
Zhu, Yumin [1]
Li, Qingzhong [1]
Affiliation
[1] Shandong Univ, Sch Comp Sci & Technol, Jinan 250100, Peoples R China
Source
2013 10TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA 2013) | 2013
Keywords
Latent Dirichlet Allocation; Gibbs sampling; review matching; data integration
DOI
10.1109/WISA.2013.18
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We develop a method for matching unstructured reviews to database objects in data integration, where each object has a set of attributes. To this end, we propose a Labeled Latent Dirichlet Allocation model. We model reviews as if they were generated by a two-stage stochastic process: each review is represented by a probability distribution over attributes, and each attribute is represented by a probability distribution over the words for that attribute. We introduce a label for each attribute, which lets the model integrate object information. We estimate the model parameters in an unsupervised manner and use the model to find, given a review, the object most likely to be the topic of that review. Experiments in multiple domains show that our method is superior to the TFIDF method as well as a recent RLM method for the review matching problem.
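The two-stage view in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model or estimation procedure: there is no Gibbs sampling here, and the attribute weights `theta`, the per-object word distributions, the smoothing constant, and the function names `review_log_likelihood` and `best_object` are all hypothetical, hand-set values chosen for illustration. The sketch only shows how, once such distributions are available, a review could be scored against each object and matched to the most likely one.

```python
import math

def review_log_likelihood(words, theta, phi, smoothing=1e-6):
    """Log-likelihood of a review under a simple two-stage mixture.

    Each word is assumed to be produced by first picking an attribute a
    with probability theta[a], then picking a word from phi[a].
    `smoothing` (an assumption of this sketch) keeps unseen words from
    producing log(0).
    """
    total = 0.0
    for w in words:
        p = sum(theta[a] * phi[a].get(w, 0.0) for a in theta)
        total += math.log(p + smoothing)
    return total

def best_object(words, objects, theta):
    """Return the name of the object whose distributions best explain the review."""
    return max(
        objects,
        key=lambda name: review_log_likelihood(words, theta, objects[name]),
    )

# Toy, hand-set distributions (hypothetical; a real system would estimate them).
theta = {"name": 0.5, "cuisine": 0.3, "city": 0.2}
objects = {
    "luigis": {
        "name": {"luigis": 1.0},
        "cuisine": {"italian": 0.5, "pasta": 0.5},
        "city": {"boston": 1.0},
    },
    "sakura": {
        "name": {"sakura": 1.0},
        "cuisine": {"japanese": 0.5, "sushi": 0.5},
        "city": {"boston": 1.0},
    },
}

review = ["great", "pasta", "in", "boston"]
match = best_object(review, objects, theta)
print(match)
```

In this toy setup both objects share the city word, so the match is decided by the cuisine word alone; words that no attribute explains contribute the same smoothed penalty to every object and do not change the ranking.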
Pages: 48 / +
Page count: 2