Using cocitation information to estimate political orientation in web documents

Cited by: 13
Authors
Efron, M [1]
Affiliation
[1] Sch Informat, Austin, TX 78712 USA
Keywords
opinion mining; style analysis; document classification; political orientation
DOI
10.1007/s10115-005-0214-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a simple method for estimating cultural orientation, the affiliation of online entities in a polarized field of discourse. In particular, cocitation information is used to estimate the political orientation of hypertext documents. As a type of cultural orientation, the political orientation of a document is the degree to which it participates in traditionally left- or right-wing beliefs. Estimating documents' political orientation is of interest for personalized information retrieval and recommender systems. In its application to politics, the method uses a simple probabilistic model to estimate the strength of association between a document and left- and right-wing communities. The model estimates the likelihood of cocitation between a document of interest and a small number of documents of known orientation. The model is tested on three sets of data: 695 partisan web documents, 162 political weblogs, and 198 nonpartisan documents. The cocitation model achieves accuracy above 90%, outperforming lexically based classifiers at statistically significant levels.
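The abstract does not give the model's formula, so the following is only a minimal sketch of one way a cocitation-based orientation score could be computed, not the paper's actual method. It assumes a smoothed log-odds ratio over cocitation counts; the function names, the count arguments, the smoothing constant, and the zero threshold are all hypothetical choices made for illustration.

```python
import math


def orientation_score(cocite_left, cocite_right, cites_left, cites_right,
                      smoothing=0.5):
    """Smoothed log-odds of cocitation with right- vs. left-wing exemplars.

    Hypothetical inputs (not taken from the paper):
      cocite_left / cocite_right: pages citing both the document of
          interest and a left / right exemplar of known orientation.
      cites_left / cites_right: pages citing the left / right exemplars
          at all.
    Positive scores lean right, negative scores lean left.
    """
    # Additive smoothing keeps both ratios finite when a count is zero.
    p_right = (cocite_right + smoothing) / (cites_right + 2 * smoothing)
    p_left = (cocite_left + smoothing) / (cites_left + 2 * smoothing)
    return math.log(p_right / p_left)


def classify(score, threshold=0.0):
    """Map a score to a label; the zero threshold is an assumption."""
    return "right" if score > threshold else "left"
```

With these assumptions, a document cocited far more often with left-wing exemplars than with right-wing ones receives a negative score, and equal counts on both sides give a score of exactly zero.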
Pages: 492-511
Page count: 20