Using cocitation information to estimate political orientation in web documents

Cited by: 13
Authors
Efron, M [1]
Affiliation
[1] Sch Informat, Austin, TX 78712 USA
Keywords
opinion mining; style analysis; document classification; political orientation
DOI
10.1007/s10115-005-0214-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces a simple method for estimating cultural orientation, the affiliation of online entities in a polarized field of discourse. In particular, cocitation information is used to estimate the political orientation of hypertext documents. A type of cultural orientation, the political orientation of a document is the degree to which it participates in traditionally left- or right-wing beliefs. Estimating documents' political orientation is of interest for personalized information retrieval and recommender systems. In its application to politics, the method uses a simple probabilistic model to estimate the strength of association between a document and left- and right-wing communities. The model estimates the likelihood of cocitation between a document of interest and a small number of documents of known orientation. The model is tested on three sets of data: 695 partisan web documents, 162 political weblogs, and 198 nonpartisan documents. Accuracy above 90% is obtained from the cocitation model, outperforming lexically based classifiers at statistically significant levels.
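The abstract does not give the model's exact form, so the following is only a minimal sketch of one way a cocitation-based orientation score could be computed, not the paper's implementation. It assumes a smoothed log-odds ratio between a document's cocitation rate with right-wing seed documents and its rate with left-wing seed documents. The function names, the count inputs, and the smoothing constant are all hypothetical.

```python
import math


def orientation_score(cocite_left, cocite_right, total_left, total_right,
                      smoothing=0.5):
    """Smoothed log-odds of cocitation with right vs. left seed documents.

    cocite_left / cocite_right: number of pages that link both to the
        document of interest and to a left / right seed document.
    total_left / total_right: number of pages that link to a left / right
        seed document at all.
    All four inputs, and the additive smoothing, are assumptions made for
    this illustration; they are not taken from the paper.
    """
    # Additive smoothing keeps both rates strictly between 0 and 1.
    p_left = (cocite_left + smoothing) / (total_left + 2 * smoothing)
    p_right = (cocite_right + smoothing) / (total_right + 2 * smoothing)
    # Positive values mean stronger association with the right seeds.
    return math.log(p_right / p_left)


def classify(score):
    """Map a score to a label; a zero score is left undetermined."""
    if score > 0:
        return "right"
    if score < 0:
        return "left"
    return "undetermined"


# Hypothetical counts: cocited 40 times with left seeds, 2 with right seeds.
score = orientation_score(40, 2, 1000, 1000)
label = classify(score)
```

In this sketch the sign of the score gives the estimated orientation and its magnitude gives the strength of association. A nonpartisan document would be expected to score near zero.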
Pages: 492-511
Number of pages: 20