Understanding political polarization using language models: A dataset and method

Cited by: 3
Authors
Gode, Samiran [1]
Bare, Supreeth [1]
Raj, Bhiksha [1, 2]
Yoo, Hyungon [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Keywords
Computational linguistics
DOI
10.1002/aaai.12104
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Our paper aims to analyze political polarization in the US political system using language models, and thereby help candidates make an informed decision. The availability of this information will help voters understand their candidates' views on the economy, healthcare, education, and other social issues. Our main contributions are a dataset extracted from Wikipedia that spans the past 120 years and a language model-based method that helps analyze how polarized a candidate is. Our data are divided into two parts, background information and political information about a candidate, since our hypothesis is that the political views of a candidate should be based on reason and be independent of factors such as birthplace and alma mater. We further split these data chronologically into four phases to help understand if and how the polarization among candidates changes. These data have been cleaned to remove biases. To understand the polarization, we begin by showing results from two classical language models, Word2Vec and Doc2Vec. We then use more powerful techniques such as the Longformer, a transformer-based encoder, to assimilate more information and find the nearest neighbors of each candidate based on their political views and their background. The code and data for the project will be available here: ""
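The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words count vector stands in for the Word2Vec, Doc2Vec, or Longformer embeddings, and the candidate names and texts are hypothetical toy data. It only shows how neighbors computed from political text can differ from neighbors computed from background text.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(docs, k=1):
    """Map each candidate to its k most similar other candidates."""
    vecs = {name: embed(text) for name, text in docs.items()}
    result = {}
    for name, vec in vecs.items():
        scored = [
            (cosine(vec, other_vec), other_name)
            for other_name, other_vec in vecs.items()
            if other_name != name
        ]
        # Highest similarity first; ties broken alphabetically by name.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[name] = [other_name for _, other_name in scored[:k]]
    return result


# Hypothetical toy data, split into the two parts the abstract describes.
political = {
    "candidate_a": "lower taxes smaller government",
    "candidate_b": "lower taxes strong defense",
    "candidate_c": "public healthcare education funding",
}
background = {
    "candidate_a": "born ohio studied law",
    "candidate_b": "born texas studied business",
    "candidate_c": "born ohio studied law",
}

political_nn = nearest_neighbors(political)
background_nn = nearest_neighbors(background)
```

In this toy data, `candidate_a` is closest to `candidate_b` on political text but closest to `candidate_c` on background text, which is the kind of contrast the two-part split is meant to expose.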
Pages: 248-254
Number of pages: 7