Stance detection from text using semi-automatic corpus expansion

Cited by: 0
Authors
Pereira, Camila Farias Pena [1]
Paraboni, Ivandre [1]
Affiliations
[1] Univ Sao Paulo, Escola Artes Ciencias & Human, Sao Paulo, Brazil
Source
LINGUAMATICA | 2024 / Vol. 16 / No. 02
Keywords
stance detection; corpus expansion;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
Computational stance detection (the task of determining, given an input text, the attitude, e.g., for or against, towards a particular target topic) usually relies on annotated corpora as training data. Since possible topics are in principle unlimited, so is the need for new labelled datasets for every topic of interest. To overcome some of these challenges, the present work adapts to the stance prediction task an existing corpus expansion method originally devised for sentiment analysis. The method is applied to a large (46K instances) Brazilian Portuguese corpus conveying stances towards six target topics of a moral and/or political nature, achieving a substantial increase in the number of labelled instances. Results from both automatic and human evaluation suggest that adding semi-automatically labelled data to the corpus does not decrease accuracy, and that the majority of these labels are correct.
Pages: 25 / 25
Number of pages: 1