PolyHope: Two-level hope speech detection from tweets

Cited by: 17

Authors:

Balouchzahi, Fazlourrahman ^{[1]}

Sidorov, Grigori ^{[1]}

Gelbukh, Alexander ^{[1]}

Affiliations:

[1] Inst Politecn Nacl IPN, Ctr Invest Comp CIC, Mexico City, Mexico

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2023 / Vol. 225

Keywords:

Hope; Wish; Desire; Expectation; Machine learning; Deep learning; Transformers; Natural Language Processing; OPTIMISM; NARRATIVES; ILLNESS;

DOI:

10.1016/j.eswa.2023.120078

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Hope is characterized as openness of spirit towards the future: a desire, expectation, and wish for something to happen or to be true that remarkably affects a person's state of mind, emotions, behaviors, and decisions. Hope is usually associated with concepts of desired expectations and possibility/probability concerning the future. Despite its importance, hope has rarely been studied as a social media analysis task. This paper presents a hope speech dataset that classifies each tweet first into "Hope" and "Not Hope", then into three fine-grained hope categories: "Generalized Hope", "Realistic Hope", and "Unrealistic Hope" (along with "Not Hope"). English tweets from the first half of 2022 were collected to build this dataset. Furthermore, we describe our annotation process and guidelines in detail and discuss the challenges of classifying hope and the limitations of the existing hope speech detection corpora. In addition, we report several baselines based on different learning approaches, such as traditional machine learning, deep learning, and transformers, to benchmark our dataset. We evaluate our baselines using weighted-average and macro-average F1-scores. Observations show that a strict process for annotator selection and detailed annotation guidelines enhanced the dataset's quality. This strict annotation process yielded promising performance for simple machine learning classifiers with only unigrams; however, binary and multiclass hope speech detection results reveal that contextual embedding models perform better on this dataset.
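The two-level labelling and the two F1 averages mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`to_binary`, `per_class_f1`, `macro_f1`, `weighted_f1`) and the eight example labels are hypothetical and invented for illustration; only the four category names come from the abstract. The macro average gives every class equal weight, while the weighted average scales each class's F1 by its number of true instances, so the two can differ noticeably when the classes are imbalanced.

```python
from collections import Counter

# Category names taken from the abstract; everything else here is illustrative.
FINE_LABELS = ["Not Hope", "Generalized Hope", "Realistic Hope", "Unrealistic Hope"]
BINARY_LABELS = ["Not Hope", "Hope"]


def to_binary(label):
    """Collapse a fine-grained label to the first-level Hope / Not Hope label."""
    return "Not Hope" if label == "Not Hope" else "Hope"


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}; a class absent from both lists gets F1 = 0.0."""
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


def weighted_f1(y_true, y_pred, labels):
    """Mean of per-class F1 weighted by support (count of true instances).

    Assumes every label in y_true is listed in `labels`.
    """
    scores = per_class_f1(y_true, y_pred, labels)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in labels) / len(y_true)


# Made-up gold labels and predictions for eight tweets (not from the dataset).
y_true = ["Not Hope"] * 4 + ["Generalized Hope"] * 2 + ["Realistic Hope", "Unrealistic Hope"]
y_pred = ["Not Hope"] * 3 + ["Generalized Hope"] * 3 + ["Realistic Hope", "Not Hope"]

# Second level: four-way scores.
fine_macro = macro_f1(y_true, y_pred, FINE_LABELS)
fine_weighted = weighted_f1(y_true, y_pred, FINE_LABELS)

# First level: the same tweets scored as Hope vs. Not Hope.
bin_true = [to_binary(label) for label in y_true]
bin_pred = [to_binary(label) for label in y_pred]
bin_macro = macro_f1(bin_true, bin_pred, BINARY_LABELS)

print(f"fine-grained macro F1:    {fine_macro:.4f}")
print(f"fine-grained weighted F1: {fine_weighted:.4f}")
print(f"binary macro F1:          {bin_macro:.4f}")
```

In this toy example the single missed "Unrealistic Hope" tweet pulls the macro average down more than the weighted one, which is the kind of gap that motivates reporting both.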


Pages: 13

