Cyberbullying Detection using BERT for Telugu Language

Cited by: 0

Authors:

Talasila, Sri Lakshmi [1]

Kothuri, Dharani Priya [1]

Manchiraju, Savithri Jahnavi [1]

Mallavalli, Mutyala Sai Sasank [1]

Dande, Lourdu Gnana Harshith [1]

Affiliations:

[1] Prasad V Potluri Siddhartha Inst Technol, Comp Sci & Engn, Vijayawada, India

Source:

2024 4TH INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2024 | 2024

Keywords:

Cyberbullying; Telugu; Bidirectional Encoder Representations from Transformers (BERT); Bullying Preprocessing; Harassment; Language; Social Media;

DOI:

10.1109/ICPCSN62568.2024.00077

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The rapid proliferation of online communication has made cyberbullying a significant concern affecting individuals' well-being. Existing research detects cyberbullying in mixed-language and bilingual text using techniques such as TF-IDF and XLM-RoBERTa and machine learning algorithms such as Logistic Regression, Random Forest, and Naive Bayes. However, these approaches often struggle with accuracy and fail to identify cyberbullying reliably because of language nuances and misinterpreted context. Key challenges faced by previous systems include limited linguistic coverage, weak contextual understanding, and difficulty interpreting nuanced cyberbullying. To address these challenges, this work adopts the BERT (Bidirectional Encoder Representations from Transformers) architecture, whose bidirectional context modeling captures subtle linguistic nuances and contextual cues and thereby improves accuracy. The proposed model goes further by integrating specialized models such as IndicBERT, which is specifically tailored for languages like Telugu. By focusing on contextual nuances, our model aims to improve the precision and accuracy of cyberbullying detection for Telugu content. This study developed a Telugu dataset comprising 27,000 sentences and achieved an accuracy of 90%, highlighting the efficacy of our approach in overcoming these challenges and contributing to online safety.
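
The abstract contrasts bag-of-words features such as TF-IDF with context-aware models such as BERT. As a minimal sketch of that baseline, and not the authors' implementation, the pure-Python snippet below computes TF-IDF weights. The function name `tf_idf`, the whitespace tokenization, the unsmoothed `log(N / df)` weighting, and the English placeholder sentences are all assumptions made for illustration; none of them come from the paper or its Telugu dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per whitespace-tokenized document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency times (unsmoothed) inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical placeholder sentences, not taken from the paper's dataset.
docs = [
    "you are a loser",
    "you are a star",
    "nobody likes a loser",
]
weights = tf_idf(docs)

# Word order is ignored: a reordered sentence gets identical weights.
reordered = tf_idf(["a loser you are", "you are a star", "nobody likes a loser"])
```

In this sketch `weights[0]` and `reordered[0]` are the same dictionary, because the features depend only on word counts. This is one way to read the context limitation the abstract attributes to earlier approaches, and it is the gap a bidirectional encoder is meant to close.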


Pages: 454-461

Page count: 8

