Distinguishing Human From Machine: A Review of Advances and Challenges in AI-Generated Text Detection

Cited by: 0
Authors
Fariello, Serena [1]
Fenza, Giuseppe [1]
Forte, Flavia [1]
Gallo, Mariacristina [1]
Marotta, Martina [1]
Affiliations
[1] Univ Salerno, Dept Management & Innovat Syst, I-84084 Fisciano, SA, Italy
Keywords
Generated-Text Detection; AI-Detection; Large Language Models (LLMs); Literature Review; Survey; CARE
DOI
10.9781/ijimai.2024.12.002
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rise of Large Language Models (LLMs) has dramatically altered the generation and spread of textual content. This advancement offers benefits in various domains, including medicine, education, law, coding, and journalism, but also has negative implications, mainly related to ethical concerns. Measures to mitigate these implications depend on solutions that distinguish machine-generated text from human-written text. This study provides a comprehensive review of the existing literature on detecting LLM-generated texts. Emerging techniques are grouped into five categories: watermarking, feature-based, neural-based, hybrid, and human-aided methods. For each category, strengths and limitations are discussed, providing insights into its effectiveness and potential for future improvement. Moreover, available datasets and tools are introduced. Results demonstrate that, despite good performance in delimited settings, the multitude of languages to recognize, hybrid texts, the continuous improvement of text-generation algorithms, and the lack of regulation require additional efforts for efficient detection.
Pages: 6 / 18
Page count: 191