Classification of Human and Machine-Generated Texts Using Lexical Features and Supervised/Unsupervised Machine Learning Algorithms

Cited by: 0
Authors
Rojas-Simon, Jonathan [1]
Ledeneva, Yulia [1]
Arnulfo Garcia-Hernandez, Rene [1]
Affiliations
[1] Autonomous Univ State Mexico, Inst Literario 100, Toluca 50000, State Of Mexico, Mexico
Keywords
Large-Language Models (LLMs); AuTexTification; Lexical Features; Supervised/Unsupervised Learning Algorithms; Text representation models
DOI
10.1007/978-3-031-62836-8_31
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In today's digital information era, distinguishing between human- and machine-generated texts has become a focus of study in academia and industry. Large-Language Models (LLMs) can now produce high-quality texts, which challenges the legitimacy and authenticity of written content. It is therefore essential to create methods and models that can determine whether a human or an LLM wrote a text. This paper explores the effectiveness of supervised and unsupervised machine learning algorithms that use lexical features. We focus mainly on traditional algorithms: Multilayer Perceptron (MLP), Naive Bayes (NB), Logistic Regression (LR), Agglomerative Hierarchical Clustering (AHC), and K-means Clustering (KC). The results are compared with the state-of-the-art approaches presented in the Automated Text Identification (AuTexTification) shared task, which serve as reference methods. We find that both NB and KC may achieve competitive results on this task.
Pages: 331-341
Page count: 11