Decoding the AI's Gaze: Unraveling ChatGPT's Evaluation of Poetic Creativity

Cited by: 0
Authors
Fischer, Nina [1 ]
Dischinger, Emma [1 ]
Gunser, Vivian Emily [1 ]
Affiliation
[1] Leibniz Inst Wissensmedien, Schleichstr 6, D-72076 Tübingen, Germany
Source
HCI INTERNATIONAL 2024 POSTERS, PT VII, HCII 2024 | 2024 / Vol. 2120
Keywords
Turing Test; ChatGPT; Linguistic Markers; Heuristics
DOI
10.1007/978-3-031-62110-9_19
Chinese Library Classification (CLC) code
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As artificial intelligence (AI) technology advances, it becomes increasingly challenging to distinguish between human-written and AI-generated poetry. In this exploratory study, we take a novel approach to the Turing Test, traditionally used to evaluate a machine's ability to exhibit human-like intelligence: in a Reverse Turing Test, the AI (ChatGPT, GPT-4) serves as the evaluator and classifies poems as either human-written or AI-generated. The AI analyzed 18 poems, half of which were in their original form (authored by classic poets) and half of which contained AI-generated continuations created with a pretrained GPT-3 model. Despite ChatGPT's extensive training on a vast textual database, its classification performance on human- and AI-generated poems did not surpass random guessing. This raises questions about the AI's ability to distinguish accurately between the two, particularly for original texts by renowned classic authors. The qualitative analysis reveals significant disparities in the evaluation of human-classified versus AI-classified poems, demonstrating a bias in which the same markers, such as consistency, coherence, and fluency, are viewed positively in poems classified as human-written but negatively in poems classified as AI-generated. The results indicate a consistent pattern in the assessment criteria for both human- and AI-classified poems that often neglects the poems' actual textual characteristics. This study not only highlights the current challenges in differentiating between human- and AI-generated poetry but also provides insights into the cues and heuristics AI uses for such classifications.
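The abstract's claim that classification performance "did not surpass random guessing" can be illustrated with an exact binomial test against a 50% chance baseline. The following Python sketch is not from the paper: the number of correct classifications is a hypothetical value, and the use of scipy.stats.binomtest is an illustrative assumption about how such a comparison could be run.

```python
# Illustrative sketch (not the authors' code): testing whether a binary
# classifier's accuracy on 18 poems exceeds chance level (p = 0.5).
from scipy.stats import binomtest

n_poems = 18     # 9 original poems + 9 poems with GPT-3 continuations
n_correct = 10   # hypothetical number of correct classifications by the evaluator

# One-sided exact binomial test: is accuracy greater than chance?
result = binomtest(n_correct, n_poems, p=0.5, alternative="greater")
print(f"accuracy = {n_correct / n_poems:.2f}, p = {result.pvalue:.3f}")
```

With n = 18 and a one-sided 5% significance level, at least 13 correct classifications (about 72% accuracy) would be needed to conclude that the evaluator performs better than random guessing; anything below that threshold is consistent with chance-level performance.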
Pages: 186-197
Number of pages: 12