ChatGPT-4 in the Turing Test

Cited by: 0

Author:

Echavarria, Ricardo Restrepo [1]

Affiliation:

[1] Univ Tecn Manabi, Dept Ciencias Sociales & Comportamiento, Portoviejo, Ecuador

Source:

MINDS AND MACHINES | 2025 / Vol. 35 / Issue 01

Keywords:

Turing test; ChatGPT; Artificial intelligence; Science; Thinking; Intelligence; COMPUTERS;

DOI:

10.1007/s11023-025-09711-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

There has been considerable optimistic speculation on how well ChatGPT-4 would perform in a Turing Test. However, no minimally serious implementation of the test has been reported to have been carried out. This brief note documents the results of subjecting ChatGPT-4 to 10 Turing Tests, with different interrogators and participants. The outcome is tremendously disappointing for the optimists. Despite ChatGPT reportedly outperforming 99.9% of humans in a Verbal IQ test, it falls short of passing the Turing Test. In 9 out of the 10 tests conducted, the interrogators successfully identified ChatGPT-4 and the human participant. The probability of obtaining this result from a process in which the interrogator is really no better than chance at correct identification is calculated to be less than 1%. An additional question was posed to the interrogators at the end of each test: What led them to distinguish between the human and the machine? The interrogators, who effectively filtered out ChatGPT-4 from passing the Turing Test for intelligence, stated that they could identify the machine because it, in effect, responded more intelligently than the human. Subsequently, ChatGPT-4 was tasked with differentiating syntax from semantics and self-corrected when falling for the fallacy of equivocation. The curious situation is arrived at that passing the Turing Test for intelligence remains a challenge that ChatGPT-4 has yet to overcome, precisely because, as per the interrogators, its intellectual abilities surpass those of individual humans.
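The abstract reports a chance probability of less than 1% for 9 correct identifications in 10 tests but does not say how the figure was computed. The sketch below is one plausible reading, assuming each test is an independent fair coin flip (binomial with p = 0.5); the function names are illustrative and not taken from the paper. Under that assumption, exactly 9 of 10 correct has probability 10/1024, which is just under 1%, while 9 or more correct has probability 11/1024, which is slightly above 1%.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Assumption: an interrogator at chance level identifies correctly with p = 0.5.
exactly_nine = binomial_pmf(9, 10)
nine_or_more = binomial_upper_tail(9, 10)

print(f"P(exactly 9 of 10 correct)  = {exactly_nine:.4f}")   # 10/1024
print(f"P(at least 9 of 10 correct) = {nine_or_more:.4f}")   # 11/1024
```

The "less than 1%" figure matches the exactly-9 calculation under these assumptions; a one-sided test would normally use the upper tail instead.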


Pages: 10
