Can a Conversational Agent Pass Theory-of-Mind Tasks? A Case Study of ChatGPT with the Hinting, False Beliefs, and Strange Stories Paradigms

Cited by: 1
Authors
Brunet-Gouet, Eric [1, 2]
Vidal, Nathan [2]
Roux, Paul [1, 2]
Affiliations
[1] Ctr Hosp Versailles, Serv Hosp Univ Psychiat Adultes & Addictol, Le Chesnay, France
[2] Univ Versailles St Quentin En Yvelines, Univ Paris Saclay, INSERM, DisAP,DevPsy,CESP,UMR1018, F-94807 Villejuif, France
Source
HUMAN AND ARTIFICIAL RATIONALITIES, HAR 2023 | 2024 / Vol. 14522
Keywords
large language model; ChatGPT; theory-of-mind; indirect speech; False Beliefs; SCHIZOPHRENIA; SYMPTOMATOLOGY; COMMUNICATION; PEOPLE;
DOI
10.1007/978-3-031-55245-8_7
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We investigate whether OpenAI's recently introduced ChatGPT conversational agent can be examined with classical theory-of-mind paradigms. We used an indirect speech understanding task (the Hinting task), a new text version of a False Belief/False Photographs paradigm, and the Strange Stories paradigm. The Hinting task is usually used to assess individuals with autism or schizophrenia by asking them to infer hidden intentions from short conversations involving two characters. In a first experiment, ChatGPT 3.5 exhibited quite limited performance on the Hinting task when either the original scoring or revised rating scales were used. We introduced slightly modified versions of the Hinting task in which either cues about the presence of a communicative intention were added or a specific question about the character's intentions was asked. Only the latter yielded enhanced performance. No dissociation between the conditions was found. The Strange Stories were associated with correct performance, but we could not be sure that the algorithm had no prior knowledge of the test. In the second experiment, the most recent version of ChatGPT (4-0314) exhibited better performance on the Hinting task, although it did not match the average scores of healthy subjects. In addition, the model could solve first- and second-order False Belief tests but failed on items that referred to a physical property such as object visibility or that involved more complex inferences. This work illustrates the possible application of psychological constructs and paradigms to a conversational agent of a radically new nature.
Pages: 107-126
Page count: 20