Mind meets machine: Unravelling GPT-4’s cognitive psychology

Cited by: 0
Authors
Dhingra S. [1,2]
Singh M. [3]
S.B. V. [3]
Malviya N. [4]
Gill S.S. [5]
Affiliations
[1] Department of Psychology, Nowrosjee Wadia College, Pune
[2] Jindal Institute of Behavioural Sciences, O.P. Jindal Global University, Delhi-NCR
[3] Indian Institute of Tropical Meteorology, Pune
[4] Defence Institute of Advanced Technology, Pune
[5] School of Electronic Engineering and Computer Science, Queen Mary University of London, London
Source
BenchCouncil Transactions on Benchmarks, Standards and Evaluations | 2023 / Vol. 3 / Issue 3
Keywords
Artificial intelligence; ChatGPT; Cognitive psychology; GPT; Large language models
DOI
10.1016/j.tbench.2023.100139
Abstract
Cognitive psychology is concerned with understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large Language Models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks. The recent development of Generative Pre-trained Transformer 4 (GPT-4), and its demonstrated success on tasks that are complex for humans, such as examinations and challenging problems, has led to increased confidence in LLMs as instruments of intelligence. Although the GPT-4 technical report demonstrates performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4 on existing, well-established datasets is still required. In this study, we evaluate GPT-4’s performance on a set of cognitive psychology datasets: CommonsenseQA, SuperGLUE, MATH, and HANS. In doing so, we examine how GPT-4 processes and integrates cognitive psychology tasks with contextual information, providing insight into the underlying processes that enable it to generate its responses. We show that GPT-4 achieves high accuracy on cognitive psychology tasks relative to prior state-of-the-art models. Our results strengthen the existing assessments of, and confidence in, GPT-4’s cognitive psychology abilities. It has significant potential to revolutionise the field of Artificial Intelligence (AI) by enabling machines to bridge the gap between human and machine reasoning. © 2023 The Authors
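As a concrete illustration of the kind of evaluation described in the abstract, the minimal sketch below runs a zero-shot multiple-choice query against a GPT-4-class model on CommonsenseQA validation items. It is not the authors' exact protocol: the model name "gpt-4", the prompt wording, the answer parsing, and the 50-item slice are illustrative assumptions. It assumes the OpenAI Python client (v1+) with an OPENAI_API_KEY set, the Hugging Face datasets package, and the field names of the commonsense_qa dataset on the Hub.

from datasets import load_dataset
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_gpt4(question, labels, texts):
    """Prompt the model with a multiple-choice item and return the letter it picks."""
    options = "\n".join(f"{l}. {t}" for l, t in zip(labels, texts))
    prompt = f"{question}\n{options}\nAnswer with the letter of the best option only."
    resp = client.chat.completions.create(
        model="gpt-4",  # assumed model name; swap for the deployment being evaluated
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()[:1].upper()

# Score a small slice of the validation split.
data = load_dataset("commonsense_qa", split="validation[:50]")
correct = sum(
    ask_gpt4(ex["question"], ex["choices"]["label"], ex["choices"]["text"]) == ex["answerKey"]
    for ex in data
)
print(f"CommonsenseQA accuracy on {len(data)} items: {correct / len(data):.2%}")

The same loop pattern extends to the other benchmarks named in the abstract by swapping in their prompts and answer formats.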