Mind meets machine: Unravelling GPT-4’s cognitive psychology

Cited by: 0
Authors
Dhingra S. [1,2]
Singh M. [3]
S.B. V. [3]
Malviya N. [4]
Gill S.S. [5]
Affiliations
[1] Department of Psychology, Nowrosjee Wadia College, Pune
[2] Jindal Institute of Behavioural Sciences, O.P. Jindal Global University, Delhi-NCR
[3] Indian Institute of Tropical Meteorology, Pune
[4] Defence Institute of Advanced Technology, Pune
[5] School of Electronic Engineering and Computer Science, Queen Mary University of London, London
Source
BenchCouncil Transactions on Benchmarks, Standards and Evaluations | 2023 / Vol. 3 / Issue 3
Keywords
Artificial intelligence; ChatGPT; Cognitive psychology; GPT; Large language models
DOI
10.1016/j.tbench.2023.100139
Abstract
Cognitive psychology is concerned with understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large Language Models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks. The recent development of Generative Pre-trained Transformer 4 (GPT-4), and its demonstrated success on tasks that are complex for humans, such as examinations and challenging problems, has led to increased confidence in LLMs as instruments of intelligence. Although the GPT-4 technical report demonstrates performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4 on existing, well-established datasets is still required. In this study, we evaluate GPT-4’s performance on a set of cognitive psychology datasets: CommonsenseQA, SuperGLUE, MATH, and HANS. In doing so, we examine how GPT-4 processes and integrates cognitive psychology tasks with contextual information, providing insight into the underlying processes that enable it to generate its responses. We show that GPT-4 achieves high accuracy on cognitive psychology tasks relative to prior state-of-the-art models. Our results strengthen the existing assessments of, and confidence in, GPT-4’s cognitive psychology abilities. It has significant potential to revolutionise the field of Artificial Intelligence (AI) by enabling machines to bridge the gap between human and machine reasoning. © 2023 The Authors
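As a concrete illustration of the kind of evaluation described in the abstract, the minimal sketch below runs a zero-shot multiple-choice query against a GPT-4-class model on CommonsenseQA validation items. It is not the authors' exact protocol: the model name "gpt-4", the prompt wording, the answer parsing, and the 50-item slice are illustrative assumptions. It assumes the OpenAI Python client (v1+) with an OPENAI_API_KEY set, the Hugging Face datasets package, and the field names of the commonsense_qa dataset on the Hub.

from datasets import load_dataset
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_gpt4(question, labels, texts):
    """Prompt the model with a multiple-choice item and return the letter it picks."""
    options = "\n".join(f"{l}. {t}" for l, t in zip(labels, texts))
    prompt = f"{question}\n{options}\nAnswer with the letter of the best option only."
    resp = client.chat.completions.create(
        model="gpt-4",  # assumed model name; swap for the deployment being evaluated
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()[:1].upper()

# Score a small slice of the validation split.
data = load_dataset("commonsense_qa", split="validation[:50]")
correct = sum(
    ask_gpt4(ex["question"], ex["choices"]["label"], ex["choices"]["text"]) == ex["answerKey"]
    for ex in data
)
print(f"CommonsenseQA accuracy on {len(data)} items: {correct / len(data):.2%}")

The same loop pattern extends to the other benchmarks named in the abstract by swapping in their prompts and answer formats.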