GPT-4 passes the bar exam

Cited by: 85
Authors
Katz, Daniel Martin [1, 2, 3, 4]
Bommarito, Michael James [1, 2, 3, 4]
Gao, Shang [5]
Arredondo, Pablo [2, 5]
Affiliations
[1] Chicago Kent Coll Law, Illinois Tech, Chicago, IL 60661 USA
[2] Stanford Ctr Legal Informat, CodeX, Stanford, CA USA
[3] Bucerius Law Sch, Hamburg, Germany
[4] 273 Ventures LLC, Woburn, MA USA
[5] Casetext Inc, Herndon, VA USA
Source
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES | 2024 / Vol. 382 / Issue 2270
Keywords
large language models; Bar Exam; GPT-4; legal services; legal complexity; legal language; LAW;
DOI
10.1098/rsta.2023.0254
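Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes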
07 ; 0710 ; 09 ;
Abstract
In this paper, we experimentally evaluate the zero-shot performance of GPT-4 against prior generations of GPT on the entire uniform bar examination (UBE), including not only the multiple-choice multistate bar examination (MBE), but also the open-ended multistate essay exam (MEE) and multistate performance test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average of 4.2/6.0 when compared with much lower scores for ChatGPT. Graded across the UBE components, in the manner in which a human test-taker would be, GPT-4 scores approximately 297 points, significantly in excess of the passing threshold for all UBE jurisdictions. These findings document not just the rapid and remarkable advance of large language model performance generally, but also the potential for such models to support the delivery of legal services in society. This article is part of the theme issue 'A complexity science approach to law and governance'.
Pages: 17
Related papers
50 in total
  • [21] GPT-4 Turbo with Vision fails to outperform text-only GPT-4 Turbo in the Japan Diagnostic Radiology Board Examination
    Hirano, Yuichiro
    Hanaoka, Shouhei
    Nakao, Takahiro
    Miki, Soichiro
    Kikuchi, Tomohiro
    Nakamura, Yuta
    Nomura, Yukihiro
    Yoshikawa, Takeharu
    Abe, Osamu
    JAPANESE JOURNAL OF RADIOLOGY, 2024, 42 (08) : 918 - 926
  • [22] Changes and Challenges Brought by GPT-4
    贵重
    李云翔
    王光涛
    Telecom Engineering Technics and Standardization (电信工程技术与标准化), 2023, 36 (04) : 17 - 19
  • [23] The Emotional Intelligence of the GPT-4 Large Language Model
    Vzorin, Gleb D.
    Bukinich, Alexey M.
    Sedykh, Anna V.
    Vetrova, Irina I.
    Sergienko, Elena A.
    PSYCHOLOGY IN RUSSIA-STATE OF THE ART, 2024, 17 (02) : 85 - 99
  • [24] Will ChatGPT/GPT-4 be a Lighthouse to Guide Spinal Surgeons?
    He, Yongbin
    Tang, Haifeng
    Wang, Dongxue
    Gu, Shuqin
    Ni, Guoxin
    Wu, Haiyang
    ANNALS OF BIOMEDICAL ENGINEERING, 2023, 51 (07) : 1362 - 1365
  • [25] Performance of Novel GPT-4 in Otolaryngology Knowledge Assessment
    Revercomb, Lucy
    Patel, Aman M.
    Fu, Daniel
    Filimonov, Andrey
    INDIAN JOURNAL OF OTOLARYNGOLOGY AND HEAD & NECK SURGERY, 2024, 76 (06) : 6112 - 6114
  • [26] ChatGPT and Patient Information in Nuclear Medicine: GPT-3.5 Versus GPT-4
    Currie, Geoff
    Robbie, Stephanie
    Tually, Peter
    JOURNAL OF NUCLEAR MEDICINE TECHNOLOGY, 2023, 51 (04) : 307 - 313
  • [27] An exploratory assessment of GPT-4o and GPT-4 performance on the Japanese National Dental Examination
    Morishita, Masaki
    Fukuda, Hikaru
    Yamaguchi, Shino
    Muraoka, Kosuke
    Nakamura, Taiji
    Hayashi, Masanari
    Yoshioka, Izumi
    Ono, Kentaro
    Awano, Shuji
    SAUDI DENTAL JOURNAL, 2024, 36 (12) : 1577 - 1581
  • [28] Assessing the Performance of GPT-3.5 and GPT-4 on the 2023 Japanese Nursing Examination
    Kaneda, Yudai
    Takahashi, Ryo
    Kaneda, Uiri
    Akashima, Shiori
    Okita, Haruna
    Misaki, Sadaya
    Yamashiro, Akimi
    Ozaki, Akihiko
    Tanimoto, Tetsuya
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (08)
  • [29] Evaluating the performance of GPT-3.5, GPT-4, and GPT-4o in the Chinese National Medical Licensing Examination
    Dingyuan Luo
    Mengke Liu
    Runyuan Yu
    Yulian Liu
    Wenjun Jiang
    Qi Fan
    Naifeng Kuang
    Qiang Gao
    Tao Yin
    Zuncheng Zheng
    Scientific Reports, 15 (1)
  • [30] Integrating AI in Lipedema Management: Assessing the Efficacy of GPT-4 as a Consultation Assistant
    Leypold, Tim
    Lingens, Lara F.
    Beier, Justus P.
    Boos, Anja M.
    LIFE-BASEL, 2024, 14 (05):