GPT-4 passes the bar exam

Cited by: 85

Authors:

Katz, Daniel Martin [1, 2, 3, 4]

Bommarito, Michael James [1, 2, 3, 4]

Gao, Shang [5]

Arredondo, Pablo [2, 5]

Affiliations:

[1] Chicago Kent Coll Law, Illinois Tech, Chicago, IL 60661 USA

[2] Stanford Ctr Legal Informat, CodeX, Stanford, CA USA

[3] Bucerius Law Sch, Hamburg, Germany

[4] 273 Ventures LLC, Woburn, MA USA

[5] Casetext Inc, Herndon, VA USA

Source:

PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES | 2024 / Vol. 382 / Issue 2270

Keywords:

large language models; Bar Exam; GPT-4; legal services; legal complexity; legal language; LAW;

DOI:

10.1098/rsta.2023.0254

Chinese Library Classification (CLC) codes:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

In this paper, we experimentally evaluate the zero-shot performance of GPT-4 against prior generations of GPT on the entire uniform bar examination (UBE), including not only the multiple-choice multistate bar examination (MBE), but also the open-ended multistate essay exam (MEE) and multistate performance test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average of 4.2/6.0 when compared with much lower scores for ChatGPT. Graded across the UBE components, in the manner in which a human test-taker would be, GPT-4 scores approximately 297 points, significantly in excess of the passing threshold for all UBE jurisdictions. These findings document not just the rapid and remarkable advance of large language model performance generally, but also the potential for such models to support the delivery of legal services in society. This article is part of the theme issue 'A complexity science approach to law and governance'.


Pages: 17
