LawBench: Benchmarking Legal Knowledge of Large Language Models

Cited by: 0

Authors:

Fei, Zhiwei ^{[1]}

Shen, Xiaoyu ^{[2]}

Zhu, Dawei ^{[3]}

Zhou, Fengzhe ^{[4]}

Han, Zhuo ^{[1]}

Huang, Alan ^{[5]}

Zhang, Songyang ^{[4]}

Chen, Kai ^{[4]}

Yin, Zhixin ^{[1]}

Shen, Zongwen ^{[1]}

Ge, Jidong ^{[1]}

Ng, Vincent ^{[6]}

Affiliations:

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China

[2] Eastern Inst Technol, Digital Twin Inst, Ningbo, Peoples R China

[3] Saarland Univ, Saarbrucken, Germany

[4] Shanghai AI Lab, Shanghai, Peoples R China

[5] Sch Sci & Engn Magnet, Dallas, TX USA

[6] Univ Texas Dallas, Dallas, TX USA

Source:

2024 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2024 | 2024

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present LawBench, the first evaluation benchmark composed of 20 tasks aimed to assess the ability of Large Language Models (LLMs) to perform Chinese legal-related tasks. LawBench is meticulously crafted to enable precise assessment of LLMs' legal capabilities from three cognitive levels that correspond to the widely accepted Bloom's cognitive taxonomy. Using LawBench, we present a comprehensive evaluation of 21 popular LLMs and the first comparative analysis of the empirical results in order to reveal their relative strengths and weaknesses. All data, model predictions and evaluation code are accessible from https://github.com/open-compass/LawBench.


Pages: 7933-7962

Page count: 30
