EQUALS: A Real-world Dataset for Legal Question Answering via Reading Chinese Laws

Cited by: 9
Authors
Chen, Andong [1]
Yao, Feng [2]
Zhao, Xinyan [1]
Zhang, Yating [1]
Sun, Changlong [1]
Liu, Yun [2]
Shen, Weixing [2]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
[2] Tsinghua University, Beijing, People's Republic of China
Source
PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND LAW, ICAIL 2023 | 2023
Keywords
Legal Dataset; Legal Question Answering; Question Answering Framework
DOI
10.1145/3594536.3595159
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Legal Question Answering (LQA) is a promising artificial intelligence application with high practical value. A professional and effective legal question answering (QA) agent can help reduce the workload of lawyers and judges and improve judicial accessibility. However, the NLP community lacks a large-scale, high-quality LQA dataset, which makes it difficult to develop practical data-driven LQA agents. To tackle this bottleneck, this work presents EQUALS, a well-annotated real-world dataset for lEgal QUestion Answering via reading Chinese LawS. EQUALS contains 6,914 {question, article, answer} triplets as well as a pool of law articles covering 10 different collections of Chinese laws. The questions and their corresponding answers are collected from a professional law consultation forum. More importantly, the exact spans of the law articles that serve as answers are annotated by senior law students, which ensures the quality and professionalism of EQUALS. Furthermore, this work proposes a QA framework that comprises a law article retrieval module and a machine reading comprehension module for extracting accurate answers from the law article. We conduct thorough experiments with representative baselines on EQUALS, and the results indicate that EQUALS is a challenging question answering task. To the best of our knowledge, EQUALS is the largest real-world LQA dataset, and it should significantly promote research on LQA tasks. The work has been open-sourced and is available at: https://github.com/andongBlue/EQUALS.
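The two-stage design described in the abstract (retrieve a law article, then extract an answer span from it) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Triplet` field names, the character-bigram overlap scoring, and the sample texts are invented here and are not taken from the EQUALS repository or from the models evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical layout of one {question, article, answer} record."""
    question: str
    article: str
    answer_start: int  # character offset where the answer span begins
    answer_end: int    # exclusive end offset of the answer span

    @property
    def answer(self) -> str:
        # The answer is an exact span of the article, as in the abstract.
        return self.article[self.answer_start:self.answer_end]


def char_bigrams(text: str) -> set[str]:
    """Character bigrams with whitespace removed (works without a tokenizer)."""
    chars = [c for c in text if not c.isspace()]
    return {a + b for a, b in zip(chars, chars[1:])}


def retrieve(question: str, pool: list[str]) -> str:
    """Toy retrieval module: pick the article with the largest bigram overlap."""
    q = char_bigrams(question)
    return max(pool, key=lambda article: len(q & char_bigrams(article)))


def read(question: str, article: str) -> tuple[int, int]:
    """Toy reading module: return the (start, end) span of the best sentence."""
    q = char_bigrams(question)
    best_span, best_score = (0, len(article)), -1
    start = 0
    for i, ch in enumerate(article):
        # Close a sentence at punctuation or at the end of the article.
        if ch in ".。;；" or i == len(article) - 1:
            end = i + 1
            score = len(q & char_bigrams(article[start:end]))
            if score > best_score:
                best_span, best_score = (start, end), score
            start = end
    return best_span


# Invented sample texts, not real statutes and not EQUALS data.
pool = [
    "A tenant shall pay rent on time. If the tenant fails to pay rent, "
    "the landlord may terminate the lease.",
    "An employer shall sign a written labor contract with the employee "
    "within one month.",
]
question = "Can the landlord terminate the lease if the tenant fails to pay rent?"

article = retrieve(question, pool)
span_start, span_end = read(question, article)
record = Triplet(question, article, span_start, span_end)
print(record.answer.strip())
```

A real system would replace both toy functions with learned models; the sketch only shows how the retrieval output feeds the reader and how an answer is stored as a span of the retrieved article.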
Pages: 71-80
Page count: 10