Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model

Cited by: 2
Authors
Yuan, Mingruo [1]
Kao, Ben [1]
Wu, Tien-Hsuan [1]
Cheung, Michael M. K. [2]
Chan, Henry W. H. [2]
Cheung, Anne S. Y. [2]
Chan, Felix W. H. [2]
Chen, Yongxi [3]
Affiliations
[1] The University of Hong Kong, Department of Computer Science, Pokfulam, Hong Kong, People's Republic of China
[2] The University of Hong Kong, Faculty of Law, Pokfulam, Hong Kong, People's Republic of China
[3] Australian National University, College of Law, Canberra, ACT 2601, Australia
Keywords
Legal knowledge dissemination; Navigability and comprehensibility of legal information; Machine question generation; Pre-trained language model; Readability
DOI
10.1007/s10506-023-09367-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents, such as legislation and judgments, which are often highly technical, into knowledge that is easily navigable and comprehensible to those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons that tackles the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each a short article that explains a technical legal concept in layperson's terms. Second, we construct a Legal Question Bank (LQB), a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user's verbal description of a legal situation that requires a legal solution, CRec interprets the user's input, shortlists questions from the question bank that are most likely relevant to the given legal situation, and recommends their corresponding CLIC-pages, where the relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, cost-effective, and diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.
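
The abstract names GPT-3 as the question generator but gives no implementation detail. As a minimal sketch of what machine question generation (MQG) from a CLIC-page could look like, the following assumes the OpenAI Python SDK (>= 1.0) with a current chat model as a stand-in for the GPT-3 models of the paper; the model name, prompt wording, and sample page text are illustrative assumptions, not the authors' actual setup.

```python
# Hypothetical sketch of machine question generation (MQG) from a CLIC-page.
# Assumes the OpenAI Python SDK (>= 1.0); model name, prompt, and page text
# are illustrative stand-ins, not the paper's actual configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_questions(clic_page_text: str, n_questions: int = 5) -> list[str]:
    """Ask a pre-trained LM for layperson questions answerable by the page."""
    prompt = (
        "Below is a plain-language explanation of a legal topic.\n\n"
        f"{clic_page_text}\n\n"
        f"Write {n_questions} questions a layperson might ask whose answers "
        "can be found in the text above. One question per line."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in for the GPT-3 models named in the abstract
        messages=[{"role": "user", "content": prompt}],
        temperature=0.8,      # non-zero temperature for diversity across questions
    )
    lines = resp.choices[0].message.content.splitlines()
    # Strip list numbering such as "1. " and keep non-empty lines.
    return [q.strip(" -0123456789.") for q in lines if q.strip()][:n_questions]

if __name__ == "__main__":
    page = ("A tenant must usually give written notice before ending a "
            "periodic tenancy. The notice period depends on the tenancy terms.")
    for q in generate_questions(page, 3):
        print(q)
```

Sampling with a non-zero temperature is one simple way to obtain the diversity that the abstract attributes to MGQs, at some cost in the precision it attributes to HCQs.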
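
The abstract likewise says that CRec shortlists LQB questions relevant to a user's verbal description, without specifying the matching method. Below is a minimal sketch of one plausible realization using sentence embeddings and cosine similarity, assuming the sentence-transformers library; the encoder name, toy question bank, and page identifiers are hypothetical.

```python
# Hypothetical sketch of CRec-style retrieval: embed the user's description
# and the question bank, then shortlist the most similar questions. The
# matching method is not stated in the abstract; this is one plausible choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

# Toy question bank; each question maps to the CLIC-page that answers it.
question_bank = {
    "What notice must I give to end my tenancy?": "clic-page-tenancy-notice",
    "Can my employer dismiss me without reason?": "clic-page-dismissal",
    "How do I dispute a parking fine?": "clic-page-fixed-penalties",
}
questions = list(question_bank)
q_emb = model.encode(questions, convert_to_tensor=True)

def shortlist(user_description: str, k: int = 2) -> list[tuple[str, str, float]]:
    """Return the top-k (question, CLIC-page, score) matches for the input."""
    u_emb = model.encode(user_description, convert_to_tensor=True)
    scores = util.cos_sim(u_emb, q_emb)[0]
    top = scores.topk(min(k, len(questions)))
    return [(questions[int(i)], question_bank[questions[int(i)]], float(s))
            for s, i in zip(top.values, top.indices)]

print(shortlist("My landlord says I have to move out next week."))
```

One rationale for matching against questions rather than raw page text is that a layperson's description of a situation is phrased far more like a question than like statutory language, which is the gap the LQB is built to bridge.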
Pages: 769-805
Number of pages: 37