PRILJ: an efficient two-step method based on embedding and clustering for the identification of regularities in legal case judgments

Cited by: 0
Authors
Graziella De Martino
Gianvito Pio
Michelangelo Ceci
Affiliations
[1] University of Bari Aldo Moro, Big Data Lab
[2] National Interuniversity Consortium for Informatics (CINI)
[3] Jozef Stefan Institute
Source
Artificial Intelligence and Law | 2022 / Vol. 30
Keywords
Legal information retrieval; Embedding; Clustering; Approximate nearest neighbor search
DOI
Not available
Abstract
In an era of rapid technological progress that introduces new, unpredictable scenarios every day, working in the legal field can be very difficult without the right tools. Several systems based on Artificial Intelligence methods have been proposed in the literature to support tasks in the legal sector. Following this line of research, in this paper we propose a novel method, called PRILJ, that identifies paragraph regularities in legal case judgments, to support legal experts during the drafting of legal documents. Methodologically, PRILJ adopts a two-step approach that first groups documents into clusters according to their semantic content, and then identifies regularities in the paragraphs within each cluster. Embedding-based methods are adopted to represent documents and paragraphs in a semantic numerical feature space, and an Approximate Nearest Neighbor Search method is adopted to efficiently retrieve the paragraphs most similar to those of a document under preparation. Our extensive experimental evaluation, performed on a real-world dataset provided by EUR-Lex, demonstrates the effectiveness and efficiency of the proposed method. In particular, its ability to model the different topics of legal documents and to capture the semantics of their textual content appears very beneficial for the considered task, and makes PRILJ very robust to the possible presence of noise in the data.
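The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a learned embedding model, the small spherical k-means in `cluster` stands in for whichever clustering algorithm PRILJ uses, and the exhaustive cosine ranking in `similar_paragraphs` stands in for an Approximate Nearest Neighbor Search index. All function names and the toy documents are hypothetical.

```python
import math

def normalize(vec):
    # Scale to unit length so that a dot product equals cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_vocab(texts):
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def embed(text, vocab):
    # Assumption: word counts replace a learned semantic embedding.
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return normalize(vec)

def cluster(vectors, k, iterations=10):
    # Farthest-point initialisation keeps the toy example deterministic.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(dot(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda j: dot(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[j] = normalize(mean)
    return labels, centroids

def build_index(docs, k):
    # Step 1: cluster whole documents; store paragraph vectors per cluster.
    vocab = build_vocab(p for paras in docs.values() for p in paras)
    doc_ids = list(docs)
    doc_vecs = [embed(" ".join(docs[d]), vocab) for d in doc_ids]
    labels, centroids = cluster(doc_vecs, k)
    index = {j: [] for j in range(k)}
    for d, label in zip(doc_ids, labels):
        for p in docs[d]:
            index[label].append((d, p, embed(p, vocab)))
    return vocab, centroids, index

def similar_paragraphs(draft, vocab, centroids, index, top_n=2):
    # Step 2: route the draft to its nearest cluster, then rank only that
    # cluster's paragraphs (exact search here, not an approximate index).
    draft_vec = embed(" ".join(draft), vocab)
    label = max(range(len(centroids)),
                key=lambda j: dot(draft_vec, centroids[j]))
    results = {}
    for para in draft:
        q = embed(para, vocab)
        ranked = sorted(index[label], key=lambda e: dot(q, e[2]), reverse=True)
        results[para] = [(d, p) for d, p, _ in ranked[:top_n]]
    return results

# Hypothetical toy corpus: two tax judgments and two fisheries judgments.
docs = {
    "tax1": ["court annuls tax assessment", "authority must refund tax"],
    "tax2": ["tax assessment issued late", "applicant requests tax refund"],
    "fish1": ["fishing quota applies vessels", "vessels exceeded fishing quota"],
    "fish2": ["fishing vessels landed cod", "cod quota exhausted"],
}
vocab, centroids, index = build_index(docs, k=2)
hits = similar_paragraphs(["applicant requests refund tax"],
                          vocab, centroids, index)
print(hits)
```

Restricting the paragraph search to the draft's cluster is what keeps retrieval cheap in this sketch: only paragraphs from topically related judgments are compared, and on a real corpus the exhaustive ranking would plausibly be replaced by an approximate index built per cluster.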
Pages: 359-390
Page count: 31