Natural Language Processing Applications in Case-Law Text Publishing

Cited by: 2
Authors
Tarasconi, Francesco [1]
Botros, Milad [1]
Caserio, Matteo [1]
Sportelli, Gianpiero [1]
Giacalone, Giuseppe [2]
Uttini, Carlotta [2]
Vignati, Luca [2]
Zanetta, Fabrizio [2]
Affiliations
[1] CELI Language Technology, Via San Quintino 31, I-10121 Turin, Italy
[2] Giuffrè Francis Lefebvre, Milan, Italy
Source
LEGAL KNOWLEDGE AND INFORMATION SYSTEMS | 2020 / Vol. 334
Keywords
natural language processing; applications; transfer learning; language models; text classification; information extraction; publishing industry; machine learning; BERT fine-tuning; random forest; Italian language;
DOI
10.3233/FAIA200859
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Processing case-law content for electronic publishing is a time-consuming activity that encompasses several sub-tasks and usually involves adding annotations to the original text. Recent advances in Artificial Intelligence and Natural Language Processing, on the other hand, enable the automatic and efficient analysis of large volumes of text. In this paper we present our Machine Learning solution to three specific business problems regularly met by a real-world Italian publisher in its day-to-day work: recognition of legal references in text spans, ranking of new content by relevance, and text classification according to a given tree of topics. We experimented with different approaches based on the BERT language model, together with alternatives typically based on Bag-of-Words. The optimal solution, deployed in a controlled production environment, was based on fine-tuned BERT in two of the three cases (extraction of legal references and text classification), while for relevance ranking a Random Forest model with hand-crafted features was preferred. We conclude by discussing the concrete impact of the developed prototypes, as perceived by the publisher.
Pages: 154-163
Page count: 10