Natural Language Processing Applications in Case-Law Text Publishing

Cited by: 2
Authors
Tarasconi, Francesco [1]
Botros, Milad [1]
Caserio, Matteo [1]
Sportelli, Gianpiero [1]
Giacalone, Giuseppe [2]
Uttini, Carlotta [2]
Vignati, Luca [2]
Zanetta, Fabrizio [2]
Affiliations
[1] CELI Language Technol, Via San Quintino 31, I-10121 Turin, Italy
[2] Giuffre Francis Lefebvre, Milan, Italy
Source
LEGAL KNOWLEDGE AND INFORMATION SYSTEMS | 2020, Vol. 334
Keywords
natural language processing; applications; transfer learning; language models; text classification; information extraction; publishing industry; machine learning; BERT fine-tuning; random forest; Italian language
DOI
10.3233/FAIA200859
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Processing case-law contents for electronic publishing purposes is a time-consuming activity that encompasses several sub-tasks and usually involves adding annotations to the original text. On the other hand, recent trends in Artificial Intelligence and Natural Language Processing enable the automatic and efficient analysis of big textual data. In this paper we present our Machine Learning solution to three specific business problems regularly met by a real-world Italian publisher in their day-to-day work: recognition of legal references in text spans, ranking of new content by relevance, and text classification according to a given tree of topics. We experimented with different approaches based on the BERT language model, together with alternatives typically based on Bag-of-Words. The optimal solution, deployed in a controlled production environment, was based on fine-tuned BERT in two out of three cases (the extraction of legal references and text classification), while for relevance ranking a Random Forest model with hand-crafted features was preferred. We conclude by discussing the concrete impact of the developed prototypes, as perceived by the publisher.
Pages: 154-163
Number of pages: 10