GarNLP: A Natural Language Processing Pipeline for Garnishment Documents

Cited by: 2
Authors
Bordino, Ilaria [1]
Ferretti, Andrea [2]
Gullo, Francesco [1]
Pascolutti, Stefano [3]
Affiliations
[1] UniCredit, R&D Dept, Rome, Italy
[2] UniCredit, R&D Dept, Milan, Italy
[3] Google, Zurich, Switzerland
Keywords
Applied data science; Natural language processing; Legal documents; Garnishment; Categorization; Information extraction; Supervised learning; Word embeddings; Named entity recognition
DOI
10.1007/s10796-020-09997-0
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Basic elements of the law, such as statutes and regulations, are embodied in natural language and depend strictly on linguistic expressions. Hence, analyzing legal content is a challenging task, and the legal domain is increasingly looking for automatic-processing support. This paper focuses on a specific context in the legal domain that has so far remained unexplored: the automatic processing of garnishment documents. A garnishment is a legal procedure by which a creditor can collect what a debtor owes by requesting the confiscation of a debtor's property (e.g., a checking account) that is held by a third party, called the garnishee. Our proposal, motivated by a real-world use case, is a versatile natural-language-processing pipeline to support a garnishee in processing a large-scale flow of garnishment documents. In particular, we focus on two tasks: (i) categorizing received garnishment notices into a predefined taxonomy of categories; and (ii) performing an information-extraction phase, which consists of automatically identifying various pieces of information in the text, such as the identities of the involved actors, amounts, and dates. The main contribution of this work is to describe the challenges, design, implementation, and performance of the core modules and methods behind our solution. Our proposal is a noteworthy example of how data-science techniques can be successfully applied to a novel yet challenging real-world context.
Pages: 101-114
Page count: 14