Data mining for smart legal systems

Cited by: 6
Authors
Sharafat, Shahmin [1]
Nasar, Zara [1]
Jaffry, Syed Waqar [1]
Affiliation
[1] Univ Punjab, Punjab Univ Coll Informat Technol, Natl Ctr Artificial Intelligence, Artificial Intelligence & Multidisciplinary Res L, Lahore, Pakistan
Keywords
Information extraction; Named Entity Recognition; Legal data; Text mining; Civil law proceeding;
DOI
10.1016/j.compeleceng.2019.07.017
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Smart legal systems carry immense potential to provide the legal community and the public with valuable insights drawn from legal data. These systems can consequently help in analyzing and mitigating various social issues. In Pakistan, courts have been publishing judgments online for public consumption for the past few years. This public data, once processed, can be used for the betterment of society and for policy making in Pakistan. This study takes the first step toward realizing a smart legal system by extracting entities such as dates, case numbers, reference cases, and person names from legal judgments. Extracting these entities automatically first requires a dataset built from legal judgments. Hence, annotation guidelines are prepared first, followed by an annotated dataset for the extraction of various legal entities. Experiments conducted with a variety of datasets, multiple algorithms, and several annotation schemes resulted in a maximum F1-score of 91.51% using Conditional Random Fields. © 2019 Elsevier Ltd. All rights reserved.
Pages: 328-342
Page count: 15
Related papers
24 in total
[1]   STATISTICAL INFERENCE FOR PROBABILISTIC FUNCTIONS OF FINITE STATE MARKOV CHAINS [J].
BAUM, LE ;
PETRIE, T .
ANNALS OF MATHEMATICAL STATISTICS, 1966, 37 (06) :1554-&
[2]  
Brants T, 2000, 6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP, P224
[3]  
Carlson A., 2010, Proceedings of the third ACM international conference on Web search and data mining, P101, DOI 10.1145/1718487.1718501
[4]   A survey on feature selection methods [J].
Chandrashekar, Girish ;
Sahin, Ferat .
COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (01) :16-28
[5]  
Chou SC, 2010, LECT NOTES COMPUT SC, V6122, P113, DOI 10.1007/978-3-642-13601-6_14
[6]  
Dozier C., 2010, NAMED ENTITY RECOGNI, DOI 10.1007/978-3-642-12837-0_2
[7]   Ontology Learning Process as a Bottom-up Strategy for Building Domain-specific Ontology from Legal Texts [J].
El Ghosh, Mirna ;
Naja, Hala ;
Abdulrab, Habib ;
Khalil, Mohamad .
ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, :473-480
[8]  
Galgani F, 2012, P WORKSH INN HYBR AP, P115
[9]   An introduction to hidden Markov models and Bayesian networks [J].
Ghahramani, Z .
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2001, 15 (01) :9-42
[10]  
Hallgren Kevin A, 2012, Tutor Quant Methods Psychol, V8, P23