Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers

Cited by: 28
Authors
Tkaczyk, Dominika [1 ]
Collins, Andrew [1 ]
Sheridan, Paraic [1 ]
Beel, Joeran [1 ]
Affiliations
[1] Trinity Coll Dublin, ADAPT Ctr, Sch Comp Sci & Stat, Dublin, Ireland
Source
JCDL'18: PROCEEDINGS OF THE 18TH ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES | 2018
Funding
Science Foundation Ireland
Keywords
bibliographic reference parsing; citation parsing; machine learning; sequence tagging; METADATA; METHODOLOGY; EXTRACTION;
DOI
10.1145/3197026.3197048
Chinese Library Classification (CLC)
TP [Automation and computer technology]
Subject classification code
0812
Abstract
Bibliographic reference parsing refers to extracting machine-readable metadata, such as author names, the title, or the journal name, from bibliographic reference strings. Many approaches to this problem have been proposed, including regular expressions, knowledge bases, and supervised machine learning, and many open-source reference parsers based on various algorithms are available. In this paper, we apply, evaluate, and compare ten reference parsing tools in a specific business use case. The tools are Anystyle-Parser, Biblio, CERMINE, Citation, Citation-Parser, GROBID, ParsCit, PDFSSA4MET, Reference Tagger, and Science Parse, and we compare them both in their out-of-the-box versions and in versions tuned to the project-specific data. According to our evaluation, the best-performing out-of-the-box tool is GROBID (F1 0.89), followed by CERMINE (F1 0.83) and ParsCit (F1 0.75). We also found that even though machine learning-based tools and tools based on rules or regular expressions achieve similar average precision (0.77 for ML-based tools vs. 0.76 for non-ML-based tools), the ML-based tools achieve three times higher recall than the non-ML-based tools (0.66 vs. 0.22). Our study also confirms that tuning the models to task-specific data improves parsing quality. In all cases, the retrained versions of the reference parsers outperform their out-of-the-box counterparts: for GROBID, F1 increased by 3% (0.92 vs. 0.89), for CERMINE by 11% (0.92 vs. 0.83), and for ParsCit by 16% (0.87 vs. 0.75).
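The F1 scores quoted above are the harmonic mean of precision and recall. As a minimal illustrative sketch (our own, not code from the paper), the following Python snippet plugs the averaged precision and recall figures from the abstract into the F1 formula and shows why the recall gap between ML-based and non-ML-based parsers dominates the aggregate score:

def f1(precision: float, recall: float) -> float:
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Average precision/recall values reported in the abstract
ml_precision, ml_recall = 0.77, 0.66      # machine learning-based tools
rule_precision, rule_recall = 0.76, 0.22  # rule/regex-based tools

print(f"ML-based tools:     F1 = {f1(ml_precision, ml_recall):.2f}")      # ~0.71
print(f"Non-ML-based tools: F1 = {f1(rule_precision, rule_recall):.2f}")  # ~0.34

With nearly identical precision, the three-fold recall advantage roughly doubles the resulting F1 for the ML-based tools.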
Pages: 99-108
Page count: 10