AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels

Cited by: 10
Authors
Schubotz, Moritz [1, 2]
Scharpf, Philipp [3]
Teschke, Olaf [1]
Kuhnemund, Andreas [1]
Breitinger, Corinna [2, 3]
Gipp, Bela [2, 3]
Affiliations
[1] FIZ Karlsruhe, Berlin, Germany
[2] Berg Univ Wuppertal, Wuppertal, Germany
[3] Univ Konstanz, Constance, Germany
Source
INTELLIGENT COMPUTER MATHEMATICS, CICM 2020 | 2020 / Vol. 12236
Keywords
Document classification; Applications of machine learning; Mathematical Subject Classification; Digital mathematical libraries; Mathematical information retrieval
DOI
10.1007/978-3-030-53518-6_15
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Authors of research papers in mathematics and other math-heavy disciplines commonly employ the Mathematics Subject Classification (MSC) scheme to search for relevant literature. The MSC is a hierarchical alphanumerical classification scheme that allows librarians to specify one or multiple codes for publications. Digital libraries in mathematics, as well as reviewing services such as zbMATH and Mathematical Reviews (MR), rely on these MSC labels in their workflows to organize the abstracting and reviewing process. In particular, the coarse-grained classification determines the subject editor who is responsible for the actual reviewing process. In this paper, we investigate the feasibility of automatically assigning a coarse-grained primary classification using the MSC scheme by regarding the problem as a multi-class classification machine learning task. We find that our method achieves an F1-score of over 77%, which is remarkably close to the agreement of zbMATH and MR (F1-score of 81%). Moreover, we find that the method's confidence score allows for reducing the effort by 86% compared to the manual coarse-grained classification effort while maintaining a precision of 81% for automatically classified articles.
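The trade-off described in the abstract, between how many articles are classified automatically and how precise those automatic labels are, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `triage`, the record layout, the threshold value, and the toy data are all assumptions made up for illustration. It only shows how a confidence threshold splits predictions into an automated share and a share left for manual classification.

```python
from typing import List, Tuple

# Hypothetical record layout: (predicted_label, true_label, confidence).
Prediction = Tuple[str, str, float]


def triage(preds: List[Prediction], threshold: float) -> Tuple[float, float]:
    """Return (automated_fraction, precision_on_automated) for a threshold.

    Predictions with confidence >= threshold are accepted automatically;
    the rest would be left for manual classification.
    """
    automated = [p for p in preds if p[2] >= threshold]
    if not automated:
        return 0.0, 0.0
    correct = sum(1 for pred, true, _ in automated if pred == true)
    return len(automated) / len(preds), correct / len(automated)


# Toy data invented for illustration; labels mimic two-digit MSC codes.
toy = [
    ("68", "68", 0.95),
    ("03", "03", 0.90),
    ("81", "83", 0.85),
    ("11", "11", 0.80),
    ("35", "65", 0.40),
]

coverage, precision = triage(toy, threshold=0.8)
print(f"automated: {coverage:.0%}, precision: {precision:.0%}")
# prints "automated: 80%, precision: 75%"
```

Raising the threshold in such a scheme typically lowers the automated share and raises precision; the figures reported in the abstract (86% effort reduction at 81% precision) correspond to one such operating point.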
Pages: 237-250
Number of pages: 14