Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation

Cited by: 2
Authors
Scharpf, Philipp [1]
Schubotz, Moritz [2]
Gipp, Bela [3]
Affiliations
[1] Univ Konstanz, Constance, Germany
[2] FIZ Karlsruhe, Karlsruhe, Germany
[3] Univ Wuppertal, Wuppertal, Germany
Source
WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021) | 2021
Keywords
Entity Linking; Wikipedia; Wikidata; Recommender Systems;
DOI
10.1145/3442442.3452348
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mathematical information retrieval (MathIR) applications such as semantic formula search and question answering systems rely on knowledge bases that link mathematical expressions to their natural language names. For database population, mathematical formulae need to be annotated and linked to semantic concepts, which is very time-consuming. In this paper, we present our approach to structure and speed up this process by using an application-driven strategy and an AI-aided system. We evaluate the quality and time savings of AI-generated formula and identifier annotation recommendations on a test selection of Wikipedia articles from the physics domain. Moreover, we evaluate the community acceptance of Wikipedia formula entity links and Wikidata item creation and population to ground the formula semantics. Our evaluation shows that the AI guidance significantly sped up the annotation process, by a factor of 1.4 for formulae and 2.4 for identifiers. Our contributions were accepted in 88% of the edited Wikipedia articles and 67% of the Wikidata items. The "AnnoMathTeX" annotation recommender system is hosted by Wikimedia at annomathtex.wmflabs.org. In the future, our data refinement pipeline will be integrated seamlessly into the Wikimedia user interfaces.
Pages: 602-609
Number of pages: 8
Related papers
24 in total
  • [11] Musto Cataldo, 2009, P ECML PKDD DISC CHA, P215
  • [12] Rosales-Mendez Henry, 2018, AMW
  • [13] Scharpf P., 2019, BIRNDL SIGIR CEUR WO, V2414, P108
  • [14] Scharpf P., 2018, CEUR Workshop Proceedings, V2132, P46
  • [15] AnnoMathTeX - a Formula Identifier Annotation Recommender System for STEM Documents
    Scharpf, Philipp
    Mackerracher, Ian
    Schubotz, Moritz
    Beel, Joeran
    Breitinger, Corinna
    Gipp, Bela
    [J]. RECSYS 2019: 13TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, 2019, : 532 - 533
  • [16] Scharpf Philipp, 2020, CEUR WORKSHOP P, V2696
  • [17] Scharpf Philipp, 2020, JCDL, P137
  • [18] AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels
    Schubotz, Moritz
    Scharpf, Philipp
    Teschke, Olaf
    Kuhnemund, Andreas
    Breitinger, Corinna
    Gipp, Bela
    [J]. INTELLIGENT COMPUTER MATHEMATICS, CICM 2020, 2020, 12236 : 237 - 250
  • [19] Schubotz M., 2020, JCDL 20, P447, DOI 10.1145/3383583.3398557
  • [20] Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context
    Schubotz, Moritz
    Greiner-Petter, Andre
    Scharpf, Philipp
    Meuschke, Norman
    Cohl, Howard S.
    Gipp, Bela
    [J]. JCDL'18: PROCEEDINGS OF THE 18TH ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, 2018, : 233 - 242