Predicting Links on Wikipedia with Anchor Text Information

Cited by: 0
Authors
Brochier, Robin [1]
Bechet, Frederic [1]
Affiliations
[1] Aix Marseille Univ, Univ Toulon, CNRS, LIS, Marseille, France
Source
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2021
Keywords
Wikipedia; link prediction; evaluation; hyperlinks; networks
DOI
10.1145/3404835.3462994
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Wikipedia, the largest open-collaborative online encyclopedia, is a corpus of documents bound together by internal hyperlinks. These links form the building blocks of a large network whose structure contains important information on the concepts covered in this encyclopedia. The presence of a link between two articles, materialised by an anchor text in the source page pointing to the target page, can increase readers' understanding of a topic. However, the process of linking follows specific editorial rules to avoid both under-linking and over-linking. In this paper, we study the transductive and the inductive tasks of link prediction on several subsets of the English Wikipedia and identify some key challenges behind automatic linking based on anchor text information. We propose an appropriate evaluation sampling methodology and compare several algorithms. Moreover, we propose baseline models that provide a good estimation of the overall difficulty of the tasks.
Pages: 1758-1762
Number of pages: 5