PageRank on Wikipedia: Towards General Importance Scores for Entities

Cited by: 21
Authors
Thalhammer, Andreas [1 ]
Rettinger, Achim [1 ]
Affiliation
[1] Karlsruhe Inst Technol, AIFB, Karlsruhe, Germany
Source
SEMANTIC WEB, ESWC 2016 | 2016 / Vol. 9989
Keywords
Wikipedia; DBpedia; PageRank; Link analysis; Page views; Rank correlation;
DOI
10.1007/978-3-319-47602-5_41
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Link analysis methods are used to estimate importance in graph-structured data. In that realm, the PageRank algorithm has been used to analyze directed graphs, in particular the link structure of the Web. Recent developments in information retrieval focus on entities and their relations (i.e., knowledge graph panels). Many entities are documented in the popular knowledge base Wikipedia. The cross-references within Wikipedia exhibit a directed graph structure that is suitable for computing PageRank scores as importance indicators for entities. In this work, we present different PageRank-based analyses on the link graph of Wikipedia and the corresponding experiments. We focus on the question of whether some links, based on their context/position in the article text, can be deemed more important than others. In our variants, we change the probabilistic impact of links in accordance with their context/position on the page and measure the effects on the output of the PageRank algorithm. We compare the resulting rankings and those of existing systems with page-view-based rankings and provide statistics on the pairwise computed Spearman and Kendall rank correlations.
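
The approach outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy link graph in which each link carries a hypothetical weight standing in for its context/position in the article, a damping factor of 0.85, uniform redistribution of dangling-node mass, and the tau-a variant of Kendall's coefficient. The function names, the weights, and the page-view numbers are all invented for illustration.

```python
import math


def weighted_pagerank(links, damping=0.85, iterations=100, tol=1e-10):
    """Power iteration on a weighted directed graph.

    links maps each source page to a dict {target page: positive weight}.
    A link's share of its source's score is weight / total out-weight.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    out_weight = {v: sum(links.get(v, {}).values()) for v in nodes}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Score held by pages without out-links is spread uniformly (assumption).
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if out_weight[src] == 0:
                continue
            for dst, weight in targets.items():
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


# Toy graph: the A -> B link gets a higher, purely hypothetical weight.
toy_links = {"A": {"B": 2.0, "C": 1.0}, "B": {"C": 1.0}, "C": {"A": 1.0}, "D": {"C": 1.0}}
scores = weighted_pagerank(toy_links)
pages = sorted(scores)
toy_page_views = {"A": 900, "B": 300, "C": 1200, "D": 50}  # invented numbers
pr = [scores[p] for p in pages]
pv = [toy_page_views[p] for p in pages]
print(spearman(pr, pv), kendall_tau_a(pr, pv))
```

Setting every weight to 1.0 gives the unweighted baseline, so the same functions can be used to compare a weighted variant against it and against a page-view ranking.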
Pages: 227-240
Number of pages: 14
Related papers
50 in total
  • [31] Wikipedia traffic data and electoral prediction: towards theoretically informed models
    Yasseri, Taha
    Bright, Jonathan
    EPJ DATA SCIENCE, 2016, 5
  • [32] A General Multi-Step Matrix Splitting Iteration Method for Computing PageRank
    Tian, Zhaolu
    Li, Xiaojing
    Liu, Zhongyun
    FILOMAT, 2021, 35 (02) : 679 - 706
  • [34] Automated selection of urban road network by fusion of PageRank algorithm and attribute importance metrics
    Chu, Tianshu
    Yan, Haowen
    Li, Pengbo
    Lu, Xiaomin
    Gao, Xiaorong
    CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE, 2025,
  • [35] Node importance evaluation in multi-platform avionics architecture based on TOPSIS and PageRank
    Liu, Chang
    Wang, Jinyan
    Xia, Rui
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2023, 2023 (01)
  • [37] The general inner-outer iteration method based on regular splittings for the PageRank problem
    Tian, Zhaolu
    Liu, Yong
    Zhang, Yan
    Liu, Zhongyun
    Tian, Maoyi
    APPLIED MATHEMATICS AND COMPUTATION, 2019, 356 : 479 - 501
  • [38] Towards robust tags for scientific publications from natural language processing tools and Wikipedia
    Lopuszynski, Michal
    Bolikowski, Lukasz
    INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES, 2015, 16 (01) : 25 - 36
  • [39] Towards perfect text classification with Wikipedia-based semantic Naive Bayes learning
    Kim, Han-joon
    Kim, Jiyun
    Kim, Jinseog
    Lim, Pureum
    NEUROCOMPUTING, 2018, 315 : 128 - 134
  • [40] Templates and Trust-o-meters: Towards a widely deployable indicator of trust in Wikipedia
    Kuznetsov, Andrew
    Novotny, Margeigh
    Klein, Jessica
    Saez-Trumper, Diego
    Kittur, Aniket
    PROCEEDINGS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI' 22), 2022,