History-based Article Quality Assessment on Wikipedia

Cited by: 51
Authors
Zhang, Shiyue [1]
Hu, Zheng [1]
Zhang, Chunhong [1]
Yu, Ke [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2018
Keywords
Wikipedia; Information Quality; LSTM;
DOI
10.1109/BigComp.2018.00010
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Wikipedia is widely considered the largest encyclopedia on the Internet. Quality assessment of Wikipedia articles has been studied for years. Conventional methods address this task with feature engineering and statistical machine learning algorithms. However, manually defined features struggle to represent the long edit history of an article. Recently, researchers proposed an end-to-end neural model that uses a Recurrent Neural Network (RNN) to learn the representation automatically. Although the RNN proved effective at modeling edit history, the end-to-end method is time- and resource-consuming. In this paper, we propose a new history-based method to represent an article. We also use an RNN to handle the long edit history, but we do not abandon feature engineering: each revision of an article is still represented by manually defined features. This combination of a deep neural model and feature engineering makes our model both simple and effective. Experiments demonstrate that our model performs better than or comparably to previous work and has the potential to run as a real-time service. In addition, we extend our model to quality prediction.
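The approach summarized in the abstract (hand-crafted features per revision, with a recurrent network over the revision sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature set in `revision_features`, the `TinyLSTM` class, the hidden size, and the sample revisions are all hypothetical, and the weights are random and untrained. It only shows how a variable-length edit history could be reduced to one fixed-size vector that a quality classifier might consume.

```python
import math
import random


def revision_features(text):
    """Hand-crafted features for one revision (hypothetical feature set)."""
    return [
        math.log1p(len(text)),            # content length
        math.log1p(len(text.split())),    # word count
        math.log1p(text.count("[[")),     # wiki links
        math.log1p(text.count("<ref")),   # references
        math.log1p(text.count("\n==")),   # section headings
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Forward pass of a single LSTM layer with random, untrained weights."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        size = n_in + n_hidden
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(size)]
                      for _ in range(n_hidden)] for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}
        self.n_hidden = n_hidden

    def _gate(self, g, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def encode(self, sequence):
        """Return the final hidden state after reading all revisions."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in sequence:
            z = x + h  # concatenate current features and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


# Hypothetical edit history, oldest revision first.
history = [
    "Stub text.",
    "Stub text with a [[link]].\n== History ==\nMore text.",
    "Longer text with a [[link]] and a <ref>source</ref>.\n== History ==\nMore.",
]
features = [revision_features(rev) for rev in history]
vector = TinyLSTM(n_in=5, n_hidden=4).encode(features)
print(vector)
```

In a trained system, `vector` would feed a classification layer over the quality classes. Because each revision is reduced to a short feature vector before the recurrent step, the sequence model stays small, which is consistent with the abstract's point about simplicity.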
Pages: 1 / 8
Page count: 8
Related papers
50 in total
  • [1] Classifying Wikipedia Article Quality With Revision History Networks
    Raman, Narun
    Sauerberg, Nathaniel
    Fisher, Jonah
    Narayan, Sneha
    PROCEEDINGS OF THE 16TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION (OPENSYM), 2020,
  • [2] Mining team characteristics to predict Wikipedia article quality
    Betancourt, Grace Gimon
    Segnini, Armando
    Trabuco, Carlos
    Rezgui, Amira
    Jullien, Nicolas
    PROCEEDINGS OF THE 12TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION (OPENSYM), 2016,
  • [3] Article Quality Classification on Wikipedia: Introducing Document Embeddings and Content Features
    Schmidt, Manuel
    Zangerle, Eva
    PROCEEDINGS OF THE 15TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION (OPENSYM), 2019,
  • [4] WikipediaViz: Conveying Article Quality for Casual Wikipedia Readers
    Chevalier, Fanny
    Huot, Stephane
    Fekete, Jean-Daniel
    IEEE PACIFIC VISUALIZATION SYMPOSIUM 2010, 2010, : 49 - 56
  • [5] A Psycho-lexical Approach to the Assessment of Information Quality on Wikipedia
    Su, Qi
    Liu, Pengyuan
    2015 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT), VOL 3, 2015, : 184 - 187
  • [6] Relating Wikipedia article quality to edit behavior and link structure
    Thorsten Ruprechter
    Tiago Santos
    Denis Helic
    Applied Network Science, 5
  • [7] Relating Wikipedia article quality to edit behavior and link structure
    Ruprechter, Thorsten
    Santos, Tiago
    Helic, Denis
    APPLIED NETWORK SCIENCE, 2020, 5 (01)
  • [8] On the Relation of Edit Behavior, Link Structure, and Article Quality on Wikipedia
    Ruprechter, Thorsten
    Santos, Tiago
    Helic, Denis
    COMPLEX NETWORKS AND THEIR APPLICATIONS VIII, VOL 2, 2020, 882 : 242 - 254
  • [9] WikiLyzer: Interactive Information Quality Assessment in Wikipedia
    di Sciascio, Cecilia
    Strohmaier, David
    Errecalde, Marcelo
    Veas, Eduardo
    IUI'17: PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON INTELLIGENT USER INTERFACES, 2017, : 377 - 388
  • [10] PERSONALIZED LEARNING PATHS BASED ON WIKIPEDIA ARTICLE STATISTICS
    Lahti, Lauri
    CSEDU 2010: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED EDUCATION, VOL 1, 2010, : 110 - 120