History-based Article Quality Assessment on Wikipedia

Cited by: 51
Authors
Zhang, Shiyue [1]
Hu, Zheng [1]
Zhang, Chunhong [1]
Yu, Ke [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2018
Keywords
Wikipedia; Information Quality; LSTM
DOI
10.1109/BigComp.2018.00010
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Wikipedia is widely considered the largest encyclopedia on the Internet. Quality assessment of Wikipedia articles has been studied for years. Conventional methods address this task with feature engineering and statistical machine learning algorithms. However, manually defined features struggle to represent the long edit history of an article. Recently, researchers proposed an end-to-end neural model that uses a Recurrent Neural Network (RNN) to learn the representation automatically. Although the RNN proved effective at modeling edit history, the end-to-end method is time- and resource-intensive. In this paper, we propose a new history-based method to represent an article. We also use an RNN to handle the long edit history, but we do not abandon feature engineering: each revision of an article is still represented by manually defined features. This combination of a deep neural model and feature engineering makes our model both simple and effective. Experiments demonstrate that our model performs better than or comparably to previous work and has the potential to run as a real-time service. In addition, we extend our model to perform quality prediction.
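The idea in the abstract (hand-crafted features per revision, fed in order to a recurrent network) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's actual design: the four features in `revision_features`, the six class labels (taken from English Wikipedia's assessment scale), the layer sizes, and the untrained random weights. It shows only the data flow, using a plain-Python LSTM forward pass.

```python
import math
import random

# Assumed class labels (English Wikipedia's assessment scale); not listed in the abstract.
CLASSES = ["FA", "GA", "B", "C", "Start", "Stub"]

def revision_features(text):
    """Hypothetical hand-crafted features for one revision (illustrative only)."""
    return [
        math.log1p(len(text)),           # length in characters
        math.log1p(text.count("<ref")),  # number of reference tags
        math.log1p(text.count("\n==")),  # number of section headings
        math.log1p(text.count("[[")),    # number of internal links
    ]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class TinyLSTMClassifier:
    """Untrained single-layer LSTM (no bias terms) with a softmax output layer."""

    def __init__(self, n_in, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [x; h].
        self.gates = {g: rand_matrix(rng, n_hidden, n_in + n_hidden) for g in "ifoc"}
        self.out = rand_matrix(rng, n_classes, n_hidden)

    def step(self, x, h, c):
        xh = x + h  # list concatenation
        i = [sigmoid(v) for v in matvec(self.gates["i"], xh)]
        f = [sigmoid(v) for v in matvec(self.gates["f"], xh)]
        o = [sigmoid(v) for v in matvec(self.gates["o"], xh)]
        g = [math.tanh(v) for v in matvec(self.gates["c"], xh)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def predict_proba(self, revisions):
        """Consume revisions oldest-first; classify from the final hidden state."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for text in revisions:
            h, c = self.step(revision_features(text), h, c)
        logits = matvec(self.out, h)
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

# Toy edit history: three successive revisions of one article.
history = [
    "A stub.",
    "A stub.\n== History ==\nMore text with a [[link]].",
    "Intro.<ref>source</ref>\n== History ==\nText [[link]].\n== Usage ==\nMore.",
]
model = TinyLSTMClassifier(n_in=4, n_hidden=8, n_classes=len(CLASSES))
probs = model.predict_proba(history)
print(dict(zip(CLASSES, (round(p, 3) for p in probs))))
```

Because the weights are random, the printed probabilities carry no meaning; a usable model would need training on labeled edit histories. The point of the sketch is that each revision contributes only a short feature vector, which is one plausible reason such a model could be cheaper than an end-to-end network over raw text.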
Pages: 1 / 8
Page count: 8