History-based Article Quality Assessment on Wikipedia

Cited by: 51
Authors
Zhang, Shiyue [1]
Hu, Zheng [1]
Zhang, Chunhong [1]
Yu, Ke [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2018
Keywords
Wikipedia; Information Quality; LSTM
DOI
10.1109/BigComp.2018.00010
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Wikipedia is widely considered the largest encyclopedia on the Internet. Quality assessment of Wikipedia articles has been studied for years. Conventional methods address this task with feature engineering and statistical machine learning algorithms. However, manually defined features struggle to represent the long edit history of an article. Recently, researchers proposed an end-to-end neural model that uses a Recurrent Neural Network (RNN) to learn the representation automatically. Although the RNN proved effective at modeling edit history, the end-to-end method is time- and resource-consuming. In this paper, we propose a new history-based method to represent an article. We also use an RNN to handle the long edit history, but we do not abandon feature engineering: each revision of an article is still represented by manually defined features. This combination of a deep neural model and feature engineering makes our model both simple and effective. Experiments demonstrate that our model performs better than or comparably to previous work and has the potential to serve as a real-time service. In addition, we extend our model to perform quality prediction.
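The pipeline described in the abstract (hand-crafted features per revision, fed in order to a recurrent network) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the four features, the class labels, the `TinyRNN` helper, and the layer sizes are all hypothetical, the weights are random and untrained, and a plain tanh recurrent cell stands in for the LSTM named in the keywords to keep the example dependency-free.

```python
import math
import random
import re

# Hypothetical label set (Wikipedia's assessment scale); the abstract does not list classes.
CLASSES = ["Stub", "Start", "C", "B", "GA", "FA"]


def revision_features(wikitext):
    """Hand-crafted features for one revision (illustrative choices, not the paper's)."""
    return [
        math.log1p(len(wikitext)),                                     # content length
        float(len(re.findall(r"^==+[^=]+==+\s*$", wikitext, re.M))),   # section headings
        float(wikitext.count("<ref")),                                 # reference tags
        float(wikitext.count("[[")),                                   # internal links
    ]


def make_matrix(rows, cols, rng):
    """Random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRNN:
    """Simple tanh recurrent cell (assumption: stands in for an LSTM)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_in, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(n_out, n_hidden, rng)
        self.n_hidden = n_hidden

    def forward(self, sequence):
        """Consume revision feature vectors in order; return class probabilities."""
        h = [0.0] * self.n_hidden
        for x in sequence:
            a = matvec(self.w_in, x)
            b = matvec(self.w_rec, h)
            h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        logits = matvec(self.w_out, h)
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Toy edit history: oldest revision first.
history = [
    "Short stub.",
    "== History ==\nText with a [[link]].<ref>Source</ref>",
    "== History ==\nLonger text [[a]] [[b]].<ref>S1</ref><ref>S2</ref>\n== Legacy ==\nMore.",
]
sequence = [revision_features(rev) for rev in history]
model = TinyRNN(n_in=4, n_hidden=8, n_out=len(CLASSES))
probs = model.forward(sequence)
```

Because each revision is reduced to a short numeric vector before it reaches the recurrent layer, the network only has to process a few numbers per step rather than raw text, which is consistent with the abstract's point about avoiding the cost of a fully end-to-end model. With untrained weights the resulting probabilities carry no meaning; they only show the data flow.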
Pages: 1 / 8
Page count: 8
Related papers
50 in total
  • [21] Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression
    TeBlunthuis, Nathan
    PROCEEDINGS OF THE 17TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION (OPENSYM), 2021
  • [22] THE ADOPTION OF WIKIPEDIA: A COMMUNITY-AND INFORMATION QUALITY-BASED VIEW
    Wang, Kai
    Lin, Chien-Liang
    Chen, Chun-Der
    Yang, Shu-Chen
    12TH PACIFIC ASIA CONFERENCE ON INFORMATION SYSTEMS (PACIS 2008), 2008, : 248 - +
  • [23] Modelling the Quality of Attributes in Wikipedia Infoboxes
    Wecel, Krzysztof
    Lewoniewski, Wlodzimierz
    BUSINESS INFORMATION SYSTEMS WORKSHOPS, BIS 2015, 2015, 228 : 308 - 320
  • [24] Automatically Assessing the Quality of Wikipedia Contents
    Bassani, Elias
    Viviani, Marco
    SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 804 - 807
  • [25] Generating Quizzes for History Learning Based on Wikipedia Articles
    Tamura, Yoshihiro
    Takase, Yutaka
    Hayashi, Yuki
    Nakano, Yukiko I.
    LEARNING AND COLLABORATION TECHNOLOGIES, LCT 2015, 2015, 9192 : 337 - 346
  • [26] Quality Evaluation of Wikipedia Articles through Edit History and Editor Groups
    Wang, Se
    Iwaihara, Mizuho
    WEB TECHNOLOGIES AND APPLICATIONS, 2011, 6612 : 188 - 199
  • [27] Enrichment of Information in Multilingual Wikipedia Based on Quality Analysis
    Lewoniewski, Wlodzimierz
    BUSINESS INFORMATION SYSTEMS WORKSHOPS, BIS 2017, 2017, 303 : 216 - 227
  • [28] A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia
    Wang, Ping
    Li, Xiaodan
    Wu, Renli
    JOURNAL OF INFORMATION SCIENCE, 2021, 47 (02) : 176 - 191
  • [29] Quality and Importance of Wikipedia Articles in Different Languages
    Lewoniewski, Wlodzimierz
    Wecel, Krzysztof
    Abramowicz, Witold
    INFORMATION AND SOFTWARE TECHNOLOGIES, ICIST 2016, 2016, 639 : 613 - 624
  • [30] Digital History Meets Wikipedia: Analyzing Historical Persons in Wikipedia
    Jatowt, Adam
    Kawai, Daisuke
    Tanaka, Katsumi
    2016 IEEE/ACM JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), 2016, : 17 - 26