Mining team characteristics to predict Wikipedia article quality

Cited by: 5
Authors
Betancourt, Grace Gimon [1]
Segnini, Armando [1]
Trabuco, Carlos [1]
Rezgui, Amira [1]
Jullien, Nicolas [1]
Affiliations
[1] Telecom Bretagne, Brest, France
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION (OPENSYM) | 2016
Keywords
Wikipedia; Epistemic community; Article quality; Teaming; Information quality
DOI
10.1145/2957792.2971802
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this study, we examine which characteristics of virtual teams are good predictors of the quality of their production. The experiment involved obtaining the Spanish Wikipedia database dump and applying data mining techniques suitable for large data sets to label the whole set of articles according to their quality (by comparing them with the Featured/Good Articles, or FA/GA). We then created attributes that describe the characteristics of the team that produced each article and, using decision tree methods, identified the most relevant characteristics of the teams that produced FA/GA. The team's maximum efficiency and the total length of contribution are the most important predictors. This article contributes to the literature on virtual team organization.
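The abstract does not say which decision tree method was used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It ranks team attributes by the Gini impurity reduction of their best single threshold split, which is the criterion a CART-style tree applies at its root. The attribute names `max_efficiency` and `total_contribution_length` echo the predictors named in the abstract; `team_size`, every numeric value, and the labels are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of binary labels (1 = FA/GA, 0 = other article)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_gain(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (impurity reduction, threshold) of the best single split on one attribute."""
    parent = gini(labels)
    n = len(labels)
    best_gain, best_threshold = 0.0, float("nan")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2.0
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return best_gain, best_threshold


def rank_attributes(rows: List[Dict[str, float]], labels: Sequence[int]) -> List[Tuple[str, float]]:
    """Sort attributes by the impurity reduction of their best split, highest first."""
    names = list(rows[0].keys())
    gains = [(name, best_split_gain([r[name] for r in rows], labels)[0]) for name in names]
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one row per article's team. All values are made up.
TEAMS = [
    {"max_efficiency": 0.90, "total_contribution_length": 52000, "team_size": 12},
    {"max_efficiency": 0.80, "total_contribution_length": 48000, "team_size": 30},
    {"max_efficiency": 0.85, "total_contribution_length": 9000, "team_size": 7},
    {"max_efficiency": 0.40, "total_contribution_length": 12000, "team_size": 25},
    {"max_efficiency": 0.30, "total_contribution_length": 3000, "team_size": 5},
    {"max_efficiency": 0.50, "total_contribution_length": 7000, "team_size": 14},
    {"max_efficiency": 0.20, "total_contribution_length": 1500, "team_size": 3},
    {"max_efficiency": 0.35, "total_contribution_length": 20000, "team_size": 9},
]
IS_FA_GA = [1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical quality labels

if __name__ == "__main__":
    for name, gain in rank_attributes(TEAMS, IS_FA_GA):
        print(f"{name}: impurity reduction {gain:.3f}")
```

A full decision tree repeats this split selection recursively on each resulting subset. Looking only at the root split is a simplification that keeps the sketch short.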
Pages: 9
Related papers
41 in total
  • [1] Adler B. T., 2008, P 4 INT S WIK WIKISY
  • [2] [Anonymous], 2006, Understanding Knowledge as a Commons: from Theory to Practice
  • [3] Information Quality in Wikipedia: The Effects of Group Composition and Task Conflict
    Arazy, Ofer
    Nov, Oded
    Patterson, Raymond
    Yeo, Lisa
    [J]. JOURNAL OF MANAGEMENT INFORMATION SYSTEMS, 2011, 27 (04) : 71 - 98
  • [4] Arazy O, 2010, 2010 ACM CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK, P233
  • [5] The relationship between the big five personality factors and burnout: A study among volunteer counselors
    Bakker, AB
    Van der Zee, KI
    Lewig, KA
    Dollard, MF
    [J]. JOURNAL OF SOCIAL PSYCHOLOGY, 2006, 146 (01) : 31 - 50
  • [6] Blumenstock JE, 2008, P 17 INT C WORLD WID, P1095, DOI 10.1145/1367497.1367673
  • [7] Temporal analysis of the wikigraph
    Buriol, Luciana S.
    Castillo, Carlos
    Donato, Debora
    Leonardi, Stefano
    Millozzi, Stefano
    [J]. 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 45 - +
  • [8] SMOTE: Synthetic minority over-sampling technique
    Chawla, Nitesh V.
    Bowyer, Kevin W.
    Hall, Lawrence O.
    Kegelmeyer, W. Philip
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 : 321 - 357
  • [9] Crowston K., 2006, Knowledge Technology & Policy, V18, P65, DOI 10.1007/s12130-006-1004-8
  • [10] Crowston K., 2006, Software Process Improvement and Practice, V11, P123, DOI 10.1002/spip.259