WikipEvent: Leveraging Wikipedia edit history for event detection

Cited by: 6
Authors
Tran, Tuan [1]
Ceroni, Andrea [1]
Georgescu, Mihai [1]
Naini, Kaweh Djafari [1]
Fisichella, Marco [1]
Affiliations
[1] L3S Research Center, Appelstr. 9a, Hannover
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2014 / Vol. 8787
Funding
European Research Council
Keywords
Clustering; Event detection; Temporal retrieval; Wikipedia
DOI
10.1007/978-3-319-11746-1_7
Abstract
Much of the existing work in information extraction assumes that relationships in fixed knowledge bases are static. However, in collaborative environments such as Wikipedia, information and structures are highly dynamic over time. In this work, we introduce a new method to extract complex event structures from Wikipedia. We propose a new model that represents events as involving multiple entities and that generalizes to an arbitrary language. The evolution of an event is captured effectively by analyzing the user edit history in Wikipedia. Our work provides a foundation for a novel class of evolution-aware, entity-based enrichment algorithms, and considerably increases the quality of entity accessibility and temporal retrieval for Wikipedia. We formalize this problem and introduce an efficient end-to-end platform as a solution. We conduct comprehensive experiments on a real dataset of 1.8 million Wikipedia articles to show the effectiveness of our proposed solution. Our results demonstrate that we achieve a precision of 70% when evaluated against manually annotated data. Finally, we compare our work with the well-established Current Event Portal of Wikipedia and find that our system, WikipEvent, using the Co-References method, can be used in a complementary way to deliver new and additional information about events. © Springer International Publishing Switzerland 2014.
Pages: 90-108
Page count: 18