TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets

Cited by: 19
Authors
Fafalios, Pavlos [1]
Iosifidis, Vasileios [1]
Ntoutsi, Eirini [1]
Dietze, Stefan [1]
Affiliations
[1] Leibniz Univ Hannover, L3S Res Ctr, Hannover, Germany
Source
SEMANTIC WEB (ESWC 2018) | 2018 / Vol. 10843
Keywords
Twitter; RDF; Entity linking; Sentiment analysis; Social media archives
DOI
10.1007/978-3-319-93417-4_12
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Publicly available social media archives facilitate research in a variety of fields, such as data science, sociology or the digital humanities, where Twitter has emerged as one of the most prominent sources. However, obtaining, archiving and annotating large amounts of tweets is costly. In this paper, we describe TweetsKB, a publicly available corpus of currently more than 1.5 billion tweets, spanning almost 5 years (Jan'13-Nov'17). Metadata information about the tweets as well as extracted entities, hashtags, user mentions and sentiment information are exposed using established RDF/S vocabularies. Next to a description of the extraction and annotation process, we present use cases to illustrate scenarios for entity-centric information exploration, data integration and knowledge discovery facilitated by TweetsKB.
Pages: 177-190
Page count: 14