Temporal Topic Inference for Trend Prediction

Cited by: 4
Authors
Aghababaei, Somayyeh [1]
Makrehchi, Masoud [1]
Affiliations
[1] UOIT, Fac Engn & Appl Sci, 2000 Simcoe St North, Oshawa, ON, Canada
Source
2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW) | 2015
DOI
10.1109/ICDMW.2015.214
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Publicly available social data has been widely adopted to explore the language of crowds and to leverage it in predicting real-world problems. In microblogs, users extensively share information about their moods, topics of interest, and social events, which provides an ideal data resource for many applications. We also study the footprints of social problems in Twitter data. Hidden topics identified from Twitter content are used to predict crime trends. Since the problem has a sequential order, extracting meaningful patterns involves temporal analysis. The prediction model needs to address information evolution, in which data are more closely related when they are close in time than when they are further apart. The study is presented in two steps. First, a temporal topic detection model is introduced to infer predictive hidden topics. The model builds a dynamic vocabulary to detect emerging topics. Topics are compared over time to ensure diversity and novelty in each time slice. Second, a predictive model is proposed that uses the identified temporal topics to predict the crime trend in a prospective timeframe. The model does not suffer from a lack of available learning examples, because learning examples are annotated with knowledge inferred from the trend. The experiments show that temporal topic detection outperforms static topic modeling when dealing with sequential data. Topics are more diverse when they are inferred in different time slices. In general, the results indicate that temporal topics have a strong correlation with crime index changes. Predictability is high for some specific crime types and can vary depending on the incidents. The study provides insight into the correlation between language and real-world problems, and into the impact of social data in providing predictive indicators.
Pages: 877-884
Number of pages: 8