Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting

Times cited: 8
Authors
Chen, Tianle [1]
Keng, Brian [2]
Moreno, Javier [2]
Affiliations
[1] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada
[2] Rubikloud Technol Inc, Data Sci, Toronto, ON, Canada
Source
2018 18TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW) | 2018
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Survival Analysis; Time series analysis; Neural networks; Consumer products; Multivariate statistics; Maximum likelihood modeling; Bayesian network models; Forecasting; Marketing; CUSTOMERS;
DOI
10.1109/ICDMW.2018.00121
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Access to a large variety of data across a massive population has made it possible to predict customer purchase patterns and responses to marketing campaigns. In particular, accurate demand forecasts for popular products with frequent repeat purchases are essential, since these products are among the main drivers of profits. However, buyer purchase patterns are extremely diverse and sparse on a per-product level due to population heterogeneity as well as dependence in purchase patterns across product categories. Traditional methods in survival analysis have proven effective in dealing with censored data by assuming parametric distributions on inter-arrival times. Distributional parameters are then fitted, typically in a regression framework. On the other hand, neural-network-based models take a non-parametric approach to learn relations from a larger functional class. However, the lack of distributional assumptions makes it difficult to model partially observed data. In this paper, we directly model the inter-arrival times as well as the partially observed information at each time step in a survival-based approach, using Recurrent Neural Networks (RNN) to model purchase times jointly over several products. Instead of predicting a point estimate for inter-arrival times, the RNN outputs parameters that define a distributional estimate. The loss function is the negative log-likelihood of these parameters given partially observed data. This approach allows one to leverage both fully observed data and partial information. By externalizing the censoring problem through a log-likelihood loss function, we show that substantial improvements over state-of-the-art machine learning methods can be achieved. We present experimental results based on two open datasets as well as a study on a real dataset from a large retailer.
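The loss described in the abstract, a negative log-likelihood that uses both fully observed and censored inter-arrival times, can be illustrated with a minimal sketch. The Weibull distribution, the function names `weibull_censored_nll` and `total_nll`, and the record layout are assumptions made for illustration only; the abstract does not state which distribution or parameterization the paper uses, and here the parameters are plain numbers rather than RNN outputs.

```python
import math

def weibull_censored_nll(t, observed, scale, shape):
    """Negative log-likelihood of one inter-arrival time t.

    Assumes (hypothetically) a Weibull distribution with the given
    scale and shape. If the arrival was observed, the density f(t) is
    used; if it was censored (no arrival seen by time t), the survival
    function S(t) is used instead.
    """
    if t <= 0 or scale <= 0 or shape <= 0:
        raise ValueError("t, scale and shape must be positive")
    cum_hazard = (t / scale) ** shape  # H(t) = (t / scale)^shape
    if observed:
        # log f(t) = log(shape/scale) + (shape-1)*log(t/scale) - H(t)
        log_lik = (math.log(shape / scale)
                   + (shape - 1.0) * math.log(t / scale)
                   - cum_hazard)
    else:
        # log S(t) = -H(t)
        log_lik = -cum_hazard
    return -log_lik

def total_nll(records):
    """Sum the loss over (t, observed, scale, shape) tuples."""
    return sum(weibull_censored_nll(*r) for r in records)
```

In a model of the kind the abstract describes, `scale` and `shape` would be produced per time step and per product by the network, and a sum like `total_nll` would be minimized by gradient descent; that wiring is omitted here.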
Pages: 810-819
Page count: 10