An Architecture for Agile Machine Learning in Real-Time Applications

Cited by: 14
Author
Schleier-Smith, Johann [1]
Affiliation
[1] If We Inc, 848 Battery St, San Francisco, CA 94111 USA
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Keywords
Agile; Recommender Systems; Machine Learning;
DOI
10.1145/2783258.2788628
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning techniques have proved effective in recommender systems and other applications, yet teams working to deploy them lack many of the advantages that those in more established software disciplines today take for granted. The well-known Agile methodology advances projects in a chain of rapid development cycles, with subsequent steps often informed by production experiments. Support for such a workflow in machine learning applications remains primitive. The platform developed at if(we) embodies a specific machine learning approach and a rigorous data architecture constraint, thereby allowing teams to work in rapid iterative cycles. We require models to consume data from a time-ordered event history, and we focus on facilitating creative feature engineering. We make it practical for data scientists to use the same model code in development and in production deployment, and to collaborate on complex models. We deliver real-time recommendations at scale, returning top results from among 10,000,000 candidates with sub-second response times and incorporating new updates in just a few seconds. Using the approach and architecture described here, our team can routinely go from ideas for new models to production-validated results within two weeks.
Pages: 2059-2068
Page count: 10