AI Tax: The Hidden Cost of AI Data Center Applications

Cited by: 7
Authors
Richins, Daniel [1 ,2 ,5 ]
Doshi, Dharmisha [2 ]
Blackmore, Matthew [2 ,6 ]
Nair, Aswathy Thulaseedharan [2 ]
Pathapati, Neha [2 ]
Patel, Ankit [2 ]
Daguman, Brainard [2 ]
Dobrijalowski, Daniel [3 ,7 ]
Illikkal, Ramesh [2 ]
Long, Kevin [2 ]
Zimmerman, David [2 ]
Reddi, Vijay Janapa [1 ,4 ,5 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] Intel, 1900 Prairie City Rd, Folsom, CA 95630 USA
[3] Intel Poland, Gdansk, Poland
[4] Harvard Univ, Cambridge, MA 02138 USA
[5] Maxwell Dworkin, 33 Oxford St, Cambridge, MA 02138 USA
[6] 2111 NE 25th Ave, Hillsboro, OR 97124 USA
[7] Intel, Jana Z Kolna 11,Tryton Bldg, PL-80864 Gdansk, Poland
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2021, Vol. 37, Issues 1-4
Keywords
AI tax; end-to-end AI application; TREES;
DOI
10.1145/3440689
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Artificial intelligence and machine learning are experiencing widespread adoption in industry and academia. This has been driven by rapid advances in the applications and accuracy of AI through increasingly complex algorithms and models; this, in turn, has spurred research into specialized hardware AI accelerators. Given the rapid pace of advances, it is easy to forget that they are often developed and evaluated in a vacuum without considering the full application environment. This article emphasizes the need for a holistic, end-to-end analysis of artificial intelligence (AI) workloads and reveals the "AI tax." We deploy and characterize Face Recognition in an edge data center. The application is an AI-centric edge video analytics application built using popular open source infrastructure and machine learning (ML) tools. Despite using state-of-the-art AI and ML algorithms, the application relies heavily on pre- and post-processing code. As AI-centric applications benefit from the acceleration promised by accelerators, we find they impose stresses on the hardware and software infrastructure: storage and network bandwidth become major bottlenecks with increasing AI acceleration. By specializing for AI applications, we show that a purpose-built edge data center can be designed for the stresses of accelerated AI at 15% lower TCO than one derived from homogeneous servers and infrastructure.
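The abstract's central argument, that accelerating only the AI kernel leaves end-to-end performance bounded by pre- and post-processing and infrastructure costs, is an Amdahl's-law effect. The sketch below is not from the paper; the fractions used are hypothetical, chosen only to illustrate the bound:

```python
def end_to_end_speedup(ai_fraction, ai_speedup):
    """Amdahl's-law bound on whole-application speedup when only the
    AI portion of the pipeline benefits from an accelerator; the
    non-AI "tax" (pre-/post-processing, storage, network) is unchanged."""
    return 1.0 / ((1.0 - ai_fraction) + ai_fraction / ai_speedup)

# Hypothetical example: if 60% of end-to-end time is the AI kernel,
# a 10x accelerator yields only ~2.17x overall, and even an infinitely
# fast accelerator caps the speedup at 1 / 0.4 = 2.5x.
print(round(end_to_end_speedup(0.6, 10), 2))
```

This is why, as the abstract notes, the non-AI infrastructure (storage and network bandwidth) becomes the dominant bottleneck as AI acceleration increases.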
Pages: 32