Point Cloud;
Feature Learning Accelerator;
Algorithm-architecture Co-design;
Sparsity Exploitation;
DOI:
10.1109/DAC56929.2023.10247674
CLC number:
TP18 (Artificial Intelligence Theory);
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Grid-based feature learning networks play a key role in recent point-cloud-based 3D perception. However, high point sparsity and special operators lead to a large memory footprint and long processing latency, posing great challenges to hardware acceleration. We propose FLNA, a novel feature learning accelerator with algorithm-architecture co-design. At the algorithm level, a dataflow-decoupled graph is adopted to reduce computation by 86% by exploiting the inherent sparsity and concat redundancy. At the hardware level, we customize a pipelined architecture with block-wise processing, and introduce a transposed SRAM strategy to save 82.1% of access power. Implemented in a 40 nm technology, FLNA achieves a 13.4 - 43.3x speedup over an RTX 2080Ti GPU. It surpasses the state-of-the-art accelerator with a 1.21x energy-efficiency improvement and a 50.8% latency reduction.
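The sparsity exploitation the abstract refers to can be illustrated in general terms: in grid-based point-cloud pipelines, most grid cells are empty, so work is performed only for occupied cells. The following is a minimal, hypothetical Python sketch of this idea (the function name, grid parameters, and per-cell mean feature are illustrative assumptions, not FLNA's actual dataflow or operators).

```python
import numpy as np

def sparse_grid_features(points, grid_size=4, cell=1.0):
    """Scatter points into a 2D grid and compute per-cell mean features,
    visiting only non-empty cells. A generic sparsity-exploitation sketch;
    not the paper's dataflow-decoupled graph."""
    # Map each point's (x, y) coordinates to a flat grid-cell key.
    idx = np.floor(points[:, :2] / cell).astype(int)
    keys = idx[:, 0] * grid_size + idx[:, 1]
    feats = {}
    for k in np.unique(keys):  # iterate over occupied cells only
        feats[int(k)] = points[keys == k].mean(axis=0)
    # Occupancy ratio: fraction of cells that actually required computation.
    occupancy = len(feats) / grid_size ** 2
    return feats, occupancy
```

Because empty cells are never visited, compute and memory traffic scale with the number of occupied cells rather than the full grid resolution, which is the general motivation behind sparsity-aware accelerators for this workload.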