Point Cloud;
Feature Learning Accelerator;
Algorithm-architecture Co-design;
Sparsity Exploitation;
DOI:
10.1109/DAC56929.2023.10247674
CLC number:
TP18 (Artificial Intelligence Theory);
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Grid-based feature learning networks play a key role in recent point-cloud-based 3D perception. However, high point sparsity and special operators lead to a large memory footprint and long processing latency, posing great challenges to hardware acceleration. We propose FLNA, a novel feature learning accelerator with algorithm-architecture co-design. At the algorithm level, a dataflow-decoupled graph is adopted to reduce computation by 86% by exploiting the inherent sparsity and concat redundancy. At the hardware level, we customize a pipelined architecture with block-wise processing, and introduce a transposed SRAM strategy to save 82.1% of access power. Implemented in a 40 nm technology, FLNA achieves a 13.4 - 43.3x speedup over an RTX 2080Ti GPU. It surpasses the state-of-the-art accelerator with a 1.21x energy-efficiency improvement and a 50.8% latency reduction.
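The sparsity exploitation the abstract refers to can be illustrated in general terms: in grid-based point-cloud pipelines, most grid cells are empty, so work is performed only for occupied cells. The following is a minimal, hypothetical Python sketch of this idea (the function name, grid parameters, and per-cell mean feature are illustrative assumptions, not FLNA's actual dataflow or operators).

```python
import numpy as np

def sparse_grid_features(points, grid_size=4, cell=1.0):
    """Scatter points into a 2D grid and compute per-cell mean features,
    visiting only non-empty cells. A generic sparsity-exploitation sketch;
    not the paper's dataflow-decoupled graph."""
    # Map each point's (x, y) coordinates to a flat grid-cell key.
    idx = np.floor(points[:, :2] / cell).astype(int)
    keys = idx[:, 0] * grid_size + idx[:, 1]
    feats = {}
    for k in np.unique(keys):  # iterate over occupied cells only
        feats[int(k)] = points[keys == k].mean(axis=0)
    # Occupancy ratio: fraction of cells that actually required computation.
    occupancy = len(feats) / grid_size ** 2
    return feats, occupancy
```

Because empty cells are never visited, compute and memory traffic scale with the number of occupied cells rather than the full grid resolution, which is the general motivation behind sparsity-aware accelerators for this workload.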