A Scalable BEV Perception Processor for Image/Point Cloud Fusion Applications Using CAM-Based Universal Mapping Unit

Cited by: 1

Authors:

Feng, Xiaoyu [1]

Lin, Xinyuan [1]

Yang, Huazhong [1]

Liu, Yongpan [1]

Sun, Wenyu [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China

Source:

IEEE JOURNAL OF SOLID-STATE CIRCUITS | 2025 / Vol. 60 / No. 03

Keywords:

Point cloud compression; Arrays; Three-dimensional displays; Semantics; Parallel processing; Topology; Feature extraction; Cloud computing; Autonomous vehicles; Network topology; Bird's eye view (BEV); chip-level parallelism; content addressable memory (CAM); diverse memory access; multi-modal fusion

DOI:

10.1109/JSSC.2024.3514733

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Integrating multi-sensor data such as images and point clouds, so that the sensors complement one another, is crucial for 3-D perception scenarios such as autonomous driving. Bird's eye view (BEV)-based sensor fusion has recently attracted growing attention, but its significant computational overhead constrains widespread deployment at the edge. First, BEV fusion networks contain numerous irregular memory access operations. For example, sparse convolutions (SCONVs) in the point cloud branch and irregular BEV plane mapping incur significant memory addressing and mapping overhead. Second, multi-sensor fusion rapidly expands model size, making it difficult and expensive for single-chip solutions to meet the demand. To address these challenges, this work proposes an image and point cloud fusion processor with two highlights: a content addressable memory (CAM)-based deep fusion core that accelerates a variety of irregular BEV operations, and a chip-level parallelism design that supports flexible interconnect topologies. The proposed chip is fabricated in 28-nm CMOS technology. Compared with existing image or point cloud accelerators, it achieves higher frequency, 2x higher area efficiency, and 2.61x higher energy efficiency for sparse point cloud processing. To the best of the authors' knowledge, this work is the first accelerator for BEV-based multi-modal fusion networks.
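The addressing overhead that the abstract attributes to sparse convolution can be illustrated with a small software sketch. It is not the chip's design: a Python dict stands in for a content-addressed lookup (coordinate in, index out), and the function names `build_pairs` and `sparse_conv`, the scalar features, and the 3x3x3 kernel are all assumptions made for illustration. Outputs are computed only at active input sites, which is one common sparse-convolution variant.

```python
from itertools import product


def build_pairs(active_coords, kernel_size=3):
    """For each kernel offset, list the (input_index, output_index) pairs.

    Assumption: outputs exist only at active input sites.
    """
    # Software stand-in for a content-addressed lookup: coordinate -> index.
    index_of = {c: i for i, c in enumerate(active_coords)}
    r = kernel_size // 2
    offsets = list(product(range(-r, r + 1), repeat=3))
    pairs = {off: [] for off in offsets}
    for out_idx, (x, y, z) in enumerate(active_coords):
        for dx, dy, dz in offsets:
            # Irregular access: the neighbor may or may not be active.
            in_idx = index_of.get((x + dx, y + dy, z + dz))
            if in_idx is not None:
                pairs[(dx, dy, dz)].append((in_idx, out_idx))
    return pairs


def sparse_conv(features, pairs, weights):
    """Scalar-feature sparse convolution; weights maps offset -> float."""
    out = [0.0] * len(features)
    for off, pair_list in pairs.items():
        w = weights.get(off, 0.0)
        for in_idx, out_idx in pair_list:
            out[out_idx] += w * features[in_idx]
    return out


# Three active voxels: two adjacent, one isolated.
coords = [(0, 0, 0), (1, 0, 0), (5, 5, 5)]
pairs = build_pairs(coords)
feats = [1.0, 2.0, 4.0]
ones = {off: 1.0 for off in pairs}  # all-ones kernel for easy checking
result = sparse_conv(feats, pairs, ones)
print(result)  # [3.0, 3.0, 4.0]
```

Each output site issues 27 coordinate lookups, most of which miss in a sparse scene. That lookup step, rather than the multiply-accumulate, is the irregular part of the workload in this sketch.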

Pages: 1002-1013

Page count: 12
