Hardware-efficient algorithm and architecture design with memory and complexity reduction for semi-global matching

Cited by: 5
Authors
Chang, Cheng-Tsung [1]
Chen, Pin-Wei [1]
Chin, Wen-Long [1]
Chou, Shih-Hsiang [1]
Yang, Yu-Hua [1]
Affiliation
[1] National Cheng Kung University, Department of Engineering Science, 1 University Road, Tainan, Taiwan
Keywords
Stereo matching; Depth estimation; Semi-global matching; Hardware design; HIGH-THROUGHPUT; STEREO; SYSTEM;
DOI
10.1016/j.vlsi.2023.05.005
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Among stereo matching algorithms, semi-global matching (SGM) is an efficient and highly accurate method. However, its heavy memory-access demand and high computational complexity make real-time, efficient processing in hardware difficult. Exploiting the spatial redundancy found in the matching cost, we propose techniques that reduce the on-chip and off-chip memory requirements while greatly lowering the computational complexity. Experimental results show that the proposed SGM algorithm reduces the computational complexity by 71%-74% while producing disparity maps of almost the same quality as the original 8-path SGM. The proposed 3-path fully pipelined architecture is implemented on the Xilinx VCU-106 with a throughput of 1920 × 1080 at 54 fps. We also synthesize it and lay it out with the TSMC 40 nm standard library, yielding an area of 8.1 mm² and a throughput of 1920 × 1080 at 192 fps. The proposed design reaches up to 50,960 million disparity estimations per second (MDE/s), which outperforms conventional ASIC implementations.
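The MDE/s figure can be sanity-checked with a short calculation. The sketch below uses the common definition MDE/s = width × height × frame rate × disparity levels / 10^6. The disparity range of 128 levels is an assumption: the abstract does not state it, and it is chosen here only because it reproduces the reported figure for the 192 fps case. The function name `mde_per_second` is hypothetical.

```python
def mde_per_second(width: int, height: int, fps: float, disparity_levels: int) -> float:
    """Return throughput in million disparity estimations per second (MDE/s)."""
    return width * height * fps * disparity_levels / 1e6


# Assumption: 128 disparity levels (not stated in the abstract).
ASSUMED_DISPARITY_LEVELS = 128

# Resolution and frame rates are taken from the abstract.
asic_mde = mde_per_second(1920, 1080, 192, ASSUMED_DISPARITY_LEVELS)
fpga_mde = mde_per_second(1920, 1080, 54, ASSUMED_DISPARITY_LEVELS)

print(f"ASIC (192 fps): {asic_mde:,.1f} MDE/s")
print(f"FPGA (54 fps):  {fpga_mde:,.1f} MDE/s")
```

Under that assumption, the 192 fps case comes to roughly 50,960.8 MDE/s, consistent with the 50,960 quoted in the abstract. The 54 fps value is only an extrapolation under the same assumed disparity range and is not a figure reported in the abstract.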
Pages: 99-105
Page count: 7