3D reconstruction from endoscopy images: A survey

Cited by: 9
Authors
Yang Z. [1 ,2 ]
Dai J. [2 ]
Pan J. [1 ,2 ]
Affiliations
[1] State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing
[2] Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, 518000, Guangdong Province
Funding
National Natural Science Foundation of China;
Keywords
3D reconstruction; Depth estimation; Endoscopy; Feature matching; Scene representation; SLAM;
DOI
10.1016/j.compbiomed.2024.108546
Abstract
Three-dimensional reconstruction of images acquired through endoscopes plays a vital role in a growing number of medical applications. Endoscopes used in the clinic are commonly classified as monocular or binocular, and we review depth estimation methods according to the type of endoscope. Classically, depth estimation relies on feature matching between images and on multi-view geometry; however, these traditional techniques face many problems in the endoscopic environment, such as inconsistent illumination and texture sparsity, and a growing number of works based on deep learning aim to address these challenges. We review over 170 papers published in the ten years from 2013 to 2023, summarize the commonly used public datasets and performance metrics, give a taxonomy of methods, and analyze the advantages and drawbacks of the algorithms. Summary tables and a results atlas facilitate the comparison of qualitative and quantitative performance of the methods in each category. In addition, we summarize the scene representation methods commonly used in endoscopy and speculate on the prospects of depth estimation research in medical applications. We also compare the robustness, processing time, and scene representation of the methods to help clinicians and researchers select appropriate methods for their surgical applications. © 2024 Elsevier Ltd
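The abstract notes that classical depth estimation rests on matched image features plus multi-view geometry. A minimal NumPy sketch of the geometric core, linear (DLT) triangulation of one point from two calibrated views, is shown below; the intrinsics, poses, and point are illustrative values chosen here, not taken from the survey.

```python
import numpy as np

# Illustrative camera intrinsics and a second view displaced by a
# 10 cm baseline (pure translation), mimicking two endoscope frames.
K = np.array([[500.0, 0.0, 160.0],
              [0.0, 500.0, 120.0],
              [0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([[-0.1], [0.0], [0.0]])

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # projection, view 1
P2 = K @ np.hstack([R, t])                          # projection, view 2

X_true = np.array([0.05, -0.02, 0.8])  # a point 0.8 m in front

def project(P, X):
    """Project a 3-D point with a 3x4 projection matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# In a real pipeline these pixel coordinates would come from feature
# matching (e.g. ORB, ref. [58]); here we project the known point.
x1, x2 = project(P1, X_true), project(P2, X_true)

def triangulate(P1, P2, x1, x2):
    """Linear triangulation: stack x x (P X) = 0 rows, solve by SVD."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize

X_est = triangulate(P1, P2, x1, x2)
print(np.round(X_est, 4))  # recovers [ 0.05 -0.02  0.8 ]
```

With noise-free correspondences the linear solution is exact; with real matches, the texture sparsity and illumination problems the survey discusses degrade the correspondences and hence the recovered depth.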
References (170 in total)
[51]  
Mur-Artal R., Montiel J.M.M., Tardos J.D., ORB-SLAM: a versatile and accurate monocular SLAM system, IEEE Trans. Robot., 31, 5, pp. 1147-1163, (2015)
[52]  
Sturm J., Engelhard N., Endres F., Burgard W., Cremers D., A benchmark for the evaluation of RGB-D SLAM systems, IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573-580, (2012)
[53]  
Zhou T., Brown M., Snavely N., Lowe D.G., Unsupervised learning of depth and ego-motion from video, IEEE Conference on Computer Vision and Pattern Recognition, pp. 1851-1858, (2017)
[54]  
Horn B.K., Closed-form solution of absolute orientation using unit quaternions, JOSA A, 4, 4, pp. 629-642, (1987)
[55]  
Zhou Z., Fan X., Shi P., Xin Y., R-MSFM: Recurrent Multi-Scale Feature Modulation for Monocular Depth Estimating, International Conference on Computer Vision, pp. 12757-12766, (2021)
[56]  
Lowe D.G., Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis., 60, 2, pp. 91-110, (2004)
[57]  
Bay H., Tuytelaars T., Van Gool L., SURF: Speeded up robust features, European Conference on Computer Vision, pp. 404-417, (2006)
[58]  
Rublee E., Rabaud V., Konolige K., Bradski G., ORB: An efficient alternative to SIFT or SURF, International Conference on Computer Vision, pp. 2564-2571, (2011)
[59]  
Calonder M., Lepetit V., Strecha C., Fua P., BRIEF: Binary robust independent elementary features, European Conference on Computer Vision, pp. 778-792, (2010)
[60]  
Dong J., Soatto S., Domain-size pooling in local descriptors: DSP-SIFT, IEEE Conference on Computer Vision and Pattern Recognition, pp. 5097-5106, (2015)