A deep learning framework for quality assessment and restoration in video endoscopy

Cited by: 65
Authors
Ali, Sharib [1 ,2 ,5 ,6 ]
Zhou, Felix [3 ,5 ]
Bailey, Adam [4 ,5 ,6 ]
Braden, Barbara [4 ,5 ,6 ]
East, James E. [4 ,5 ,6 ]
Lu, Xin [3 ,5 ,6 ]
Rittscher, Jens [1 ,2 ,5 ]
Affiliations
[1] Inst Biomed Engn, Oxford, England
[2] Big Data Inst, Oxford, England
[3] Ludwig Inst Canc Res, Oxford, England
[4] John Radcliffe Hosp, Div Expt Med, Translat Gastroenterol Unit, Oxford, England
[5] Univ Oxford, Old Rd Campus, Oxford, England
[6] Oxford NIHR Biomed Res Ctr, Oxford, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Video endoscopy; Multi-class artifact detection; Multi-class artifact segmentation; Convolution neural networks; Frame restoration;
DOI
10.1016/j.media.2020.101900
CLC Classification Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. Artifacts such as motion blur, bubbles, specular reflections, floating objects and pixel saturation impede the visual interpretation and the automated analysis of endoscopy videos. Given the widespread use of endoscopy in different clinical applications, robust and reliable identification of such artifacts and the automated restoration of corrupted video frames is a fundamental medical imaging problem. Existing state-of-the-art methods only deal with the detection and restoration of selected artifacts. However, endoscopy videos typically contain numerous artifacts, which motivates the development of a comprehensive solution. In this paper, a fully automatic framework is proposed that can: 1) detect and classify six different artifacts, 2) segment artifact instances that have indefinable shapes, 3) provide a quality score for each frame, and 4) restore partially corrupted frames. To detect and classify different artifacts, the proposed framework exploits a fast, multi-scale, single-stage convolutional neural network detector. In addition, we use an encoder-decoder model for pixel-wise segmentation of irregularly shaped artifacts. A quality score is introduced to assess video frame quality and to predict image restoration success. Finally, generative adversarial networks with carefully chosen regularization and training strategies for the discriminator and generator networks are used to restore corrupted frames. The detector yields the highest mean average precision (mAP) of 45.7 and 34.7 for IoU thresholds of 25% and 50%, respectively, and the lowest computational time of 88 ms, allowing near real-time processing. The restoration models for blind deblurring, saturation correction and inpainting demonstrate significant improvements over previous methods. On a set of 10 test videos, an average of 68.7% of video frames successfully passed the quality score (>= 0.9) after applying the proposed restoration framework, thereby retaining 25% more frames than the raw videos. The importance of artifact detection and restoration for improving the robustness of image analysis methods is also demonstrated in this work. (C) 2020 The Authors. Published by Elsevier B.V.
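The abstract describes gating frames on a quality score (>= 0.9) and attempting restoration of partially corrupted frames before discarding them. The following is a minimal sketch of that gating logic only; it is not the authors' code, and `score_frame` and `restore_frame` are hypothetical placeholders standing in for the paper's quality-scoring and GAN-based restoration models.

```python
# Minimal sketch (assumed, illustrative only) of quality-score gating with
# restoration fallback, using the 0.9 threshold reported in the abstract.
from typing import Callable, List
import numpy as np

QUALITY_THRESHOLD = 0.9  # frame quality threshold reported in the abstract

def filter_and_restore(
    frames: List[np.ndarray],
    score_frame: Callable[[np.ndarray], float],      # hypothetical quality-score model
    restore_frame: Callable[[np.ndarray], np.ndarray] # hypothetical restoration model
) -> List[np.ndarray]:
    """Keep frames that pass the quality threshold; otherwise attempt
    restoration (e.g. deblurring, saturation correction, inpainting) and
    keep the restored frame only if it then passes the threshold."""
    kept = []
    for frame in frames:
        if score_frame(frame) >= QUALITY_THRESHOLD:
            kept.append(frame)            # frame passes as-is
            continue
        restored = restore_frame(frame)   # try to recover the corrupted frame
        if score_frame(restored) >= QUALITY_THRESHOLD:
            kept.append(restored)         # restoration succeeded
    return kept
```

Under this reading, the reported result is that such gating retains roughly 25% more frames after restoration than when applied to the raw videos.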
Pages: 25