Multi-Instance Deep Learning: Discover Discriminative Local Anatomies for Bodypart Recognition

Cited by: 144
Authors
Yan, Zhennan [1]
Zhan, Yiqiang [2]
Peng, Zhigang [2]
Liao, Shu [2]
Shinagawa, Yoshihisa [2]
Zhang, Shaoting [3]
Metaxas, Dimitris N. [1]
Zhou, Xiang Sean [2]
Affiliations
[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
[2] Siemens Healthcare, Malvern, PA 19355 USA
[3] Univ N Carolina, Dept Comp Sci, Charlotte, NC 28223 USA
Funding
National Science Foundation (USA);
Keywords
CNN; discriminative local information discovery; multi-instance; multi-stage; IMAGE FEATURES; GRADIENTS;
DOI
10.1109/TMI.2016.2524985
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In general image recognition problems, discriminative information often lies in local image patches. For example, most human identity information exists in the image patches containing human faces. The same holds for medical images. The "bodypart identity" of a transversal slice (which bodypart the slice comes from) is often indicated by local image information; e.g., a cardiac slice and an aortic arch slice are differentiated only by the mediastinum region. In this work, we design a multi-stage deep learning framework for image classification and apply it to bodypart recognition. Specifically, the proposed framework aims to: 1) discover the local regions that are discriminative and non-informative to the image classification problem, and 2) learn an image-level classifier based on these local regions. These two tasks are handled by the two stages of the learning scheme, respectively. In the pre-train stage, a convolutional neural network (CNN) is learned in a multi-instance learning fashion to extract the most discriminative and non-informative local patches from the training slices. In the boosting stage, the pre-learned CNN is further boosted by these local patches for image classification. The CNN learned by exploiting the discriminative local appearances becomes more accurate than those learned from global image context. The hallmark of our method is that it automatically discovers the discriminative and non-informative local patches through multi-instance deep learning. Thus, no manual annotation is required. Our method is validated on a synthetic dataset and a large-scale CT dataset. It achieves better performance than state-of-the-art approaches, including the standard deep CNN.
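The multi-instance idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes one common multi-instance formulation, in which each slice is a bag of patches, a classifier scores every patch, and the slice-level loss is computed only on the patch that responds most strongly to the slice's true label. The function names (`most_discriminative_patch`, `multi_instance_loss`) and the toy logits are hypothetical, and the per-patch scores stand in for CNN outputs.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis (one row per patch).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def most_discriminative_patch(patch_logits, true_class):
    # Assumption: the most discriminative patch is the one with the
    # highest predicted probability for the slice's true class.
    probs = softmax(patch_logits)
    best = int(np.argmax(probs[:, true_class]))
    return best, float(probs[best, true_class])

def multi_instance_loss(patch_logits, true_class):
    # Negative log-likelihood of the true class on the selected patch only;
    # all other patches in the bag are ignored for this slice.
    _, prob = most_discriminative_patch(patch_logits, true_class)
    return -np.log(prob)

# Toy bag: 3 patches from one slice, 3 candidate bodypart classes.
# These numbers are made up and stand in for per-patch CNN outputs.
patch_logits = np.array([
    [0.1, 0.2, 0.0],
    [0.0, 3.0, 0.1],
    [0.3, 0.1, 0.2],
])
best, prob = most_discriminative_patch(patch_logits, true_class=1)
loss = multi_instance_loss(patch_logits, true_class=1)
```

In this toy bag the second patch (index 1) is selected, because it is the only one that responds strongly to class 1. Only the slice-level label is needed to make that choice, which is consistent with the abstract's statement that no manual patch annotation is required.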
Pages: 1332-1343
Page count: 12