Learning Markov Clustering Networks for Scene Text Detection

Cited by: 67

Authors:

Liu, Zichuan [1]

Lin, Guosheng [1]

Yang, Sheng [1]

Feng, Jiashi [2]

Lin, Weisi [1]

Goh, Wang Ling [1]

Affiliations:

[1] Nanyang Technol Univ, Singapore, Singapore

[2] Natl Univ Singapore, Singapore, Singapore

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Keywords:

LOCALIZATION;

DOI:

10.1109/CVPR.2018.00725

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A novel framework named Markov Clustering Network (MCN) is proposed for fast and robust scene text detection. MCN predicts instance-level bounding boxes by first converting an image into a Stochastic Flow Graph (SFG) and then performing Markov Clustering on this graph. Our method can detect text objects with arbitrary size and orientation without prior knowledge of object size. The stochastic flow graph encodes objects' local correlation and semantic information. An object is modeled as strongly connected nodes, which allows flexible bottom-up detection for scale-varying and rotated objects. MCN generates bounding boxes without using Non-Maximum Suppression, and it can be fully parallelized on GPUs. The evaluation on public benchmarks shows that our method outperforms the existing methods by a large margin in detecting multi-oriented text objects. MCN achieves new state-of-the-art performance on the challenging MSRA-TD500 dataset, with a precision of 0.88, a recall of 0.79 and an F-score of 0.83. MCN also achieves real-time inference at a frame rate of 34 FPS, a 1.5x speedup over the fastest scene text detection algorithm.
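The clustering step named in the abstract can be illustrated with a minimal sketch of the generic Markov Clustering procedure (alternating expansion and inflation on a column-stochastic matrix). This is not the MCN implementation: the hand-written adjacency matrix stands in for the network-predicted stochastic flow graph, and the function names `normalize_columns`, `matmul` and `markov_cluster`, along with the `inflation`, `iterations` and `eps` parameters, are assumptions made for illustration only.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total > 0 else 0.0
    return out


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    """Generic Markov Clustering on a weighted adjacency matrix.

    Hypothetical helper for illustration; parameter values are assumptions.
    """
    n = len(adjacency)
    # Self-loops keep every column non-empty and damp oscillation.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)  # expansion: flow spreads along two-step walks
        # inflation: strong flows are boosted, weak flows decay
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Rows that still carry flow act as attractors; the columns they
    # reach form one cluster. Identical clusters are merged by the set.
    clusters = set()
    for i in range(n):
        members = tuple(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(list(c) for c in clusters)


# Two tightly connected groups of nodes joined by one weak link.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(graph))
```

In this toy setting each group of strongly connected nodes plays the role of one text instance. How MCN builds the flow graph from an image and turns clusters into bounding boxes is described in the paper itself and is not reproduced here.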


Pages: 6936-6944

Page count: 9
