TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting

Cited by: 156
Authors
Feng, Wei [1, 2]
He, Wenhao [1, 2]
Yin, Fei [1, 2]
Zhang, Xu-Yao [1, 2]
Liu, Cheng-Lin [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing 100190, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ICCV.2019.00917
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most existing text spotting methods either focus on horizontal/oriented texts or perform arbitrary shaped text spotting with character-level annotations. In this paper, we propose a novel text spotting framework to detect and recognize text of arbitrary shapes in an end-to-end manner, using only word/line-level annotations for training. Motivated from the name of TextSnake [32], which is only a detection model, we call the proposed text spotting framework TextDragon. In TextDragon, a text detector is designed to describe the shape of text with a series of quadrangles, which can handle text of arbitrary shapes. To extract arbitrary text regions from feature maps, we propose a new differentiable operator named RoISlide, which is the key to connect arbitrary shaped text detection and recognition. Based on the extracted features through RoISlide, a CNN and CTC based text recognizer is introduced to make the framework free from labeling the location of characters. The proposed method achieves state-of-the-art performance on two curved text benchmarks CTW1500 and Total-Text, and competitive results on the ICDAR 2015 Dataset.
Pages: 9075-9084
Page count: 10
Related papers
49 entries in total
[1]  
[Anonymous], 2017, CORR
[2]  
[Anonymous], 2017, ARXIV170901727
[3]   PhotoOCR: Reading Text in Uncontrolled Conditions [J].
Bissacco, Alessandro ;
Cummins, Mark ;
Netzer, Yuval ;
Neven, Hartmut .
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, :785-792
[4]   Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [J].
Busta, Michal ;
Neumann, Lukas ;
Matas, Jiri .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :2223-2231
[5]   Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition [J].
Ch'ng, Chee Kheng ;
Chan, Chee Seng .
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, :935-942
[6]  
Chen H., 2011, 2011 18th IEEE International Conference on Image Processing (ICIP 2011), P2609, DOI 10.1109/ICIP.2011.6116200
[7]  
Deng D, 2018, AAAI CONF ARTIF INTE, P6773
[8]   TextProposals: A text-specific selective search algorithm for word spotting in the wild [J].
Gomez, Lluis ;
Karatzas, Dimosthenis .
PATTERN RECOGNITION, 2017, 70 :60-74
[9]  
Graves A, 2012, STUD COMPUT INTELL, V385, P1, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]
[10]  
Graves A., 2006, P 23 INT C MACH LEAR, P369