Visual Saliency and Terminology Extraction for Document Classification

Cited by: 0
Authors
Duthil, Benjamin [1]
Coustaty, Mickael [1]
Courboulay, Vincent [1]
Ogier, Jean-Marc [1]
Affiliations
[1] Univ La Rochelle, Lab Informat Image & Interact, F-17042 La Rochelle, France
Source
GRAPHICS RECOGNITION: CURRENT TRENDS AND CHALLENGES | 2014 / Vol. 8746
Keywords
Extraction - Information retrieval systems;
DOI
10.1007/978-3-662-44854-0_8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Document digitization has become a crucial economic issue in our society, and it is now necessary to be able to organize the resulting huge volume of documents. This paper proposes a new method to classify documents automatically, using a saliency-based segmentation process on the one hand and terminology extraction and annotation on the other. The saliency-based segmentation extracts salient regions, and thereby logos, while the terminology approach annotates them and automatically classifies the document. The approach does not require human expertise and uses Google Images as a knowledge base. Results obtained on a real database of 1766 documents show the relevance of the approach.
Pages: 96-108
Number of pages: 13