Deep Learning of Visual and Textual Data for Region Detection Applied to Item Coding

Authors
Arroyo, Roberto [1]
Tovar, Javier [1]
Delgado, Francisco J. [1]
Almazan, Emilio J. [1]
Serrador, Diego G. [1]
Hurtado, Antonio [1]
Affiliation
[1] Nielsen Connect AI, Calle Salvador de Madariaga 1, Madrid 28027, Spain
Source
PATTERN RECOGNITION AND IMAGE ANALYSIS, PT I | 2020 / Vol. 11867
Keywords
Deep learning; CNNs; OCR; Text-maps; Text regions detection; Item coding; Market studies;
DOI
10.1007/978-3-030-31332-6_29
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we propose a deep learning approach that combines visual appearance and text information in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that integrates seamlessly into a standard CNN architecture. This representation, referred to as a text-map, is combined with the actual image to provide a much richer input to the network. Text-maps are colored with different intensities depending on the relevance of the words recognized in the image. More specifically, the words are first extracted using Optical Character Recognition (OCR) and then colored according to their probability of belonging to a textual category of interest. The presented solution is especially relevant to item coding for supermarket products, where different types of textual categories must be identified (e.g., ingredients or nutritional facts). We evaluated our approach on the proprietary item coding dataset of Nielsen Brandbank, which is composed of more than 10,000 images for training and 2,000 images for testing. The reported results demonstrate that our method, which uses both visual and textual data, outperforms state-of-the-art algorithms based only on appearance, such as standard Faster R-CNN. Precision and recall improve by 42 and 33 points, respectively.
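The text-map idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_text_map` and `stack_with_image`, the `(x0, y0, x1, y1, prob)` word format, the use of the maximum where boxes overlap, and the choice to append the text-map as a fourth input channel are all assumptions made for illustration. The abstract states only that OCR words are colored by their category probability and that the text-map is combined with the image.

```python
import numpy as np


def build_text_map(image_shape, words):
    """Paint each OCR word box with its category probability.

    image_shape: (height, width, ...) of the source image.
    words: iterable of (x0, y0, x1, y1, prob) tuples in pixel coordinates,
           where prob is the (hypothetical) probability that the word
           belongs to the textual category of interest.
    """
    height, width = image_shape[:2]
    text_map = np.zeros((height, width), dtype=np.float32)
    for x0, y0, x1, y1, prob in words:
        # Clip the box to the image bounds and skip empty boxes.
        x0, x1 = max(0, x0), min(width, x1)
        y0, y1 = max(0, y0), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue
        # Assumption: overlapping boxes keep the highest probability.
        region = text_map[y0:y1, x0:x1]
        np.maximum(region, prob, out=region)
    return text_map


def stack_with_image(image, text_map):
    """Append the text-map to an RGB image as an extra channel (assumption)."""
    scaled = image.astype(np.float32) / 255.0
    return np.concatenate([scaled, text_map[..., None]], axis=-1)


# Usage with made-up OCR output for a 100x200 image.
image = np.zeros((100, 200, 3), dtype=np.uint8)
words = [(10, 20, 60, 40, 0.9), (50, 30, 120, 50, 0.2)]
text_map = build_text_map(image.shape, words)
network_input = stack_with_image(image, text_map)
```

In this sketch `network_input` has four channels, so a detector such as Faster R-CNN would need its first convolution adapted to accept it. Whether the original work uses extra channels or another fusion scheme is not stated in the abstract.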
Pages: 329 - 341
Number of pages: 13