Scene Description for Visually Impaired People with Multi-Label Convolutional SVM Networks

Cited by: 10
Authors
Bazi, Yakoub [1]
Alhichri, Haikel [1]
Alajlan, Naif [1]
Melgani, Farid [2]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Engn, Riyadh 11543, Saudi Arabia
[2] Univ Trento, Dept Informat Engn & Comp Sci, Via Sommarive 9, I-38123 Trento, Italy
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 23
Keywords
visually impaired (VI); computer vision; deep learning; multi-label convolutional support vector machine (M-CSVM); OBJECT DETECTION; RECOGNITION; AID;
DOI
10.3390/app9235062
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In this paper, we present a portable camera-based method for helping visually impaired (VI) people recognize multiple objects in images. The method relies on a novel multi-label convolutional support vector machine (CSVM) network that provides a coarse description of images. The core idea of CSVM is to use a set of linear SVMs as filter banks for feature map generation. During the training phase, the weights of the SVM filters are obtained with a forward supervised learning strategy, in contrast to the backpropagation algorithm used in standard convolutional neural networks (CNNs). To handle multi-label detection, we introduce a multi-branch CSVM architecture in which each branch detects one object in the image. This architecture exploits the correlation between the objects present in the image through an appropriate fusion mechanism applied to the intermediate outputs of the convolution layers of each branch. The high-level reasoning of the network is performed by binary classification SVMs that predict the presence or absence of each object in the image. Experiments on two indoor datasets and one outdoor dataset are reported and discussed. The images were acquired with a portable camera mounted on a lightweight shield worn by the user and connected via a USB cable to a laptop processing unit.
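The idea of using a linear SVM as a convolution filter can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the subgradient-descent SVM solver, the rule that every patch inherits the label of its image, the ReLU applied to the SVM scores, and the synthetic bright/dark images are all assumptions made for illustration only.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent.
    X: (n, d) flattened patches, y: (n,) labels in {-1, +1}.
    Hypothetical helper; any linear SVM solver could be used instead."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples that violate the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def extract_patches(img, k):
    """Return all k x k patches of a 2-D image (stride 1, no padding), flattened."""
    h, wd = img.shape
    return np.array([img[i:i + k, j:j + k].ravel()
                     for i in range(h - k + 1)
                     for j in range(wd - k + 1)])

def svm_feature_map(img, w, b, k):
    """Slide the trained SVM over the image like a convolution filter.
    The ReLU on the decision values is an assumption of this sketch."""
    h, wd = img.shape
    scores = extract_patches(img, k) @ w + b
    return np.maximum(scores, 0.0).reshape(h - k + 1, wd - k + 1)

# Synthetic stand-in data: "object present" images are bright, others dark.
rng = np.random.default_rng(0)
k = 3
pos = [rng.normal(1.0, 0.1, (8, 8)) for _ in range(5)]
neg = [rng.normal(-1.0, 0.1, (8, 8)) for _ in range(5)]

# Forward supervised step (assumed form): each patch takes its image's label,
# so the filter weights are learned directly, without backpropagation.
X = np.vstack([extract_patches(im, k) for im in pos + neg])
patches_per_image = (8 - k + 1) ** 2
y = np.concatenate([np.ones(5 * patches_per_image),
                    -np.ones(5 * patches_per_image)])

w, b = train_linear_svm(X, y)
fmap_pos = svm_feature_map(pos[0], w, b, k)
fmap_neg = svm_feature_map(neg[0], w, b, k)
```

In this sketch one SVM yields one feature map. A filter bank would train several such SVMs, and a multi-branch network would repeat the construction once per object class.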
Pages: 13