TapTell: Interactive visual search for mobile task recommendation

Cited by: 4
Authors
Zhang, Ning [1 ]
Mei, Tao [2 ]
Hua, Xian-Sheng [3 ]
Guan, Ling [1 ]
Li, Shipeng [2 ]
Affiliations
[1] Ryerson Univ, Ryerson Multimedia Res Lab, Toronto, ON M5B 2K3, Canada
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
[3] Microsoft Res, Redmond, WA 98052 USA
Keywords
Visual intent; Mobile visual search; Interactive visual search; Image retrieval; Mobile recommendation; Natural user interface; Mobile user intention; Visual vocabulary
DOI
10.1016/j.jvcir.2015.02.007
CLC classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Mobile devices are becoming ubiquitous. People use them as a personal concierge to search for information and make decisions. Therefore, understanding user intent and subsequently providing meaningful and personalized suggestions is important. While existing efforts have predominantly focused on understanding the intent expressed by a textual or voice query, this paper presents a new and alternative perspective that understands user intent visually, i.e., via the visual signal captured by the built-in camera. We call this kind of intent "visual intent," as it can be naturally expressed through a visual form. To accomplish the discovery of visual intent on the phone, we developed TapTell, an exemplary real application on Windows Phone 7, which takes advantage of user interaction and rich context to enable interactive visual search and contextual recommendation. Through the TapTell system, a mobile user can take a photo and indicate an object of interest within the photo via different drawing patterns. The system then performs search-based recognition using a proposed large-scale context-embedded vocabulary tree. Finally, contextually relevant entities (i.e., local businesses) are recommended to the user for completing mobile tasks (tasks that naturally arise and are executed when the user is on a mobile device). We evaluated TapTell in a variety of scenarios with millions of images and compared our results to state-of-the-art algorithms for image retrieval. (C) 2015 Elsevier Inc. All rights reserved.
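The search-based recognition step mentioned above relies on a vocabulary tree over local image descriptors. As a rough illustration of how such retrieval works in general (a minimal sketch in the spirit of standard vocabulary-tree retrieval, not the paper's actual context-embedded implementation; all class and function names here are hypothetical), descriptors are quantized by descending a tree of cluster centers, the leaves act as visual words, and database images are ranked by TF-IDF-weighted cosine similarity:

```python
import numpy as np

class VocabularyTree:
    """Toy hierarchical quantizer: leaves are visual words."""

    def __init__(self, branch=4, depth=2, dim=8, seed=0):
        # Real systems train per-node centers with hierarchical k-means on
        # millions of descriptors; random centers stand in here.
        rng = np.random.default_rng(seed)
        self.branch, self.depth = branch, depth
        self.centers = {}  # node path (tuple) -> (branch, dim) child centers

        def build(path):
            if len(path) == depth:
                return
            self.centers[path] = rng.normal(size=(branch, dim))
            for b in range(branch):
                build(path + (b,))

        build(())
        self.n_words = branch ** depth

    def word_id(self, desc):
        # Greedy descent: at each level pick the nearest child center.
        path = ()
        for _ in range(self.depth):
            dists = np.linalg.norm(self.centers[path] - desc, axis=1)
            path = path + (int(np.argmin(dists)),)
        wid = 0
        for b in path:  # encode the leaf path as an integer word id
            wid = wid * self.branch + b
        return wid

    def histogram(self, descriptors):
        # Bag-of-visual-words histogram for one image.
        h = np.zeros(self.n_words)
        for d in descriptors:
            h[self.word_id(d)] += 1.0
        return h


def rank_by_tfidf(db_hists, query_hist):
    # TF-IDF weighting (smoothed IDF), then cosine similarity per image.
    db = np.asarray(db_hists, dtype=float)
    df = (db > 0).sum(axis=0)
    idf = np.log(1.0 + len(db) / np.maximum(df, 1))
    dbw = db * idf
    dbw /= np.linalg.norm(dbw, axis=1, keepdims=True) + 1e-12
    qw = query_hist * idf
    qw /= np.linalg.norm(qw) + 1e-12
    return dbw @ qw  # one similarity score per database image
```

In a real deployment the database side would be an inverted index from visual words to images rather than dense histograms, which is what makes the approach scale to the millions of images the paper evaluates against.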
Pages: 114-124
Number of pages: 11