Reliability and effectiveness of clickthrough data for automatic image annotation

Cited by: 0
Authors
Theodora Tsikrika
Christos Diou
Arjen P. de Vries
Anastasios Delopoulos
Affiliations
[1] Centrum Wiskunde & Informatica
[2] Multimedia Understanding Group, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki
[3] Informatics and Telematics Institute, Centre for Research and Technology Hellas
[4] Delft University of Technology
Source
Multimedia Tools and Applications | 2011 / Volume 55
Keywords
Image annotation; Concepts; Supervised learning; Search logs; Clickthrough data; Collective knowledge; Implicit feedback; Reliability
DOI
Not available
Abstract
Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the expensive manual annotation effort. We investigate and evaluate this approach using a collection of 97,628 photographic images. The results indicate that the contribution of search log based training data is positive despite their inherent noise; in particular, the combination of manual and automatically generated training data outperforms the use of manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance, using only the automatically generated training data. An extensive presentation of the experimental results and the accompanying data can be accessed at http://olympus.ee.auth.gr/~diou/civr2009/.
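As a rough illustration of the idea described in the abstract, the sketch below derives positive training examples for a concept from click records. It assumes that an image clicked for a query containing the concept name can be treated as a noisy positive example. The log format, the `min_clicks` threshold, and the function name `positives_from_clicks` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def positives_from_clicks(click_log, concept, min_clicks=1):
    """Return ids of images clicked at least `min_clicks` times for queries
    that contain `concept` as a whole word (case-insensitive).

    This is an assumed, simplified selection rule for illustration only.
    """
    concept = concept.lower()
    counts = Counter(
        image_id
        for query, image_id in click_log
        if concept in query.lower().split()
    )
    return sorted(img for img, n in counts.items() if n >= min_clicks)


# Hypothetical (query, clicked_image_id) records, invented for this example.
log = [
    ("beach sunset", "img_01"),
    ("Beach", "img_01"),
    ("beach volleyball", "img_02"),
    ("city skyline", "img_03"),
]

print(positives_from_clicks(log, "beach"))                # ['img_01', 'img_02']
print(positives_from_clicks(log, "beach", min_clicks=2))  # ['img_01']
```

Raising `min_clicks` trades the number of training examples for lower label noise, which is one simple way to explore the reliability question the abstract raises.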
Pages: 27-52
Page count: 25