Every Picture Tells a Story: Generating Sentences from Images

Cited by: 665

Authors:

Farhadi, Ali [1]

Hejrati, Mohsen [2]

Sadeghi, Mohammad Amin [2]

Young, Peter [1]

Rashtchian, Cyrus [1]

Hockenmaier, Julia [1]

Forsyth, David [1]

Affiliations:

[1] Univ Illinois, Dept Comp Sci, 1304 W Springfield Ave, Urbana, IL 61801 USA

[2] Inst Studies Theoret Phys & Math, Schl Math, Computer Vision Grp, Tehran, Iran

Source:

COMPUTER VISION-ECCV 2010, PT IV | 2010 / Vol. 6314

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

10.1007/978-3-642-15561-1_2

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Humans can prepare concise descriptions of pictures, focusing on what they find important. We demonstrate that automatic methods can do so too. We describe a system that can compute a score linking an image to a sentence. This score can be used to attach a descriptive sentence to a given image, or to obtain images that illustrate a given sentence. The score is obtained by comparing an estimate of meaning obtained from the image to one obtained from the sentence. Each estimate of meaning comes from a discriminative procedure that is learned using data. We evaluate on a novel dataset consisting of human-annotated images. While our underlying estimate of meaning is impoverished, it is sufficient to produce very good quantitative results, evaluated with a novel score that can account for synecdoche.
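The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that an estimate of meaning is a triple of (object, action, scene) and that the score is the fraction of matching slots. The names `Meaning`, `score`, `best_sentence`, and `best_image`, and all example data, are hypothetical. In the paper the meaning estimates come from learned discriminative procedures, which are replaced here by hand-written values.

```python
from typing import Dict, NamedTuple


class Meaning(NamedTuple):
    """Hypothetical meaning estimate: an (object, action, scene) triple."""
    obj: str
    action: str
    scene: str


def score(image_meaning: Meaning, sentence_meaning: Meaning) -> float:
    """Toy score: fraction of slots on which the two estimates agree."""
    matches = sum(a == b for a, b in zip(image_meaning, sentence_meaning))
    return matches / len(image_meaning)


def best_sentence(image_meaning: Meaning, sentences: Dict[str, Meaning]) -> str:
    """Attach a sentence to an image: pick the highest-scoring sentence."""
    return max(sentences, key=lambda s: score(image_meaning, sentences[s]))


def best_image(sentence_meaning: Meaning, images: Dict[str, Meaning]) -> str:
    """Illustrate a sentence: pick the highest-scoring image."""
    return max(images, key=lambda i: score(images[i], sentence_meaning))


# Hand-written stand-ins for the learned meaning estimates (assumed data).
sentences = {
    "A person rides a horse in a field.": Meaning("horse", "ride", "field"),
    "A bus is parked on the street.": Meaning("bus", "park", "street"),
}
images = {
    "img_horse.jpg": Meaning("horse", "ride", "field"),
    "img_bus.jpg": Meaning("bus", "park", "street"),
}

caption = best_sentence(images["img_bus.jpg"], sentences)
illustration = best_image(sentences["A person rides a horse in a field."], images)
```

The same score serves both directions, which mirrors the abstract's claim that one image-sentence score supports both annotation and retrieval.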

Pages: 15 / +

Number of pages: 3
