Mining Text Snippets for Images on the Web

Cited by: 2
Authors
Kannan, Anitha [1]
Baker, Simon [1]
Ramnath, Krishnan [1]
Fiss, Juliet [2]
Lin, Dahua [3]
Vanderwende, Lucy [1]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] TTI Chicago, Chicago, IL USA
Source
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14) | 2014
Keywords
Text mining for images; Text snippets; Interestingness; Relevance; Diversity; Browsing; Semantic image browsing; Web image augmentation
DOI
10.1145/2623330.2623346
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Images are often used to convey many different concepts or illustrate many different stories. We propose an algorithm to mine multiple diverse, relevant, and interesting text snippets for images on the web. Our algorithm scales to all images on the web. For each image, all webpages that contain it are considered. The top-K text snippet selection problem is posed as combinatorial subset selection with the goal of choosing an optimal set of snippets that maximizes a combination of relevancy, interestingness, and diversity. The relevancy and interestingness are scored by machine learned models. Our algorithm is run at scale on the entire image index of a major search engine resulting in the construction of a database of images with their corresponding text snippets. We validate the quality of the database through a large-scale comparative study. We showcase the utility of the database through two web-scale applications: (a) augmentation of images on the web as webpages are browsed and (b) an image browsing experience (similar in spirit to web browsing) that is enabled by interconnecting semantically related images (which may not be visually related) through shared concepts in their corresponding text snippets.
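The abstract frames top-K snippet selection as choosing a subset that balances relevancy, interestingness, and diversity, but it does not give the objective or the optimizer. The following is only a minimal sketch under stated assumptions: a greedy, marginal-gain selection (not necessarily the authors' method), precomputed relevance and interestingness scores standing in for the machine-learned models, and word-overlap (Jaccard) similarity as a stand-in diversity measure. The names `select_top_k`, `lam`, and `mu` are hypothetical.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a real snippet representation."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_top_k(
    snippets: List[str],
    relevance: List[float],   # assumed output of a relevance model
    interest: List[float],    # assumed output of an interestingness model
    k: int,
    lam: float = 1.0,         # hypothetical weight on interestingness
    mu: float = 1.0,          # hypothetical penalty weight on redundancy
) -> List[int]:
    """Greedily pick up to k snippet indices, trading score against redundancy."""
    tokens = [tokenize(s) for s in snippets]
    chosen: List[int] = []
    remaining = set(range(len(snippets)))
    while remaining and len(chosen) < k:

        def gain(i: int) -> float:
            # Redundancy is the similarity to the closest already-chosen snippet.
            redundancy = max(
                (jaccard(tokens[i], tokens[j]) for j in chosen), default=0.0
            )
            return relevance[i] + lam * interest[i] - mu * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Illustrative, made-up data: snippets 0 and 1 are near duplicates.
example_snippets = [
    "the eiffel tower was built in 1889",
    "the eiffel tower was built in 1889 in paris",
    "gustave eiffel also designed part of the statue of liberty",
]
example_relevance = [0.9, 0.9, 0.7]
example_interest = [0.8, 0.7, 0.9]

print(select_top_k(example_snippets, example_relevance, example_interest, k=2))
# prints [0, 2]
```

In this toy example the near-duplicate snippet is skipped in favor of a less relevant but more distinct one, which is the behavior a diversity term is meant to produce. Greedy selection is used here only because it is simple; the source does not say how the subset is optimized.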
Pages: 1534-1543
Page count: 10