Conversational Image Search: A Sketch-based Approach

Cited by: 1

Authors:

Braghis, Daniel D. [1]

Liu, Haiming [1]

Affiliation:

[1] Univ Southampton, Sch Elect & Comp Sci, Southampton, Hants, England

Source:

PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024

Keywords:

Conversational product search; natural language feedback; multi-modal interaction; sketch-based image retrieval; Stable Diffusion; ControlNet; GPT Assistant

DOI:

10.1145/3652583.3657594

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Conversational image search has emerged as a progressive step beyond traditional keyword-based methodologies, which addresses challenges in human-computer interaction during the information retrieval process. This paper introduces a demonstration called DoodleShoper, a forward-thinking conversational image search assistant centered around sketching, specifically tailored for online product searches. It underscores the importance of visual diversity, often eluding verbal expression while highlighting the efficacy of a sketch-based approach in enhancing user interaction. The proposed modular architecture integrates a state-of-the-art Language Model with advanced Stable Diffusion technologies in the image generation field to offer users a more intuitive and precise conversational search experience. Unlike most conventional methods that directly align prompts or sketches with images, our approach leverages a generative model to produce an intermediate search outcome. This strategic shift streamlines the search process from a zero-shot query - where the query directly corresponds to an image - to a reverse image search task, facilitating the discovery of similar images through multimodal interaction. The implemented demonstration involves refining and expanding the application to diverse user information needs and preferences, including exploring the potential of utilising sketches as an alternative or complementary search environment, a novel concept rooted in current research.
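The two-stage idea described in the abstract (generate an intermediate image from the sketch and the conversation, then run a reverse image search against a catalogue) might be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `search`, `toy_generate`, `toy_embed`, `Product`, and `CATALOGUE` are hypothetical names, the generator is a stand-in for a Stable Diffusion + ControlNet call, and the embedding is a toy keyword count rather than a real image encoder.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Any, Callable, List, Sequence

Vector = Sequence[float]


@dataclass
class Product:
    name: str
    embedding: Vector  # assumed to be precomputed from the product image


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(
    sketch: bytes,
    prompt: str,
    generate: Callable[[bytes, str], Any],
    embed: Callable[[Any], Vector],
    catalogue: List[Product],
    k: int = 3,
) -> List[Product]:
    # Stage 1: turn the sketch plus the text prompt into an intermediate image.
    intermediate = generate(sketch, prompt)
    # Stage 2: reverse image search, ranking products by similarity
    # between the intermediate image and each catalogue embedding.
    query = embed(intermediate)
    ranked = sorted(catalogue, key=lambda p: cosine(query, p.embedding), reverse=True)
    return ranked[:k]


# --- Toy stand-ins (hypothetical, for illustration only) ---

def toy_generate(sketch: bytes, prompt: str) -> dict:
    # A real system would call an image generator here.
    return {"sketch": sketch, "prompt": prompt}


def toy_embed(image: dict) -> Vector:
    # A real system would use an image encoder; this counts keywords instead.
    text = image["prompt"].lower()
    return [float(text.count(word)) for word in ("red", "shoe", "bag")]


CATALOGUE = [
    Product("red shoe", [1.0, 1.0, 0.0]),
    Product("blue bag", [0.0, 0.0, 1.0]),
    Product("red bag", [1.0, 0.0, 1.0]),
]

if __name__ == "__main__":
    for product in search(b"", "a red shoe with laces", toy_generate, toy_embed, CATALOGUE, k=2):
        print(product.name)
```

Passing `generate` and `embed` as parameters mirrors the modular architecture the abstract mentions: either stage could be swapped for a real model without changing the ranking step.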


Pages: 1265-1269

Page count: 5

References: 23 in total
