PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks

Cited by: 2
Authors
Bakhtiarnia, Arian [1]
Zhang, Qi [1]
Iosifidis, Alexandros [1]
Affiliations
[1] Aarhus Univ, Dept Elect & Comp Engn, Aarhus, Denmark
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Efficient deep learning; lightweight deep learning; data augmentation; crowd counting; monocular depth estimation; text-to-image diffusion model
DOI
10.1109/IJCNN54540.2023.10191962
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many deep learning tasks require annotations that are too time-consuming for human operators to produce, resulting in small dataset sizes. This is especially true for dense regression problems such as crowd counting, which requires the location of every person in the image to be annotated. Techniques such as data augmentation and synthetic data generation based on simulations can help in such cases. In this paper, we introduce PromptMix, a method for artificially boosting the size of existing datasets that can be used to improve the performance of lightweight networks. First, synthetic images are generated in an end-to-end data-driven manner: text prompts are extracted from existing datasets via an image captioning deep network and subsequently fed to text-to-image diffusion models. The generated images are then annotated using one or more high-performing deep networks and mixed with the real dataset for training the lightweight network. Through extensive experiments on five datasets and two tasks, we show that PromptMix can significantly increase the performance of lightweight networks, by up to 26%.
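The pipeline described in the abstract (caption real images, generate synthetic images from the captions, annotate them with a high-performing network, mix with the real data) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the function `promptmix_dataset` and its parameters `caption`, `generate`, `annotate`, `images_per_prompt`, and `seed` are hypothetical names, the three models are passed in as plain callables rather than tied to any specific library, and the mixing step here is a simple concatenation and shuffle, which the abstract does not specify.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple

# A sample is an (image, annotation) pair; both types are left open here.
Sample = Tuple[Any, Any]


def promptmix_dataset(
    real: Sequence[Sample],
    caption: Callable[[Any], str],    # hypothetical image-captioning model
    generate: Callable[[str], Any],   # hypothetical text-to-image model
    annotate: Callable[[Any], Any],   # hypothetical high-performing annotator
    images_per_prompt: int = 1,       # assumption: not stated in the abstract
    seed: int = 0,
) -> List[Sample]:
    """Return the real samples mixed with pseudo-labelled synthetic ones."""
    synthetic: List[Sample] = []
    for image, _label in real:
        # Step 1: extract a text prompt from an existing real image.
        prompt = caption(image)
        for _ in range(images_per_prompt):
            # Step 2: generate a synthetic image from that prompt.
            fake = generate(prompt)
            # Step 3: annotate the synthetic image with the heavy network.
            synthetic.append((fake, annotate(fake)))
    # Step 4: mix real and synthetic samples (assumed: concatenate + shuffle).
    mixed = list(real) + synthetic
    random.Random(seed).shuffle(mixed)
    return mixed
```

The resulting list would then serve as the training set for the lightweight network. In practice, each callable would wrap a real captioning, diffusion, or annotation model.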
Pages: 8