EViT: Privacy-Preserving Image Retrieval via Encrypted Vision Transformer in Cloud Computing

Cited by: 16
Authors
Feng, Qihua [1]
Li, Peiya [2]
Lu, Zhixun [2]
Li, Chaozhuo [3]
Wang, Zefan [2]
Liu, Zhiquan [2]
Duan, Chunhui [1]
Huang, Feiran [2]
Weng, Jian [2]
Yu, Philip S. [4]
Affiliations
[1] Beijing Inst Technol, Beijing 100081, Peoples R China
[2] Jinan Univ, Coll Informat Sci & Technol, Guangzhou 510000, Peoples R China
[3] Beihang Univ, Sch Comp Sci & Technol, Beijing 100191, Peoples R China
[4] Univ Illinois, Dept Comp Sci, Chicago, IL USA
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Encryption; Codes; Cloud computing; Transform coding; Streaming media; Ciphers; Image retrieval; privacy-preserving; JPEG; vision transformer; self-supervised learning;
DOI
10.1109/TCSVT.2024.3370668
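Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract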
Image retrieval systems help users browse and search large image collections in real time. With the rise of cloud computing, retrieval tasks are usually outsourced to cloud servers. However, the cloud scenario brings a daunting privacy-protection challenge, as cloud servers cannot be fully trusted. To this end, image-encryption-based privacy-preserving image retrieval (PPIR) schemes have been developed, which first extract features from cipher-images and then build retrieval models based on these features. Yet most existing PPIR approaches extract shallow features and design trivial unsupervised retrieval models, resulting in insufficient expressiveness for the cipher-images. In this paper, we propose a novel paradigm named Encrypted Vision Transformer (EViT), which advances the discriminative representation capability of cipher-images. First, to capture comprehensive ruled information, we extract multi-level local length sequence and global Huffman-code frequency features from the cipher-images, which are encrypted by permutation encryption, sign encryption, and stream cipher during the JPEG compression process. Second, we design a modified self-supervised Vision Transformer with Huffman-embedding and propose two robust data augmentations on cipher-images to improve the representation power of the retrieval model. Moreover, our proposal can be easily adapted to unsupervised or supervised settings. Extensive experiments reveal that EViT achieves both excellent encryption and retrieval performance, outperforming current schemes in retrieval accuracy by large margins while protecting image privacy effectively. Code is publicly available at https://github.com/onlinehuazai/EViT.
Pages: 7467-7483
Number of pages: 17