Human Silhouette and Skeleton Video Synthesis Through Wi-Fi Signals

Cited by: 9
Authors
Avola, Danilo [1]
Cascio, Marco [1]
Cinque, Luigi [1]
Fagioli, Alessio [1]
Foresti, Gian Luca [2]
Affiliations
[1] Sapienza Univ Rome, Dept Comp Sci, Via Salaria 113, I-00198 Rome, Italy
[2] Univ Udine, Dept Comp Sci Math & Phys, Via Sci 206, I-33100 Udine, Italy
Keywords
Human silhouette; video synthesis; Wi-Fi signal; skeleton; image synthesis; GAN; recognition; model
DOI
10.1142/S0129065722500150
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The increasing availability of wireless access points (APs) is driving human sensing applications based on Wi-Fi signals, which serve as support or alternative tools to widespread visual sensors because the signals help address well-known vision-related problems such as illumination changes and occlusions. Indeed, using image synthesis techniques to translate radio frequencies to the visible spectrum can become essential for obtaining otherwise unavailable visual data. This domain-to-domain translation is feasible because both objects and people affect electromagnetic waves, causing variations in radio and optical frequencies. In the literature, models capable of inferring radio-to-visual feature mappings have gained momentum in the last few years, since frequency changes can be observed in the radio domain through the channel state information (CSI) of Wi-Fi APs, enabling the extraction of signal-based features such as amplitude. Accordingly, this paper presents a novel two-branch generative neural network that effectively maps radio data into visual features, following a teacher-student design that exploits a cross-modality supervision strategy. This strategy conditions signal-based features in the visual domain so that they can completely replace visual data. Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals. The approach is evaluated on publicly available data, where it obtains remarkable results for both silhouette and skeleton video generation, demonstrating the effectiveness of the proposed cross-modality supervision strategy.
Pages: 20