Cross-Modal Generation of Tactile Friction Coefficient From Audio and Visual Measurements by Transformer

Cited by: 5
Authors
Song, Rui [1,2]
Sun, Xiaoying [1 ]
Liu, Guohong [1 ]
Affiliations
[1] Jilin Univ, Coll Commun Engn, Changchun 130022, Peoples R China
[2] Jilin Univ, Int Ctr Future Sci, Changchun 130022, Peoples R China
Keywords
Audio data; electrovibration; surface haptics; tactile friction coefficient; Transformer; visual data; FUSION;
DOI
10.1109/TIM.2023.3311071
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
Generating tactile data (e.g., the friction coefficient) from audio and visual modalities can avoid time-consuming physical measurements and ensure high-fidelity haptic rendering of surface textures. In this article, we present a Transformer-based method for cross-modal generation of the tactile friction coefficient. Using the self-attention mechanism, we jointly encode the amplitude and phase information of audio spectra and RGB images to extract global and local features. A Transformer module in a bottleneck converter then maps these joint encoding features to tactile decoding features, which are progressively decoded and reconstructed into the amplitude and phase information of the tactile friction coefficient. Finally, this information is converted into 1-D friction coefficients using the inverse short-time Fourier transform (ISTFT). Evaluations on the LMT Haptic Material Database confirm a clear performance improvement over competing methods. Furthermore, using the friction coefficients generated by the Transformer and a custom-designed electrovibration device, we propose a novel rendering method that simultaneously modulates the amplitude and frequency of the driving signals to display tactile textures on touchscreens. User experiments were conducted to evaluate the rendering fidelity of the generated friction coefficients. A two-way repeated-measures analysis of variance (ANOVA) indicates that the rendering fidelity of the Transformer method is significantly higher than that of the baseline methods (p < 0.05).
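The final reconstruction step described in the abstract (recombining predicted amplitude and phase spectra and applying the ISTFT) can be illustrated with a minimal Python sketch. This is not the authors' code: the sampling rate FS, the window parameters NPERSEG and NOVERLAP, and the helper friction_from_spectra are illustrative assumptions, not values from the paper.

    import numpy as np
    from scipy.signal import stft, istft

    FS = 1000        # assumed sampling rate of the friction signal (Hz)
    NPERSEG = 256    # assumed STFT window length
    NOVERLAP = 192   # assumed 75% overlap, which permits exact inversion

    def friction_from_spectra(amplitude, phase):
        # Recombine amplitude and phase (shape: freq_bins x frames)
        # into a complex spectrum, then invert it with the ISTFT.
        spectrum = amplitude * np.exp(1j * phase)
        _, mu = istft(spectrum, fs=FS, nperseg=NPERSEG, noverlap=NOVERLAP)
        return mu    # 1-D friction-coefficient signal

    # Round-trip check with a synthetic friction trace:
    t = np.arange(2 * FS) / FS
    mu_true = 0.5 + 0.05 * np.sin(2 * np.pi * 30 * t)
    _, _, Z = stft(mu_true, fs=FS, nperseg=NPERSEG, noverlap=NOVERLAP)
    mu_rec = friction_from_spectra(np.abs(Z), np.angle(Z))
    print(np.max(np.abs(mu_rec[:mu_true.size] - mu_true)))   # ~0

In the method itself, the amplitude and phase fed to the ISTFT come from the Transformer decoder rather than from a forward STFT as in this round-trip check.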
Pages: 12