SignAvatars: A Large-Scale 3D Sign Language Holistic Motion Dataset and Benchmark

Authors
Yu, Zhengdi [1, 2]
Huang, Shaoli [2]
Cheng, Yongkang [2]
Birdal, Tolga [1]
Affiliations
[1] Imperial College London, London, England
[2] Tencent AI Lab, Shenzhen, People's Republic of China
Source
COMPUTER VISION - ECCV 2024, PT V | 2025 / Vol. 15063
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Sign Language; Digital Avatars; Hand Motion;
DOI
10.1007/978-3-031-72652-1_1
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present SignAvatars, the first large-scale, multi-prompt 3D sign language (SL) motion dataset designed to bridge the communication gap for Deaf and hard-of-hearing individuals. While there has been an exponentially growing body of research on digital communication, the majority of existing communication technologies primarily cater to spoken or written languages rather than SL, the essential communication method for Deaf and hard-of-hearing communities. Existing SL datasets, dictionaries, and sign language production (SLP) methods are typically limited to 2D, as annotating 3D models and avatars for SL is usually an entirely manual and labor-intensive process conducted by SL experts, often resulting in unnatural avatars. In response to these challenges, we compile and curate the SignAvatars dataset, which comprises 70,000 videos from 153 signers, totaling 8.34 million frames, covering both isolated signs and continuous, co-articulated signs, with multiple prompts including HamNoSys, spoken language, and words. To yield 3D holistic annotations, including meshes and biomechanically valid poses of body, hands, and face, as well as 2D and 3D keypoints, we introduce an automated annotation pipeline operating on our large corpus of SL videos. SignAvatars facilitates various tasks such as 3D sign language recognition (SLR) and the novel 3D SL production (SLP) from diverse inputs like text scripts, individual words, and HamNoSys notation. Hence, to evaluate the potential of SignAvatars, we further propose a unified benchmark of 3D SL holistic motion production. We believe that this work is a significant step forward towards bringing the digital world to the Deaf and hard-of-hearing communities as well as people interacting with them (https://signavatars.github.io/).
Pages: 1 - 19
Page count: 19