SignAvatars: A Large-Scale 3D Sign Language Holistic Motion Dataset and Benchmark

Cited by: 0

Authors:

Yu, Zhengdi [1,2]

Huang, Shaoli [2]

Cheng, Yongkang [2]

Birdal, Tolga [1]

Affiliations:

[1] Imperial College London, London, England

[2] Tencent AI Lab, Shenzhen, People's Republic of China

Source:

COMPUTER VISION - ECCV 2024, PT V | 2025 / Vol. 15063

Funding:

UK Engineering and Physical Sciences Research Council;

Keywords:

Sign Language; Digital Avatars; Hand Motion;

DOI:

10.1007/978-3-031-72652-1_1

Chinese Library Classification (CLC):

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present SignAvatars, the first large-scale, multi-prompt 3D sign language (SL) motion dataset designed to bridge the communication gap for Deaf and hard-of-hearing individuals. While the body of research on digital communication is growing exponentially, the majority of existing communication technologies cater primarily to spoken or written languages rather than SL, the essential communication method for Deaf and hard-of-hearing communities. Existing SL datasets, dictionaries, and sign language production (SLP) methods are typically limited to 2D, because annotating 3D models and avatars for SL is usually an entirely manual and labor-intensive process conducted by SL experts, often resulting in unnatural avatars. In response to these challenges, we compile and curate the SignAvatars dataset, which comprises 70,000 videos from 153 signers, totaling 8.34 million frames, covering both isolated signs and continuous, co-articulated signs, with multiple prompts including HamNoSys, spoken language, and words. To yield 3D holistic annotations, including meshes and biomechanically valid poses of body, hands, and face, as well as 2D and 3D keypoints, we introduce an automated annotation pipeline operating on our large corpus of SL videos. SignAvatars facilitates various tasks such as 3D sign language recognition (SLR) and the novel 3D SL production (SLP) from diverse inputs like text scripts, individual words, and HamNoSys notation. Hence, to evaluate the potential of SignAvatars, we further propose a unified benchmark of 3D SL holistic motion production. We believe that this work is a significant step forward towards bringing the digital world to the Deaf and hard-of-hearing communities as well as people interacting with them (https://signavatars.github.io/).


Pages: 1-19

Page count: 19

