A Data-Driven Representation for Sign Language Production

Authors
Walsh, Harry [1]
Ravanshad, Abolfazl [2]
Rahmani, Mariam [2]
Bowden, Richard [1]
Affiliations
[1] Univ Surrey, CVSSP, Guildford, Surrey, England
[2] OmniBridge Ai, Washington, DC USA
Funding
Swiss National Science Foundation
Keywords
RECOGNITION
DOI
10.1109/FG59268.2024.10581995
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phonetic representations are used when recording spoken languages, but no equivalent exists for recording signed languages. As a result, linguists have proposed several annotation systems that operate on the gloss or sub-unit level; however, these resources are notably irregular and scarce. Sign Language Production (SLP) aims to automatically translate spoken language sentences into continuous sequences of sign language. However, current state-of-the-art approaches rely on these scarce linguistic resources, which has limited progress in the field. This paper introduces an innovative solution that transforms the continuous pose generation problem into a discrete sequence generation problem, thus overcoming the need for costly annotation. Where annotation is available, however, we leverage the additional information to enhance our approach. By applying Vector Quantisation (VQ) to sign language data, we first learn a codebook of short motions that can be combined to create a natural sequence of sign. The codebook can be thought of as the lexicon of our representation, with each token as one of its entries. Then, using a transformer, we perform a translation from spoken language text to a sequence of codebook tokens. Each token can be directly mapped to a sequence of poses, allowing the translation to be performed by a single network. Furthermore, we present a sign stitching method to effectively join tokens together. We evaluate on the RWTH-PHOENIX-Weather-2014T (PHOENIX14T) and the more challenging meineDGST (mDGS) datasets. An extensive evaluation shows that our approach outperforms previous methods, increasing the BLEU-1 back-translation score by up to 72%.
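The idea of turning continuous poses into discrete tokens can be illustrated with a minimal sketch of the nearest-neighbour lookup at the heart of vector quantisation. This is not the authors' implementation: the codebook below is hand-written rather than learned, the chunk size (two frames of two coordinates, flattened) is an arbitrary toy choice, and the names `quantise`, `decode`, and `CODEBOOK` are hypothetical. The text-to-token transformer and the sign stitching step are not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantise(chunks, codebook):
    """Map each flattened motion chunk to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: sq_dist(chunk, codebook[k]))
        for chunk in chunks
    ]


def decode(tokens, codebook):
    """Map each token back to the short motion it stands for."""
    return [codebook[t] for t in tokens]


# Toy codebook (assumption: in practice this would be learned from data).
# Each entry is a short motion: 2 frames x 2 coordinates, flattened.
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
]

# Continuous pose chunks, each close to one codebook entry.
chunks = [
    [0.1, -0.1, 0.0, 0.2],
    [0.9, 1.1, 1.0, 0.8],
    [0.1, 0.9, -0.1, 1.0],
]

tokens = quantise(chunks, CODEBOOK)   # discrete token sequence
motions = decode(tokens, CODEBOOK)    # tokens mapped back to poses
print(tokens)
```

In this toy setting, a translation model would only need to predict the integer sequence in `tokens`; the mapping back to poses is a plain table lookup.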
Pages: 10