Processing Natural Language on Embedded Devices: How Well Do Transformer Models Perform?

Cited: 0
Authors
Sarkar, Souvika [1 ]
Babar, Mohammad Fakhruddin [2 ]
Hassan, Md Mahadi [1 ]
Hasan, Monowar [2 ]
Santu, Shubhra Kanti Karmaker [1 ]
Affiliations
[1] Auburn Univ, Auburn, AL 36849 USA
[2] Washington State Univ, Pullman, WA USA
Source
PROCEEDINGS OF THE 15TH ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE 2024 | 2024
Funding
U.S. National Science Foundation (NSF);
Keywords
Transformers; Embedded Systems; NLP; Language Models;
DOI
10.1145/3629526.3645054
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Transformer-based language models such as BERT and its variants are primarily developed with compute-heavy servers in mind. Despite the strong performance of BERT models across various NLP tasks, their large size and numerous parameters pose substantial obstacles to offline computation on embedded systems. Lighter replacements of such language models (e.g., DistilBERT and TinyBERT) often sacrifice accuracy, particularly for complex NLP tasks. To date, it remains unclear (a) whether state-of-the-art language models, viz., BERT and its variants, are deployable on embedded systems with limited processor, memory, and battery power, and (b) if so, what the "right" set of configurations and parameters is for a given NLP task. This paper presents a performance study of transformer language models under different hardware configurations and accuracy requirements and derives empirical observations about these resource/accuracy trade-offs. In particular, we study how the most commonly used BERT-based language models (viz., BERT, RoBERTa, DistilBERT, and TinyBERT) perform on embedded systems. We tested them on four off-the-shelf embedded platforms (Raspberry Pi, Jetson, UP2, and UDOO) with 2 GB and 4 GB of memory (i.e., a total of eight hardware configurations) and four datasets (i.e., HuRIC, GoEmotion, CoNLL, WNUT17) covering various NLP tasks. Our study finds that executing complex NLP tasks (such as "sentiment" classification) on embedded systems is feasible even without a GPU (e.g., on a Raspberry Pi with 2 GB of RAM). Our findings can help designers understand the deployability and performance of transformer language models, especially those based on BERT architectures.
Pages: 211 - 222
Page count: 12
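
The abstract describes benchmarking BERT, RoBERTa, DistilBERT, and TinyBERT on CPU-only embedded boards. As a minimal sketch of what such a measurement involves (not the authors' actual benchmarking harness), the following Python snippet times a CPU-only forward pass of a BERT-family sentiment classifier using the Hugging Face Transformers library; the DistilBERT checkpoint name, thread count, and toy batch are illustrative assumptions, not values taken from the paper.

# Sketch only: time one CPU inference batch for a DistilBERT-style classifier,
# roughly the kind of measurement a Raspberry Pi-class deployment would need.
import time

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint for illustration; the paper's fine-tuned models may differ.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()
torch.set_num_threads(4)  # match the board's core count; adjust per platform

sentences = ["The battery drains far too quickly."] * 8  # toy workload

with torch.no_grad():
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    model(**inputs)  # warm-up pass so one-time allocation cost is excluded
    start = time.perf_counter()
    logits = model(**inputs).logits
    latency_ms = (time.perf_counter() - start) * 1000

predictions = logits.argmax(dim=-1).tolist()
print(f"batch latency: {latency_ms:.1f} ms, predictions: {predictions}")

On a memory-constrained board, the same loop would typically be repeated across batch sizes and sequence lengths while tracking peak memory, which is the kind of resource/accuracy trade-off the paper reports.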