Universal embedding for pre-trained models and data bench

Times Cited: 0
Authors
Cho, Namkyeong [1 ]
Cho, Taewon [2 ]
Shin, Jaesun [2 ]
Jeon, Eunjoo [2 ]
Lee, Taehee [2 ]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Ctr Math Machine Learning & its Applicat CM2LA, Dept Math, Pohang 37673, Gyeongbuk, South Korea
[2] Samsung SDS, 125 Olymp Ro 35 Gil, Seoul 05510, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Transfer learning; Pretrained models; Graph neural networks;
DOI
10.1016/j.neucom.2024.129107
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The transformer architecture has led to significant improvements in the performance of various natural language processing (NLP) tasks. One of the great advantages of transformer-based models is that an extra layer can be added to a pre-trained model (PTM) and fine-tuned, rather than developing a separate architecture for each task. This approach has yielded promising performance on NLP tasks. Selecting an appropriate PTM from a model zoo, such as Hugging Face, therefore becomes a crucial task. Despite its importance, PTM selection still requires further investigation. The main challenge in PTM selection for NLP tasks is the lack of a publicly available benchmark for evaluating model performance on each task and dataset. To address this challenge, we introduce the first public data benchmark that evaluates the performance of popular transformer-based models on a diverse range of NLP tasks. Furthermore, we propose graph representations of transformer-based models with node features that represent the weight matrices of each layer. Empirical results demonstrate that our proposed graph neural network (GNN) model outperforms existing PTM selection methods.
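The abstract describes embedding a PTM as a graph whose nodes carry weight-matrix features and scoring it with a GNN. Below is a minimal sketch of that idea, assuming simple summary statistics as node features and a chain graph over layers in order; these feature and topology choices, the TinyGNN architecture, and the function names are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn as nn

def model_to_graph(model: nn.Module):
    # One node per 2-D weight matrix; node features are summary statistics
    # of that matrix (an assumed feature choice, not the paper's exact one).
    feats = []
    for p in model.parameters():
        if p.dim() == 2:  # keep weight matrices, skip bias / LayerNorm vectors
            w = p.detach()
            feats.append(torch.tensor([
                w.mean().item(), w.std().item(), w.abs().max().item(),
                float(w.shape[0]), float(w.shape[1]),
            ]))
    x = torch.stack(feats)                        # [num_nodes, 5]
    n = x.size(0)
    adj = torch.eye(n)
    idx = torch.arange(n - 1)
    adj[idx, idx + 1] = 1.0                       # chain edges in layer order
    adj[idx + 1, idx] = 1.0
    adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalised propagation
    return x, adj

class TinyGNN(nn.Module):
    # Two rounds of neighbourhood averaging followed by a regression head
    # that outputs a single transferability / performance score for the PTM.
    def __init__(self, in_dim=5, hidden=32):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        h = torch.relu(self.lin1(adj @ x))
        h = torch.relu(self.lin2(adj @ h))
        return self.head(h.mean(dim=0))           # graph-level score

if __name__ == "__main__":
    # Stand-in "transformer": any nn.Module works; a real PTM could be
    # loaded instead, e.g. transformers.AutoModel.from_pretrained(...).
    toy = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 8))
    x, adj = model_to_graph(toy)
    print(TinyGNN()(x, adj).item())
```

In practice such a scorer would be trained against measured fine-tuning performance on the benchmark tasks, so that at selection time each candidate PTM in the model zoo can be ranked without fine-tuning it.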
Pages: 21