Universal embedding for pre-trained models and data bench

Times Cited: 0
Authors
Cho, Namkyeong [1 ]
Cho, Taewon [2 ]
Shin, Jaesun [2 ]
Jeon, Eunjoo [2 ]
Lee, Taehee [2 ]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Ctr Math Machine Learning & its Applicat CM2LA, Dept Math, Pohang 37673, Gyeongbuk, South Korea
[2] Samsung SDS, 125 Olymp Ro 35 Gil, Seoul 05510, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Transfer learning; Pretrained models; Graph neural networks;
DOI
10.1016/j.neucom.2024.129107
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The transformer architecture has led to significant improvements in the performance of various natural language processing (NLP) tasks. One of the great advantages of transformer-based models is that an extra layer can be added to a pre-trained model (PTM) and fine-tuned, rather than developing a separate architecture for each task. This approach has yielded promising performance on NLP tasks. Selecting an appropriate PTM from a model zoo, such as Hugging Face, therefore becomes a crucial task. Despite its importance, PTM selection still requires further investigation. The main challenge in PTM selection for NLP tasks is the lack of a publicly available benchmark for evaluating model performance on each task and dataset. To address this challenge, we introduce the first public data benchmark that evaluates the performance of popular transformer-based models on a diverse range of NLP tasks. Furthermore, we propose graph representations of transformer-based models with node features that represent the weight matrices of each layer. Empirical results demonstrate that our proposed graph neural network (GNN) model outperforms existing PTM selection methods.
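The abstract describes embedding a PTM as a graph whose nodes carry weight-matrix features and scoring it with a GNN. Below is a minimal sketch of that idea, assuming simple summary statistics as node features and a chain graph over layers in order; these feature and topology choices, the TinyGNN architecture, and the function names are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn as nn

def model_to_graph(model: nn.Module):
    # One node per 2-D weight matrix; node features are summary statistics
    # of that matrix (an assumed feature choice, not the paper's exact one).
    feats = []
    for p in model.parameters():
        if p.dim() == 2:  # keep weight matrices, skip bias / LayerNorm vectors
            w = p.detach()
            feats.append(torch.tensor([
                w.mean().item(), w.std().item(), w.abs().max().item(),
                float(w.shape[0]), float(w.shape[1]),
            ]))
    x = torch.stack(feats)                        # [num_nodes, 5]
    n = x.size(0)
    adj = torch.eye(n)
    idx = torch.arange(n - 1)
    adj[idx, idx + 1] = 1.0                       # chain edges in layer order
    adj[idx + 1, idx] = 1.0
    adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalised propagation
    return x, adj

class TinyGNN(nn.Module):
    # Two rounds of neighbourhood averaging followed by a regression head
    # that outputs a single transferability / performance score for the PTM.
    def __init__(self, in_dim=5, hidden=32):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        h = torch.relu(self.lin1(adj @ x))
        h = torch.relu(self.lin2(adj @ h))
        return self.head(h.mean(dim=0))           # graph-level score

if __name__ == "__main__":
    # Stand-in "transformer": any nn.Module works; a real PTM could be
    # loaded instead, e.g. transformers.AutoModel.from_pretrained(...).
    toy = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 8))
    x, adj = model_to_graph(toy)
    print(TinyGNN()(x, adj).item())
```

In practice such a scorer would be trained against measured fine-tuning performance on the benchmark tasks, so that at selection time each candidate PTM in the model zoo can be ranked without fine-tuning it.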
Pages: 21