Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 30
Authors
Kolides, Adam [1 ]
Nawaz, Alyna [1 ]
Rathor, Anshu [1 ]
Beeman, Denzel [1 ]
Hashmi, Muzammil [1 ]
Fatima, Sana [1 ]
Berdik, David [1 ]
Al-Ayyoub, Mahmoud [2 ]
Jararweh, Yaser [1 ]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning;
DOI
10.1016/j.simpat.2023.102754
Chinese Library Classification (CLC) Number
TP39 [Computer Applications];
Subject Classification Code
081203; 0835;
Abstract
With the emergence of foundation models (FMs), which are trained at scale on large amounts of data and are adaptable to a wide range of downstream applications, AI is undergoing a paradigm shift. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP now serve as the foundation for new applications ranging from computer vision to protein sequence analysis and from speech recognition to coding. Earlier models typically had to be trained from scratch for each new task. The ability to experiment with, examine, and understand the capabilities and potential of next-generation FMs is critical to undertaking this research and guiding its direction. Nevertheless, these models remain largely inaccessible: the resources required to train them are highly concentrated in industry, and even the assets (data, code) needed to replicate their training are frequently withheld because of their commercial value. At present, only large technology companies such as OpenAI, Google, Facebook, and Baidu can afford to build FMs. In this work, we analyze the main capabilities, key applications, technological fundamentals, and potential societal consequences of these models. Despite the anticipated widespread use of FMs, we still lack a comprehensive understanding of how they operate, when and why they fail, and what they are even capable of, owing to their emergent properties. To address these problems, we believe that much of the critical research on FMs will require extensive multidisciplinary collaboration, given their fundamentally sociotechnical nature. Throughout the investigation, we also confront the problem of misrepresentation created by these systems. If FMs live up to their promise, AI could see far wider commercial use. As researchers studying the ramifications for society, we believe FMs will lead the way in massive changes. For the time being they are closely managed, so we should have time to understand their implications before they become a major concern.
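The central pattern the abstract describes is pre-training at scale followed by adaptation to a downstream task (the transfer learning and fine-tuning listed in the keywords). The sketch below is a minimal illustration of that pattern only and is not taken from the paper; it assumes the Hugging Face transformers and PyTorch libraries, the bert-base-uncased checkpoint, and a toy two-example dataset chosen purely for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a foundation model with its pre-trained weights; only the small
# classification head added on top is randomly initialized.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy downstream data (an assumption for illustration): the point of the
# pre-train-then-adapt paradigm is that only a small labeled set is needed,
# because most of the model's knowledge comes from pre-training.
texts = ["the translation reads fluently", "the code fails to compile"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Fine-tune: update the weights with a small learning rate for a few steps.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    outputs = model(**batch, labels=labels)  # loss is computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

An alternative adaptation route named in the keywords, in-context learning, skips weight updates entirely and instead conditions the frozen model on a few examples supplied in the prompt.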
Pages: 18
Related Papers
50 records in total
  • [41] Discrimination Bias Detection Through Categorical Association in Pre-Trained Language Models
    Dusi, Michele
    Arici, Nicola
    Gerevini, Alfonso Emilio
    Putelli, Luca
    Serina, Ivan
    IEEE ACCESS, 2024, 12 : 162651 - 162667
  • [42] TARGET SPEECH EXTRACTION WITH PRE-TRAINED SELF-SUPERVISED LEARNING MODELS
    Peng, Junyi
    Delcroix, Marc
    Ochiai, Tsubasa
    Plchot, Oldrich
    Araki, Shoko
    Cernocky, Jan
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024), 2024, : 10421 - 10425
  • [43] Automatic Title Generation for Learning Resources and Pathways with Pre-trained Transformer Models
    Mishra, Prakhar
    Diwan, Chaitali
    Srinivasa, Srinath
    Srinivasaraghavan, G.
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2021, 15 (04) : 487 - 510
  • [44] Memory-Tuning: A Unified Parameter-Efficient Tuning Method for Pre-Trained Language Models
    Qi, Wang
    Liu, Rui
    Zuo, Yuan
    Li, Fengzhi
    Chen, Yong
    Wu, Junjie
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2025, 33 : 1 - 10
  • [45] Learning Social Relationship From Videos via Pre-Trained Multimodal Transformer
    Teng, Yiyang
    Song, Chenguang
    Wu, Bin
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 1377 - 1381
  • [46] How to Estimate Model Transferability of Pre-Trained Speech Models?
    Chen, Zih-Ching
    Yang, Chao-Han Huck
    Li, Bo
    Zhang, Yu
    Chen, Nanxin
    Chang, Shuo-Yiin
    Prabhavalkar, Rohit
    Lee, Hung-yi
    Sainath, Tara N.
    INTERSPEECH 2023, 2023, : 456 - 460
  • [47] Adapting Pre-trained Language Models to Rumor Detection on Twitter
    Slimi, Hamda
    Bounhas, Ibrahim
    Slimani, Yahya
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2021, 27 (10) : 1128 - 1148
  • [48] Pre-trained Language Models with Limited Data for Intent Classification
    Kasthuriarachchy, Buddhika
    Chetty, Madhu
    Karmakar, Gour
    Walls, Darren
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [49] The Impact of Training Methods on the Development of Pre-Trained Language Models
    Uribe, Diego
    Cuan, Enrique
    Urquizo, Elisa
    COMPUTACION Y SISTEMAS, 2024, 28 (01): : 109 - 124
  • [50] Performance Evaluation of CNN and Pre-trained Models for Malware Classification
    Habibi, Omar
    Chemmakha, Mohammed
    Lazaar, Mohamed
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (08) : 10355 - 10369