Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 29
Authors
Kolides, Adam [1 ]
Nawaz, Alyna [1 ]
Rathor, Anshu [1 ]
Beeman, Denzel [1 ]
Hashmi, Muzammil [1 ]
Fatima, Sana [1 ]
Berdik, David [1 ]
Al-Ayyoub, Mahmoud [2 ]
Jararweh, Yaser [1 ]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning;
DOI
10.1016/j.simpat.2023.102754
Chinese Library Classification (CLC)
TP39 [Applications of computers]
Discipline classification codes
081203; 0835
Abstract
With the emergence of foundation models (FMs), which are trained on large amounts of data at scale and adaptable to a wide range of downstream applications, AI is undergoing a paradigm shift. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP now serve as the foundation for new applications ranging from computer vision to protein sequence analysis and from speech recognition to coding. Earlier models typically had to be built from scratch for each new task. The capacity to experiment with, examine, and understand the capabilities and potential of next-generation FMs is critical to undertaking this research and guiding its path. Nevertheless, these models are currently inaccessible: the resources required to train them are highly concentrated in industry, and even the assets (data, code) required to replicate their training are frequently withheld because of their commercial demand. At the moment, only large tech companies such as OpenAI, Google, Facebook, and Baidu can afford to construct FMs. In this research, we analyze and examine the main capabilities, key implementations, technological fundamentals, and possible social consequences of these models. Despite the widely publicized use expected of FMs, we still lack a comprehensive understanding of how they operate, why they underperform, and what they are even capable of, owing to their emergent qualities. To deal with these problems, we believe that much critical research on FMs will necessitate extensive multidisciplinary collaboration, given their fundamentally sociotechnical structure. Throughout the investigation, we also have to deal with the problem of misrepresentation created by these systems. If FMs live up to their promise, AI might see far wider commercial use. As researchers studying the ramifications for society, we believe FMs will lead the way in massive changes. They are closely managed for the time being, so we should have time to understand their implications before they become a major concern.
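To make the pre-train-then-adapt workflow the abstract describes more concrete, the following minimal sketch fine-tunes a pre-trained checkpoint on a downstream classification task. It assumes the Hugging Face transformers and PyTorch packages, the bert-base-uncased checkpoint, and a two-example toy sentiment dataset, all chosen purely for illustration rather than taken from the paper.

```python
# Minimal sketch (not from the paper): adapting a pre-trained foundation
# model to a downstream task by fine-tuning. Assumes `transformers`,
# `torch`, and the `bert-base-uncased` checkpoint; the two-example
# sentiment dataset is a toy placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task head on top of the pre-trained encoder
)

texts = ["a delightful, well-paced film", "dull and far too long"]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps stand in for a real training loop
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())  # task predictions from the adapted model
```

The same checkpoint can be re-adapted to many other tasks by swapping the classification head and the downstream data, which is the adaptability the abstract attributes to foundation models.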
Pages: 18
Related papers (50 records in total)
  • [1] Wang, Haifeng; Li, Jiwei; Wu, Hua; Hovy, Eduard; Sun, Yu. Pre-Trained Language Models and Their Applications. Engineering, 2023, 25: 51-65.
  • [2] Han, Xu; Zhang, Zhengyan; Ding, Ning; Gu, Yuxian; Liu, Xiao; Huo, Yuqi; Qiu, Jiezhong; Yao, Yuan; Zhang, Ao; Zhang, Liang; Han, Wentao; Huang, Minlie; Jin, Qin; Lan, Yanyan; Liu, Yang; Liu, Zhiyuan; Lu, Zhiwu; Qiu, Xipeng; Song, Ruihua; Tang, Jie; Wen, Ji-Rong; Yuan, Jinhui; Zhao, Wayne Xin; Zhu, Jun. Pre-trained models: Past, present and future. AI Open, 2021, 2: 225-250.
  • [3] Myers, Devon; Mohawesh, Rami; Chellaboina, Venkata Ishwarya; Sathvik, Anantha Lakshmi; Venkatesh, Praveen; Ho, Yi-Hui; Henshaw, Hanna; Alhawawreh, Muna; Berdik, David; Jararweh, Yaser. Foundation and large language models: fundamentals, challenges, opportunities, and social impacts. Cluster Computing, 2024, 27(1): 1-26.
  • [4] Ma, Qianli; Liu, Zhen; Zheng, Zhenjing; Huang, Ziyang; Zhu, Siying; Yu, Zhongzhong; Kwok, James T. A Survey on Time-Series Pre-Trained Models. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(12): 7536-7555.
  • [5] Qiu, XiPeng; Sun, TianXiang; Xu, YiGe; Shao, YunFan; Dai, Ning; Huang, XuanJing. Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 2020, 63(10): 1872-1897.
  • [6] Cho, Namkyeong; Cho, Taewon; Shin, Jaesun; Jeon, Eunjoo; Lee, Taehee. Universal embedding for pre-trained models and data bench. Neurocomputing, 2025, 619.
  • [7] Zangari, Lorenzo; Greco, Candida Maria; Picca, Davide; Tagarelli, Andrea. A survey on moral foundation theory and pre-trained language models: current advances and challenges. AI & Society, 2025.
  • [8] Karkera, Nikitha; Acharya, Sathwik; Palaniappan, Sucheendra K. Leveraging pre-trained language models for mining microbiome-disease relationships. BMC Bioinformatics, 24.