Unraveling the mysteries of AI chatbots

Cited by: 0
Authors
Raj Bridgelall
Affiliation
[1] Transportation, Logistics & Finance, College of Business, North Dakota State University
Source
Artificial Intelligence Review | Vol. 57
Keywords
Generative artificial intelligence; Large language models; ChatGPT; Bard; Transformer architecture; Prompt engineering
DOI
Not available
Abstract
This primer provides an overview of the rapidly evolving field of generative artificial intelligence, specifically focusing on large language models like ChatGPT (OpenAI) and Bard (Google). Large language models have demonstrated unprecedented capabilities in responding to natural language prompts. The aim of this primer is to demystify the underlying theory and architecture of large language models, providing intuitive explanations for a broader audience. Learners seeking to gain insight into the technical underpinnings of large language models must sift through rapidly growing and fragmented literature on the topic. This primer brings all the main concepts into a single digestible document. Topics covered include text tokenization, vocabulary construction, token embedding, context embedding with attention mechanisms, artificial neural networks, and objective functions in model training. The primer also explores state-of-the-art methods in training large language models to generalize on specific applications and to align with human intentions. Finally, an introduction to the concept of prompt engineering highlights the importance of effective human-machine interaction through natural language in harnessing the full potential of artificial intelligence chatbots. This comprehensive yet accessible primer will benefit students and researchers seeking foundational knowledge and a deeper understanding of the inner workings of existing and emerging artificial intelligence models. The author hopes that the primer will encourage further responsible innovation and informed discussions about these increasingly powerful tools.