An Investigation of Applying Large Language Models to Spoken Language Learning

Cited by: 7
Authors
Gao, Yingming [1 ]
Nuchged, Baorian [2 ]
Li, Ya [1 ]
Peng, Linkai [3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] Univ Texas Austin, Dept Linguist, Austin, TX 78712 USA
[3] NetEase Youdao, Beijing 100193, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 1
Keywords
large language models; prompt engineering; computer-assisted language learning; spoken language learning; spoken language intelligence;
DOI
10.3390/app14010224
Chinese Library Classification (CLC)
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
People have long desired intelligent conversational systems that can provide assistance in practical scenarios. The latest advancements in large language models (LLMs) are making significant strides toward turning this aspiration into reality. LLMs are believed to hold the most potential and value in education, especially in the creation of AI-driven virtual teachers that facilitate language learning. This study assesses the effectiveness of LLMs in the educational domain, specifically in spoken language learning, which encompasses phonetics, phonology, and second language acquisition. To this end, we first introduced a new multiple-choice question dataset to evaluate LLMs in these scenarios, covering both the understanding and the application of spoken language knowledge. We then investigated the influence of various prompting techniques, including zero- and few-shot prompting (prepending the question with question-answer exemplars), chain-of-thought (CoT) prompting, in-domain exemplars, and external tools, and conducted a comprehensive evaluation of 20 popular LLMs using these methods. The experimental results showed that questions probing conceptual knowledge posed few challenges for these LLMs, whereas questions requiring the application of that knowledge were relatively difficult. In addition, prompting methods that have been widely shown to be effective, when combined with domain-specific exemplars, yielded significant performance improvements over the zero-shot baselines. Further preliminary experiments highlighted the strengths and weaknesses of the different LLMs. The findings of this study can shed light on the application of LLMs to spoken language learning.
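As a rough illustration of the prompting strategies named in the abstract, the Python sketch below shows how a multiple-choice item could be wrapped in zero-shot, few-shot, and chain-of-thought prompts. This is not the authors' released code or dataset; the helper functions, the exemplar, and the sample questions are hypothetical and serve only to make the prompt formats concrete.

# A minimal sketch, not the authors' implementation: the exemplar and sample
# questions below are hypothetical, used only to illustrate the prompt formats.

EXEMPLARS = [
    {
        "question": "Which of the following English consonants is a voiced bilabial stop?",
        "options": ["A. /p/", "B. /b/", "C. /t/", "D. /k/"],
        "answer": "B",
    },
]

def format_item(question, options):
    """Render one multiple-choice item: the question followed by its options."""
    return question + "\n" + "\n".join(options)

def zero_shot_prompt(question, options):
    """Zero-shot: ask for the answer directly, with no exemplars."""
    return format_item(question, options) + "\nAnswer:"

def few_shot_prompt(question, options):
    """Few-shot: prepend question-answer exemplars before the target item."""
    shots = "\n\n".join(
        format_item(e["question"], e["options"]) + "\nAnswer: " + e["answer"]
        for e in EXEMPLARS
    )
    return shots + "\n\n" + zero_shot_prompt(question, options)

def cot_prompt(question, options):
    """Chain of thought: elicit step-by-step reasoning before the final letter."""
    return (format_item(question, options)
            + "\nLet's think step by step, then give the final answer as a single letter.")

if __name__ == "__main__":
    q = "Which English vowel is typically described as high, front, and tense?"
    opts = ["A. /i/", "B. /ɪ/", "C. /æ/", "D. /ʌ/"]
    print(few_shot_prompt(q, opts))

Under this reading, the "in-domain exemplars" condition would correspond to drawing EXEMPLARS from the same phonetics, phonology, or second-language-acquisition material as the target questions, rather than from generic question-answer pairs.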
Pages: 18