Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets

Cited by: 0

Authors:

Druart, Lucas [1,2]

Vielzeuf, Valentin [2]

Esteve, Yannick [1]

Affiliations:

[1] Lab Informat Avignon, Avignon, France

[2] Orange Innovat, Rennes, France

Source:

TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II | 2024 / Vol. 15049

Keywords:

spoken dialogue systems; automatic annotation; large language models; spoken language understanding

DOI:

10.1007/978-3-031-70566-3_18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In spoken Task-Oriented Dialogue (TOD) systems, the choice of the semantic representation describing the users' requests is key to a smooth interaction. Indeed, the system uses this representation to reason over a database and its domain knowledge to choose its next action. The course of the dialogue thus depends on the information provided by this semantic representation. While textual datasets provide fine-grained semantic representations, spoken dialogue datasets fall behind. This paper provides insights into the automatic enhancement of spoken dialogue datasets' semantic representations. Our contributions are threefold: (1) assess the relevance of Large Language Model fine-tuning, (2) evaluate the knowledge captured by the produced annotations, and (3) highlight the implications of semi-automatic annotation.


Pages: 199-209

Page count: 11

