Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited: 4
Authors
Jimenez, Rafael Zequeira [1 ]
Llagostera, Anna [2 ]
Naderi, Babak [1 ]
Moeller, Sebastian [3 ]
Berger, Jens [2 ]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektbüro Berlin, Berlin, Germany
Source
COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing;
DOI
10.1145/3308560.3317084
CLC Classification
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Crowdsourcing is a powerful tool for conducting subjective user studies with large numbers of participants. However, collecting reliable annotations of speech quality is challenging: the task is highly subjective, and crowdworkers operate without supervision. This work investigates intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted both in the laboratory and in crowdsourcing, in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and P.808, respectively. The speech samples were taken from the database of ITU-T Rec. P.501 Annex D and were presented four times to each listener. Finally, the crowdsourcing results were contrasted with the ratings collected in the laboratory. A strong and significant Spearman correlation was obtained between the ratings collected in the two environments. Our analysis shows that while inter-rater agreement increased the more often listeners performed the assessment task, intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can, to some extent, represent a source of information.
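As an illustration, here is a minimal Python sketch of the kind of analysis the abstract describes: per-stimulus Mean Opinion Scores (MOS) are computed for each environment and compared with Spearman's rank correlation, and repeated presentations of the same stimuli support a simple intra-rater check. This is not the authors' code; all counts and ratings below are hypothetical placeholders.

# Hedged sketch of lab-vs-crowdsourcing MOS comparison; data are random placeholders.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(42)

n_stimuli = 48  # hypothetical count; the paper uses samples from ITU-T Rec. P.501 Annex D

# Hypothetical 5-point ACR ratings: rows = stimuli, columns = listeners.
lab = rng.integers(1, 6, size=(n_stimuli, 24))
crowd = rng.integers(1, 6, size=(n_stimuli, 60))

# Per-stimulus Mean Opinion Score (MOS) in each environment.
lab_mos = lab.mean(axis=1)
crowd_mos = crowd.mean(axis=1)

# Rank correlation between the two environments, as reported in the abstract.
rho, p = spearmanr(lab_mos, crowd_mos)
print(f"lab vs. crowdsourcing: Spearman rho = {rho:.2f}, p = {p:.3g}")

# Intra-rater sketch: each stimulus was presented four times, so one simple
# indicator is the correlation of one listener's ratings across repetitions
# (here: presentation 1 vs. presentation 4, on random placeholder data).
reps = rng.integers(1, 6, size=(4, n_stimuli))  # one hypothetical listener
rho_intra, _ = spearmanr(reps[0], reps[3])
print(f"intra-rater (rep 1 vs. rep 4): rho = {rho_intra:.2f}")

On random placeholder data these correlations will naturally hover near zero; with real ratings the abstract reports a strong, significant correlation between the laboratory and crowdsourcing results.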
Pages: 1138 - 1143
Page count: 6