Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited by: 4
Authors
Jimenez, Rafael Zequeira [1 ]
Llagostera, Anna [2 ]
Naderi, Babak [1 ]
Moeller, Sebastian [3 ]
Berger, Jens [2 ]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektbüro Berlin, Berlin, Germany
Source
COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2019) | 2019
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing;
DOI
10.1145/3308560.3317084
CLC Number
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
Crowdsourcing is a great tool for conducting subjective user studies with large numbers of users. However, collecting reliable annotations about the quality of speech stimuli is challenging: the task itself is highly subjective, and users in crowdsourcing work without supervision. This work investigates the intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted both in the laboratory and in crowdsourcing, in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and ITU-T Rec. P.808, respectively. The speech samples were taken from the ITU-T Rec. P.501 Annex D database and were presented four times to the listeners. Finally, the crowdsourcing results were contrasted with the ratings collected in the laboratory. A strong and significant Spearman's correlation was obtained between the ratings collected in the two environments. Our analysis shows that while the inter-rater agreement increased the more assessment tasks the listeners completed, the intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can, to some extent, represent a source of information.
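The analysis described in the abstract can be illustrated with a minimal Python sketch. The ratings matrices, the stand-in MOS values, and the use of per-listener and mean pairwise Spearman correlations as intra- and inter-rater measures are illustrative assumptions, not the authors' data or analysis code; the paper itself reports Spearman's correlation between laboratory and crowdsourcing ratings alongside separate intra- and inter-rater agreement results.

import numpy as np
from scipy.stats import spearmanr

# Hypothetical ratings: rows = listeners, columns = stimuli, values = 1-5 ACR scores.
# These numbers are illustrative only, not the study's data.
ratings_pass1 = np.array([
    [4, 3, 2, 5, 1],
    [5, 3, 2, 4, 2],
    [4, 4, 1, 5, 1],
])
ratings_pass2 = np.array([   # the same stimuli presented again to the same listeners
    [4, 3, 3, 5, 1],
    [5, 2, 2, 4, 2],
    [4, 4, 2, 5, 1],
])

# Intra-rater reliability: per listener, correlate the first and the repeated pass.
intra = [spearmanr(ratings_pass1[i], ratings_pass2[i]).correlation
         for i in range(ratings_pass1.shape[0])]

# Inter-rater agreement: mean pairwise correlation between listeners within one pass.
pairs = [(i, j) for i in range(ratings_pass1.shape[0])
                for j in range(i + 1, ratings_pass1.shape[0])]
inter = [spearmanr(ratings_pass1[i], ratings_pass1[j]).correlation for i, j in pairs]

# Lab vs. crowdsourcing comparison: correlate per-stimulus mean opinion scores (MOS).
lab_mos = ratings_pass1.mean(axis=0)                  # stand-in for the laboratory MOS
cs_mos = np.array([4.1, 3.2, 2.0, 4.7, 1.5])          # stand-in for the crowdsourcing MOS
rho, p = spearmanr(lab_mos, cs_mos)

print(f"intra-rater reliability (per listener): {np.round(intra, 2)}")
print(f"inter-rater agreement (mean pairwise): {np.mean(inter):.2f}")
print(f"lab vs. crowdsourcing: rho={rho:.2f}, p={p:.3f}")

With enough listeners and stimuli, the same per-stimulus MOS comparison yields the lab-versus-crowdsourcing correlation the abstract refers to; other agreement coefficients (e.g., ICC or Krippendorff's alpha) could be substituted for the pairwise-correlation measure used here.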
Pages: 1138 - 1143
Page count: 6