Evaluation of inter-observer reliability in the case of trichotomous and four-level animal-based welfare indicators with two observers

Cited by: 2
Authors
Torsiello, Benedetta [1]
Giammarino, Mauro [2]
Quatto, Piero [3]
Battini, Monica [4]
Mattiello, Silvana [4]
Battaglini, Luca [1]
Renna, Manuela [5]
Affiliations
[1] Univ Torino, Dipartimento Sci Agr Forestali & Alimentari, Grugliasco, Italy
[2] Asl TO3, Dipartimento Prevenz, Serv Vet, Area Sanita Anim, Piossasco, Italy
[3] Univ Milano Bicocca, Dipartimento Econ Metodi Quantitat & Strategie Imp, Milan, Italy
[4] Univ Milan, Dipartimento Sci Agr & Ambientali Prod, Terr, Agroenergia, Milan, Italy
[5] Univ Torino, Dipartimento Sci Vet, Grugliasco, Italy
Keywords
Agreement index; animal-based measure; inter-observer reliability; bootstrap method; three- and four-level indicators; interrater reliability; assessment protocol; scoring system; weighted kappa; agreement; coefficient; farms; prevalence
DOI
10.1080/1828051X.2024.2367681
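Chinese Library Classification (CLC) number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905
Abstract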
This study assesses inter-observer reliability (IOR) between two observers for trichotomous and four-level animal-based welfare indicators assessed at the individual level. Body Condition Score (BCS) and Knee calluses (KNC) were chosen as trichotomous indicators; data were collected in fourteen intensively managed dairy goat farms in Italy (ITF1 to ITF7) and Portugal (PTF1 to PTF7) and in extensively managed dairy goat farms exploiting three alpine pastures (AP1, AP2 and AP3) in Italy. Ear posture (EP) and Eye white (EW) were chosen as four-level indicators; data were collected in three intensively managed dairy cattle farms (F1, F2 and F3) in Italy. The performance of the most documented agreement indices was compared. For trichotomous indicators, Scott's pi, Cohen's K, Cohen's KC, Cohen's weighted K and Krippendorff's alpha were affected by the paradox effect: even when the concordance rate (P0) was high, they sometimes gave very low or even negative values (e.g. P0(BCS-ITF3) = 74%; Scott's pi = 0.05; Cohen's K = 0.09; Krippendorff's alpha = 0.06; P0(BCS-AP3) = 74%; Scott's pi = -0.12; Cohen's K = Krippendorff's alpha = -0.11). Bangdiwala's B, Gwet's gamma(AC1) and Quatto's weighted S were not affected by this phenomenon and provided values very close to P0 (e.g. P0(KNC-PTF1) = 88%; Bangdiwala's B = Gwet's gamma(AC1) = 0.85; P0(BCS-AP1) = 82%; Bangdiwala's B = Gwet's gamma(AC1) = 0.79). For four-level indicators, Cohen's K and Krippendorff's alpha were not affected by the paradox effect. However, Cohen's KC in some cases exceeded the observed P0 (e.g. P0(EP-F3) = 78%; Cohen's KC = 1). Gwet's gamma(AC1) showed the best results for four-level indicators (e.g. P0(EP-F1) = 88%; Gwet's gamma(AC1) = 0.86), followed by Quatto's S and Holley and Guilford's G (e.g. P0(EP-F1) = 88%; Quatto's S = Holley and Guilford's G = 0.84).
To evaluate IOR between two observers, Bangdiwala's B, Gwet's gamma(AC1) and Quatto's weighted S are suggested for trichotomous indicators, while Gwet's gamma(AC1), Quatto's S and Holley and Guilford's G are suggested for four-level indicators.
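The paradox effect described in the abstract can be reproduced with a small calculation. The sketch below is illustrative only: the 3x3 table is hypothetical data (not taken from the study), the function name `agreement_indices` is an assumption, and the formulas are the commonly used unweighted two-observer definitions of Cohen's K, Scott's pi and Gwet's gamma(AC1), which may differ in detail from the variants the authors applied.

```python
def agreement_indices(table):
    """Unweighted agreement indices for two observers.

    table[i][j] = number of animals scored i by observer A and j by observer B.
    Returns P0 plus three chance-corrected indices of the form (P0 - Pe) / (1 - Pe).
    """
    q = len(table)                          # number of score levels
    n = sum(sum(row) for row in table)      # number of animals assessed
    p0 = sum(table[i][i] for i in range(q)) / n

    # Marginal proportions for each observer and their average per level.
    row = [sum(table[i]) / n for i in range(q)]
    col = [sum(table[i][j] for i in range(q)) / n for j in range(q)]
    mean = [(r + c) / 2 for r, c in zip(row, col)]

    # Chance-agreement terms: this is where the indices differ.
    pe_cohen = sum(r * c for r, c in zip(row, col))
    pe_scott = sum(m * m for m in mean)
    pe_gwet = sum(m * (1 - m) for m in mean) / (q - 1)

    def corrected(pe):
        return (p0 - pe) / (1 - pe)

    return {
        "P0": p0,
        "cohen_kappa": corrected(pe_cohen),
        "scott_pi": corrected(pe_scott),
        "gwet_ac1": corrected(pe_gwet),
    }


# Hypothetical trichotomous indicator with a very prevalent middle score.
table = [
    [1, 4, 1],
    [4, 82, 4],
    [1, 2, 1],
]
result = agreement_indices(table)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With this skewed table the observers agree on 84% of animals, yet Cohen's K and Scott's pi come out near 0.2 because their chance-agreement terms are dominated by the prevalent category, while Gwet's gamma(AC1) stays close to P0.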
Pages: 938-960
Page count: 23