Use of Generalizability Theory for Exploring Reliability of and Sources of Variance in Assessment of Technical Skills: A Systematic Review and Meta-Analysis

Cited by: 18
Authors
Andersen, Steven Arild Wuyts [1 ,2 ,3 ]
Nayahangan, Leizl Joy [4 ]
Park, Yoon Soo [5 ]
Konge, Lars [6 ]
Affiliations
[1] Copenhagen Acad Med Educ & Simulat CAMES, Ctr Human Resources & Educ, Copenhagen, Denmark
[2] Ohio State Univ, Dept Otolaryngol, Columbus, OH USA
[3] Dept Otorhinolaryngol Head & Neck Surg, Otorhinolaryngol, Copenhagen, Denmark
[4] Ctr Human Resources & Educ, CAMES, Copenhagen, Denmark
[5] Massachusetts Gen Hosp, Hlth Profess Educ Res, Harvard Med Sch, Boston, MA USA
[6] Univ Copenhagen, Head Res CAMES, Med Educ, Ctr Human Resources & Educ, Copenhagen, Denmark
Keywords
OBJECTIVE STRUCTURED ASSESSMENT; OTTAWA SURGICAL COMPETENCE; WORKPLACE-BASED ASSESSMENT; GLOBAL RATING-SCALES; PROCEDURAL SKILLS; VALID ASSESSMENT; RELIABLE ASSESSMENT; O-SCORE; PERFORMANCE; SURGERY;
DOI
10.1097/ACM.0000000000004150
Chinese Library Classification number
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
Purpose: Competency-based education relies on the validity and reliability of assessment scores. Generalizability (G) theory is well suited to explore the reliability of assessment tools in medical education but has only been applied to a limited extent. This study aimed to systematically review the literature using G-theory to explore the reliability of structured assessment of medical and surgical technical skills and to assess the relative contributions of different factors to variance.
Method: In June 2020, 11 databases, including PubMed, were searched from inception through May 31, 2020. Eligible studies included the use of G-theory to explore reliability in the context of assessment of medical and surgical technical skills. Descriptive information on study, assessment context, assessment protocol, participants being assessed, and G-analyses was extracted. Data were used to map G-theory and explore variance components analyses. A meta-analysis was conducted to synthesize the extracted data on the sources of variance and reliability.
Results: Forty-four studies were included; of these, 39 had sufficient data for meta-analysis. The total pool included 35,284 unique assessments of 31,496 unique performances of 4,154 participants. Person variance had a pooled effect of 44.2% (95% confidence interval [CI], 36.8%-51.5%). Only assessment tool type (Objective Structured Assessment of Technical Skills-type vs task-based checklist-type) had a significant effect on person variance. The pooled reliability (G-coefficient) was 0.65 (95% CI, 0.59-0.70). Most studies included decision studies (39, 88.6%) and generally seemed to have higher ratios of performances to assessors to achieve a sufficiently reliable assessment.
Conclusions: G-theory is increasingly being used to examine reliability of technical skills assessment in medical education, but more rigor in reporting is warranted. Contextual factors can potentially affect variance components and thereby reliability estimates and should be considered, especially in high-stakes assessment. Reliability analysis should be a best practice when developing assessment of technical skills.
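As a rough illustration of how a G-coefficient relates to variance components and how a decision study projects reliability, the sketch below uses a deliberately simplified single-facet design (persons crossed with one facet, such as assessors). It assumes, purely for illustration, that all non-person variance acts as relative error; this is not the design or method of the reviewed studies, and the function names are hypothetical. The 44.2% figure is the pooled person variance reported in the abstract.

```python
def g_coefficient(var_person: float, var_error: float, n_conditions: int) -> float:
    """Relative G-coefficient for a one-facet crossed design.

    Averaging over n_conditions (e.g., assessors) divides the relative
    error variance by n_conditions.
    """
    if n_conditions < 1:
        raise ValueError("n_conditions must be at least 1")
    return var_person / (var_person + var_error / n_conditions)


def min_conditions_for_target(var_person: float, var_error: float,
                              target: float, max_n: int = 1000) -> int:
    """Decision-study style search: smallest n whose projected G meets the target."""
    for n in range(1, max_n + 1):
        if g_coefficient(var_person, var_error, n) >= target:
            return n
    raise ValueError("target not reachable within max_n conditions")


# Illustrative inputs only: pooled person variance of 44.2% from the abstract,
# with the remaining 55.8% treated as relative error (a simplifying assumption).
VAR_PERSON = 0.442
VAR_ERROR = 1.0 - VAR_PERSON

if __name__ == "__main__":
    for n in (1, 2, 4, 6):
        print(n, round(g_coefficient(VAR_PERSON, VAR_ERROR, n), 3))
    print(min_conditions_for_target(VAR_PERSON, VAR_ERROR, target=0.8))
```

Real G-studies typically involve several facets (assessors, cases, items, occasions) and their interactions, so the error term would be a sum of components, each divided by its own number of conditions.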
Pages: 1609-1619
Number of pages: 11