Validity: on the meaningful interpretation of assessment data

Cited by: 934
Authors
Downing, SM [1]
Affiliation
[1] Univ Illinois, Coll Med, Dept Med Educ, Chicago, IL 60612 USA
Keywords
education; medical; undergraduate; standards; educational measurement; reproducibility of results
DOI
10.1046/j.1365-2923.2003.01594.x
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Context: All assessments in medical education require evidence of validity to be interpreted meaningfully. In contemporary usage, all validity is construct validity, which requires multiple sources of evidence; construct validity is the whole of validity, but has multiple facets. Five sources - content, response process, internal structure, relationship to other variables and consequences - are noted by the Standards for Educational and Psychological Testing as fruitful areas to seek validity evidence.
Purpose: The purpose of this article is to discuss construct validity in the context of medical education and to summarize, through example, some typical sources of validity evidence for a written and a performance examination.
Summary: Assessments are not valid or invalid; rather, the scores or outcomes of assessments have more or less evidence to support (or refute) a specific interpretation (such as passing or failing a course). Validity is approached as hypothesis and uses theory, logic and the scientific method to collect and assemble data to support or fail to support the proposed score interpretations, at a given point in time. Data and logic are assembled into arguments - pro and con - for some specific interpretation of assessment data. Examples of types of validity evidence, data and information from each source are discussed in the context of a high-stakes written and performance examination in medical education.
Conclusion: All assessments require evidence of the reasonableness of the proposed interpretation, as test data in education have little or no intrinsic meaning. The constructs purported to be measured by our assessments are important to students, faculty, administrators, patients and society and require solid scientific evidence of their meaning.
Pages: 830-837
Page count: 8