Assessing the Assessment in Emergency Care Training

Mary E. W. Dankbaar; Karen M. Stegers-Jager; Frank Baarveld; Jeroen J. G. van Merrienboer; Geoff R. Norman; Frans L. Rutten; Jan L. C. M. van Saase; Stephanie C. E. Schuit

doi:10.1371/journal.pone.0114663

Corresponding author: Mary E. W. Dankbaar

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Objective: Each year over 1.5 million health care professionals attend emergency care courses. Despite high stakes for patients and extensive resources involved, little evidence exists on the quality of assessment. The aim of this study was to evaluate the validity and reliability of commonly used formats in assessing emergency care skills. Methods: Residents were assessed at the end of a 2-week emergency course; a subgroup was videotaped. Psychometric analyses were conducted to assess the validity and inter-rater reliability of the assessment instrument, which included a checklist, a 9-item competency scale and a global performance scale. Results: A group of 144 residents and 12 raters participated in the study; 22 residents were videotaped and re-assessed by 8 raters. The checklists showed limited validity and poor inter-rater reliability for the dimensions "correct" and "timely" (ICC = .30 and .39, respectively). The competency scale had good construct validity, consisting of a clinical and a communication subscale. The internal consistency of the (sub)scales was high (alpha = .93/.91/.86). The inter-rater reliability was moderate for the clinical competency subscale (.49) and the global performance scale (.50), but poor for the communication subscale (.27). A generalizability study showed that for a reliable assessment 5-13 raters are needed when using checklists, and four when using the clinical competency scale or the global performance scale. Conclusions: This study shows poor validity and reliability for assessing emergency skills with checklists but good validity and moderate reliability with clinical competency or global performance scales. Involving more raters can improve the reliability substantially. Recommendations are made to improve this high-stakes skill assessment.
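The relationship between single-rater reliability and the number of raters can be illustrated with a rough sketch. The study itself reports a generalizability study, whose details are not given in the abstract; the code below instead uses the simpler Spearman-Brown prophecy formula, and the target reliability of 0.80 is an assumption chosen for illustration. The function names are hypothetical, and the resulting rater counts are not expected to reproduce the published figures exactly.

```python
def spearman_brown(single_rater_icc: float, n_raters: int) -> float:
    """Projected reliability of the mean score across n_raters raters
    (Spearman-Brown prophecy formula)."""
    return n_raters * single_rater_icc / (1 + (n_raters - 1) * single_rater_icc)


def raters_needed(single_rater_icc: float, target: float = 0.80,
                  max_raters: int = 100) -> int:
    """Smallest number of raters whose averaged score reaches `target`.

    The 0.80 default is an assumed threshold, not a value from the abstract.
    A small tolerance guards against floating-point rounding at the boundary.
    """
    for n in range(1, max_raters + 1):
        if spearman_brown(single_rater_icc, n) >= target - 1e-9:
            return n
    raise ValueError("target reliability not reachable within max_raters")


# Single-rater reliabilities as reported in the abstract.
reported = {
    "checklist, correct": 0.30,
    "checklist, timely": 0.39,
    "clinical competency subscale": 0.49,
    "global performance scale": 0.50,
    "communication subscale": 0.27,
}

for label, icc in reported.items():
    print(f"{label}: ICC={icc:.2f}, raters needed={raters_needed(icc)}")
```

The sketch shows why low single-rater reliability (as for the checklists) requires many more raters than the moderate values found for the clinical competency and global performance scales.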

Original language	English
Article number	e114663
Journal	PLOS ONE
Volume	9
Issue number	12
DOIs	https://doi.org/10.1371/journal.pone.0114663
Publication status	Published - 18 Dec 2014

Access to Document

10.1371/journal.pone.0114663 (Licence: CC BY)

Cite this

@article{2906425baf614347a101242bb4b302ac,
title = "Assessing the Assessment in Emergency Care Training",
abstract = "Objective: Each year over 1.5 million health care professionals attend emergency care courses. Despite high stakes for patients and extensive resources involved, little evidence exists on the quality of assessment. The aim of this study was to evaluate the validity and reliability of commonly used formats in assessing emergency care skills. Methods: Residents were assessed at the end of a 2-week emergency course; a subgroup was videotaped. Psychometric analyses were conducted to assess the validity and inter-rater reliability of the assessment instrument, which included a checklist, a 9-item competency scale and a global performance scale. Results: A group of 144 residents and 12 raters participated in the study; 22 residents were videotaped and re-assessed by 8 raters. The checklists showed limited validity and poor inter-rater reliability for the dimensions ``correct'' and ``timely'' (ICC = .30 and .39, respectively). The competency scale had good construct validity, consisting of a clinical and a communication subscale. The internal consistency of the (sub)scales was high (alpha = .93/.91/.86). The inter-rater reliability was moderate for the clinical competency subscale (.49) and the global performance scale (.50), but poor for the communication subscale (.27). A generalizability study showed that for a reliable assessment 5-13 raters are needed when using checklists, and four when using the clinical competency scale or the global performance scale. Conclusions: This study shows poor validity and reliability for assessing emergency skills with checklists but good validity and moderate reliability with clinical competency or global performance scales. Involving more raters can improve the reliability substantially. Recommendations are made to improve this high-stakes skill assessment.",
author = "Dankbaar, {Mary E. W.} and Stegers-Jager, {Karen M.} and Baarveld, Frank and {van Merrienboer}, {Jeroen J. G.} and Norman, {Geoff R.} and Rutten, {Frans L.} and {van Saase}, {Jan L. C. M.} and Schuit, {Stephanie C. E.}",
year = "2014",
month = dec,
day = "18",
doi = "10.1371/journal.pone.0114663",
language = "English",
volume = "9",
journal = "PLOS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",
}

TY - JOUR

T1 - Assessing the Assessment in Emergency Care Training

AU - Dankbaar, Mary E. W.

AU - Stegers-Jager, Karen M.

AU - Baarveld, Frank

AU - van Merrienboer, Jeroen J. G.

AU - Norman, Geoff R.

AU - Rutten, Frans L.

AU - van Saase, Jan L. C. M.

AU - Schuit, Stephanie C. E.

PY - 2014/12/18

Y1 - 2014/12/18

N2 - Objective: Each year over 1.5 million health care professionals attend emergency care courses. Despite high stakes for patients and extensive resources involved, little evidence exists on the quality of assessment. The aim of this study was to evaluate the validity and reliability of commonly used formats in assessing emergency care skills. Methods: Residents were assessed at the end of a 2-week emergency course; a subgroup was videotaped. Psychometric analyses were conducted to assess the validity and inter-rater reliability of the assessment instrument, which included a checklist, a 9-item competency scale and a global performance scale. Results: A group of 144 residents and 12 raters participated in the study; 22 residents were videotaped and re-assessed by 8 raters. The checklists showed limited validity and poor inter-rater reliability for the dimensions "correct" and "timely" (ICC = .30 and .39, respectively). The competency scale had good construct validity, consisting of a clinical and a communication subscale. The internal consistency of the (sub)scales was high (alpha = .93/.91/.86). The inter-rater reliability was moderate for the clinical competency subscale (.49) and the global performance scale (.50), but poor for the communication subscale (.27). A generalizability study showed that for a reliable assessment 5-13 raters are needed when using checklists, and four when using the clinical competency scale or the global performance scale. Conclusions: This study shows poor validity and reliability for assessing emergency skills with checklists but good validity and moderate reliability with clinical competency or global performance scales. Involving more raters can improve the reliability substantially. Recommendations are made to improve this high-stakes skill assessment.

U2 - 10.1371/journal.pone.0114663

DO - 10.1371/journal.pone.0114663

M3 - Article

C2 - 25521702

SN - 1932-6203

VL - 9

JO - PLOS ONE

JF - PLOS ONE

IS - 12

M1 - e114663

ER -