The Hidden Value of Narrative Comments for Assessment: A Quantitative Reliability Analysis of Qualitative Data

Shiphra Ginsburg; Cees P. M. van der Vleuten; Kevin W. Eva

doi:10.1097/ACM.0000000000001669

Shiphra Ginsburg^*, Cees P. M. van der Vleuten, Kevin W. Eva

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Purpose

In-training evaluation reports (ITERs) are ubiquitous in internal medicine (IM) residency. Written comments can provide a rich data source, yet are often overlooked. This study determined the reliability of using variable amounts of commentary to discriminate between residents.

Method

ITER comments from two cohorts of PGY-1s in IM at the University of Toronto (graduating 2010 and 2011; n = 46-48) were put into sets containing 15 to 16 residents. Parallel sets were created: one with comments from the full year and one with comments from only the first three assessments. Each set was rank-ordered by four internists external to the program between April 2014 and May 2015 (n = 24). Generalizability analyses and a decision study were performed.

Results

For the full year of comments, reliability coefficients averaged across four rankers were G = 0.85 and G = 0.91 for the two cohorts. For a single ranker, G = 0.60 and G = 0.73. Using only the first three assessments, reliabilities remained high at G = 0.66 and G = 0.60 for a single ranker. In a decision study, if two internists ranked the first three assessments, reliability would be G = 0.80 and G = 0.75 for the two cohorts.
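The decision-study figures above can be approximated from the single-ranker coefficients. The sketch below assumes a simple design in which rankers are the only facet being averaged over, so that the projection reduces to the Spearman-Brown form G_k = k·G_1 / (1 + (k − 1)·G_1). The abstract does not state which formula the authors used, and the function name `projected_g` is hypothetical. Because the published single-ranker values are rounded, the projections may differ from the reported coefficients in the second decimal place.

```python
def projected_g(single_ranker_g: float, n_rankers: int) -> float:
    """Project the reliability of the mean ranking across n_rankers.

    Assumes rankers are the only averaged facet (Spearman-Brown form);
    this is an illustrative assumption, not the authors' stated method.
    """
    if not 0.0 <= single_ranker_g <= 1.0:
        raise ValueError("single_ranker_g must lie in [0, 1]")
    if n_rankers < 1:
        raise ValueError("n_rankers must be at least 1")
    return n_rankers * single_ranker_g / (1.0 + (n_rankers - 1) * single_ranker_g)


# Single-ranker coefficients as reported in the abstract (already rounded).
first_three_assessments = [0.66, 0.60]
full_year = [0.60, 0.73]

for g1 in first_three_assessments:
    print(f"first three assessments, 2 rankers: G1={g1:.2f} -> {projected_g(g1, 2):.3f}")

for g1 in full_year:
    print(f"full year, 4 rankers: G1={g1:.2f} -> {projected_g(g1, 4):.3f}")
```

Under this assumption, two rankers reading the first three assessments give projections close to the reported G = 0.80 and G = 0.75, and four rankers reading the full year give projections close to the reported G = 0.85 and G = 0.91.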

Conclusions

Using written comments to discriminate between residents can be extremely reliable even after only several reports are collected. This suggests a way to identify residents early on who may require attention. These findings contribute evidence to support the validity argument for using qualitative data for assessment.

Original language	English
Pages (from-to)	1617-1621
Number of pages	5
Journal	Academic Medicine
Volume	92
Issue number	11
DOIs	https://doi.org/10.1097/ACM.0000000000001669
Publication status	Published - Nov 2017

Keywords

TRAINING EVALUATION REPORTS
STUDENTS
PERSPECTIVES
PERFORMANCE
MILESTONES
RESIDENTS
VALIDITY
FORM

Access to Document

10.1097/ACM.0000000000001669

Full text: Final published version, 361 KB. Licence: Taverne

Cite this

@article{56953392bbe14137b9a76085ab8e9e67,

title = "The Hidden Value of Narrative Comments for Assessment: A Quantitative Reliability Analysis of Qualitative Data",

abstract = "Purpose: In-training evaluation reports (ITERs) are ubiquitous in internal medicine (IM) residency. Written comments can provide a rich data source, yet are often overlooked. This study determined the reliability of using variable amounts of commentary to discriminate between residents. Method: ITER comments from two cohorts of PGY-1s in IM at the University of Toronto (graduating 2010 and 2011; n = 46-48) were put into sets containing 15 to 16 residents. Parallel sets were created: one with comments from the full year and one with comments from only the first three assessments. Each set was rank-ordered by four internists external to the program between April 2014 and May 2015 (n = 24). Generalizability analyses and a decision study were performed. Results: For the full year of comments, reliability coefficients averaged across four rankers were G = 0.85 and G = 0.91 for the two cohorts. For a single ranker, G = 0.60 and G = 0.73. Using only the first three assessments, reliabilities remained high at G = 0.66 and G = 0.60 for a single ranker. In a decision study, if two internists ranked the first three assessments, reliability would be G = 0.80 and G = 0.75 for the two cohorts. Conclusions: Using written comments to discriminate between residents can be extremely reliable even after only several reports are collected. This suggests a way to identify residents early on who may require attention. These findings contribute evidence to support the validity argument for using qualitative data for assessment.",

keywords = "TRAINING EVALUATION REPORTS, STUDENTS, PERSPECTIVES, PERFORMANCE, MILESTONES, RESIDENTS, VALIDITY, FORM",

author = "Shiphra Ginsburg and {van der Vleuten}, {Cees P. M.} and Eva, {Kevin W.}",

year = "2017",

month = nov,

doi = "10.1097/ACM.0000000000001669",

language = "English",

volume = "92",

pages = "1617--1621",

journal = "Academic Medicine",

issn = "1040-2446",

publisher = "LIPPINCOTT WILLIAMS \& WILKINS",

number = "11",

}

TY - JOUR

T1 - The Hidden Value of Narrative Comments for Assessment

T2 - A Quantitative Reliability Analysis of Qualitative Data

AU - Ginsburg, Shiphra

AU - van der Vleuten, Cees P. M.

AU - Eva, Kevin W.

PY - 2017/11

Y1 - 2017/11

N2 - Purpose: In-training evaluation reports (ITERs) are ubiquitous in internal medicine (IM) residency. Written comments can provide a rich data source, yet are often overlooked. This study determined the reliability of using variable amounts of commentary to discriminate between residents. Method: ITER comments from two cohorts of PGY-1s in IM at the University of Toronto (graduating 2010 and 2011; n = 46-48) were put into sets containing 15 to 16 residents. Parallel sets were created: one with comments from the full year and one with comments from only the first three assessments. Each set was rank-ordered by four internists external to the program between April 2014 and May 2015 (n = 24). Generalizability analyses and a decision study were performed. Results: For the full year of comments, reliability coefficients averaged across four rankers were G = 0.85 and G = 0.91 for the two cohorts. For a single ranker, G = 0.60 and G = 0.73. Using only the first three assessments, reliabilities remained high at G = 0.66 and G = 0.60 for a single ranker. In a decision study, if two internists ranked the first three assessments, reliability would be G = 0.80 and G = 0.75 for the two cohorts. Conclusions: Using written comments to discriminate between residents can be extremely reliable even after only several reports are collected. This suggests a way to identify residents early on who may require attention. These findings contribute evidence to support the validity argument for using qualitative data for assessment.

KW - TRAINING EVALUATION REPORTS

KW - STUDENTS

KW - PERSPECTIVES

KW - PERFORMANCE

KW - MILESTONES

KW - RESIDENTS

KW - VALIDITY

KW - FORM

U2 - 10.1097/ACM.0000000000001669

DO - 10.1097/ACM.0000000000001669

M3 - Article

C2 - 28403004

SN - 1040-2446

VL - 92

SP - 1617

EP - 1621

JO - Academic Medicine

JF - Academic Medicine

IS - 11

ER -