Comparison of Academic, Administrative and Community Rater Scores at a Multiple Mini Interview Using Generalisability Theory

Chew Fei Sow*, Carlos Collares, Allan Pau, Cees Van Der Vleuten

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Multiple Mini Interviews (MMIs) are sampling approaches that use multiple short stations to select prospective students for professional programmes. Each station uses a different interview scenario and rater to assess candidates' non-cognitive skills effectively. This study compared the performances of three sets of raters (academics, administrative staff, and community members) in an MMI for student selection, using performance comparisons and Generalisability Theory to estimate the different sources of variance and the generalisability (reliability) coefficients. The study aims to analyse the differences in performance scores from these raters and their psychometric projections on reliability with different samples of raters and stations. Eleven candidates participated in the 10-station MMI; each station lasted eight minutes with two minutes of preparation, and an academic assessed performance using a marking rubric. The entire interview was video recorded. The administrative staff and community members watched the videos independently and graded all candidates' performances using the same marking rubric. Generalisability and Decision studies were used to analyse the collected data. Community members were the strictest raters, while academics were the most lenient. There were statistically significant differences between rater categories in six stations. The generalisability coefficient of 0.85 for one rater from the Decision study suggested good reliability of the 10-station MMI. The Decision study found that generalisability coefficients improved more with an increasing number of raters than with an increasing number of stations. Four stations contributed to unreliability in each rater category and in a combination of the rater categories. Information on the number of stations, number of raters, and type of rater combination required to achieve good reliability enabled informed decisions on the process and implementation of the MMI. Examining the station simulations that influenced unreliability helped us improve station writing and identify focus areas for training and development.
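For readers unfamiliar with Decision studies, the projections described above typically rest on the generalisability coefficient for a crossed persons (p) × stations (s) × raters (r) design. The formula below is a standard textbook sketch under that assumed design, not an equation taken from the paper itself:

$$
E\rho^{2} \;=\; \frac{\sigma^{2}_{p}}{\sigma^{2}_{p} \;+\; \dfrac{\sigma^{2}_{ps}}{n_{s}} \;+\; \dfrac{\sigma^{2}_{pr}}{n_{r}} \;+\; \dfrac{\sigma^{2}_{psr,e}}{n_{s}\,n_{r}}}
$$

Here $\sigma^{2}_{p}$ is the variance attributable to candidates, $\sigma^{2}_{ps}$ and $\sigma^{2}_{pr}$ are the candidate-by-station and candidate-by-rater interaction variances, $\sigma^{2}_{psr,e}$ is the residual, and $n_{s}$ and $n_{r}$ are the numbers of stations and raters assumed in the projection. When the rater-linked variance components dominate, increasing $n_{r}$ raises the coefficient faster than adding stations, which is consistent with the Decision study finding reported above.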
Original language: English
Pages (from-to): 41-53
Number of pages: 13
Journal: Education in Medicine Journal
Volume: 15
Issue number: 3
DOIs
Publication status: Published - 1 Jan 2023

Keywords

  • Community raters
  • Compare assessors
  • Generalisability Theory
  • Multiple Mini Interview
  • Student selection
