Purpose
Residency programs around the world use multisource feedback (MSF) to evaluate learners' performance, but studies of the reliability of MSF show mixed results. This study aimed to determine the reliability of MSF across occasions with varying numbers of assessors from different professional groups (physicians and nonphysicians), and to examine how those groups affect the reliability of the assessment for the different competencies they rate.

Method
The authors collected data from 2008 to 2012 from electronically completed MSF questionnaires. In total, 428 residents completed 586 MSF occasions, and 5,020 assessors provided feedback. The authors used generalizability theory to analyze the reliability of MSF for multiple occasions, different competencies, and varying numbers of assessors and assessor groups across multiple occasions.

Results
A reliability coefficient of 0.800 can be achieved with two MSF occasions completed by at least 10 assessors per group, or with three MSF occasions completed by 5 assessors per group. Nonphysicians' scores for the Scholar and Health advocate competencies and physicians' scores for the Health advocate competency had a negative effect on the composite reliability.

Conclusions
A feasible number of assessors per MSF occasion can reliably assess residents' performance. Scores from a single occasion should be interpreted cautiously; however, every occasion can provide valuable feedback for learning. This research confirms that the (unique) characteristics of different assessor groups should be considered when interpreting MSF results, as reliability appears to be influenced by the assessor groups and competencies included. These findings will enhance the utility of MSF during residency training.
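The trade-off reported in the Results (two occasions with 10 assessors per group versus three occasions with 5) follows the usual decision-study logic of generalizability theory: measurement error shrinks as the number of occasions and assessors increases. The sketch below illustrates that logic with a standard single-facet-nested G-coefficient formula; the variance components are hypothetical placeholders, not estimates from this study, and the function name is our own.

```python
# Illustrative D-study projection of an MSF reliability (G) coefficient.
# NOTE: the variance components below are HYPOTHETICAL, chosen only to
# show the shape of the trade-off; they are not the study's estimates.

def g_coefficient(var_person, var_occasion, var_residual,
                  n_occasions, n_assessors):
    """Erho^2 = s_p^2 / (s_p^2 + s_po^2/n_o + s_res^2/(n_o * n_a))."""
    error = (var_occasion / n_occasions
             + var_residual / (n_occasions * n_assessors))
    return var_person / (var_person + error)

# Hypothetical components: true-score, person-by-occasion, residual.
VP, VO, VR = 0.30, 0.10, 0.60

two_occ_ten = g_coefficient(VP, VO, VR, n_occasions=2, n_assessors=10)
three_occ_five = g_coefficient(VP, VO, VR, n_occasions=3, n_assessors=5)
print(f"2 occasions x 10 assessors: G = {two_occ_ten:.3f}")
print(f"3 occasions x 5 assessors:  G = {three_occ_five:.3f}")
```

With these placeholder components, both designs land near the same coefficient, mirroring the study's finding that extra occasions can compensate for fewer assessors per occasion; the actual values depend entirely on the estimated variance components.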