Preffi 2.0 is an evidence-based Dutch quality assessment instrument for health promotion interventions. It is intended mainly for planning and assessing one's own projects, but can also be used to assess other people's projects (external use). This article reports a study on the reliability of Preffi as an external quality assessment instrument. Preffi is used to assess quality at three levels: (i) specific criteria, (ii) clusters of criteria and (iii) entire projects. The study compared Preffi-based assessments of 20 projects by three practitioners with their intuitive assessments of the same projects and with assessments by three experts, which were intended to serve as external criteria. The intuitive assessments related only to the cluster and project levels. Our main hypothesis was that intuitive assessments by practitioners would be less reliable and accurate than their Preffi-based assessments and the experts' assessments. On the whole, we failed to confirm this hypothesis: the experts' assessments proved less reliable and accurate than the practitioners' intuitive and Preffi-based assessments, and differed too much from each other to be used as external criteria. The Preffi-based assessments by the practitioners showed acceptable reliability (generalizability coefficient, G) and accuracy (standard error of measurement). At the level of the entire project, two assessors are needed to produce sufficiently reliable and accurate assessments, whereas three are needed for assessment at cluster level. The study also showed that different assessors use different perspectives and base their assessments on a variety of aspects. The assessors themselves regarded this as inevitable and even useful. Discussions between assessors are important to achieve consensus. The article suggests some improvements to Preffi to further increase its reliability.