Background: The objective of the present study was to assess the interobserver reproducibility (in terms of reliability and agreement) of active and passive measurements of knee range of motion (RoM) using a long-arm goniometer, performed by trained physical therapists in a clinical setting in total knee arthroplasty patients within the first four days after surgery.

Methods: Test-retest analysis. Setting: University hospital departments of orthopaedics and physical therapy. Participants: Two experienced physical therapists assessed 30 patients three days after total knee arthroplasty. Main outcome measure: RoM measured with a long-arm (50 cm) goniometer. Agreement was calculated as the mean difference between observers ± 95% CI of this mean difference. The intraclass correlation coefficient (ICC) was calculated as a measure of reliability, based on two-way random effects analysis of variance.

Results: The lowest level of agreement was found for measurement of passive flexion with the patient in supine position (mean difference 1.4; limits of agreement -16.2 to 19 degrees for the difference between the two observers). The highest levels of agreement were found for measurement of passive flexion with the patient in sitting position and for measurement of passive extension (mean difference 2.7; limits of agreement -6.7 to 12.1 degrees, and mean difference 2.2; limits of agreement -6.2 to 10.6 degrees, respectively). The ability to differentiate between subjects, expressed as the ICC, ranged from 0.62 for measurement of passive extension to 0.89 for measurement of active flexion.

Conclusion: Interobserver agreement for flexion as well as extension was only fair. When two different observers assess the same patients in the acute phase after total knee arthroplasty using a long-arm goniometer, differences in RoM of less than eight degrees cannot be distinguished from measurement error. Reliability was acceptable for comparisons at group level, but poor for individual comparisons over time.
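
The agreement and reliability statistics described in the Methods can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis code. It assumes that the limits of agreement are the mean interobserver difference ± 1.96 standard deviations of the paired differences (the Bland-Altman convention), and that the ICC is the single-measure, absolute-agreement form of the two-way random effects model, often written ICC(2,1); the abstract specifies neither. The function names and the four pairs of readings are hypothetical and only exercise the code.

```python
from statistics import mean, stdev


def limits_of_agreement(obs_a, obs_b, z=1.96):
    """Return (mean difference, lower limit, upper limit) in degrees.

    Assumption: limits are mean difference +/- z * SD of the paired
    differences between the two observers (Bland-Altman convention).
    """
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample SD, n - 1 in the denominator
    return d_mean, d_mean - z * d_sd, d_mean + z * d_sd


def icc_2_1(obs_a, obs_b):
    """ICC from a two-way random effects ANOVA with two observers.

    Assumption: single-measure, absolute-agreement form, ICC(2,1).
    """
    rows = list(zip(obs_a, obs_b))  # one row per patient
    n, k = len(rows), 2             # n patients, k observers
    grand = mean(v for row in rows for v in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(obs_a), mean(obs_b)]

    # Sums of squares for patients, observers, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical flexion readings in degrees, NOT data from the study.
    therapist_1 = [90, 85, 100, 95]
    therapist_2 = [88, 86, 97, 90]
    d, lower, upper = limits_of_agreement(therapist_1, therapist_2)
    print(f"mean difference {d:.2f}; limits of agreement {lower:.2f} to {upper:.2f}")
    print(f"ICC(2,1) = {icc_2_1(therapist_1, therapist_2):.2f}")
```

With only two observers, the ANOVA error mean square equals half the variance of the paired differences, so the two statistics draw on the same measurement error. The ICC additionally scales that error against the spread between patients, which is why a sample with wide limits of agreement can still show a moderate to high ICC.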