The small-sample properties of two types of Chow tests are investigated in the context of multiple time series systems. It is found that the tests may have substantially distorted size if the sample size is not large relative to the number of parameters in the model under study. In particular, the tests reject a true null hypothesis of parameter constancy far too often in this situation. It is shown that bootstrap versions of the tests have much better properties in this respect. In other words, the bootstrap can be used to size-adjust the tests.
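
To make the idea of a bootstrap size adjustment concrete, the following is a minimal sketch and not the procedure used in the study. It assumes a VAR(1) with intercept, a known candidate break date, a simplified sample-split LR-type statistic, and a residual bootstrap under the null of constant parameters. All function names (`simulate_var1`, `chow_stat`, `bootstrap_chow_pvalue`) and settings such as `n_boot=199` are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_var1(A, T, seed=0):
    """Simulate a zero-mean VAR(1) y_t = A y_{t-1} + u_t with N(0, I) errors."""
    rng = np.random.default_rng(seed)
    k = A.shape[0]
    y = np.zeros((T, k))
    for t in range(1, T):
        y[t] = A @ y[t - 1] + rng.standard_normal(k)
    return y


def _design(y):
    """Regressors [1, y_{t-1}] and targets y_t for a VAR(1) with intercept."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    return X, y[1:]


def _ols(X, Y):
    """Multivariate OLS; returns the coefficient matrix and the residuals."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B, Y - X @ B


def chow_stat(y, break_idx):
    """Simplified sample-split LR-type statistic (an illustrative assumption).

    Compares the residual covariance of one full-sample fit with the pooled
    residual covariance of separate fits before and after break_idx.
    """
    X, Y = _design(y)
    n = len(Y)
    _, u_r = _ols(X, Y)                            # restricted: constant parameters
    _, u_1 = _ols(X[:break_idx], Y[:break_idx])    # first regime
    _, u_2 = _ols(X[break_idx:], Y[break_idx:])    # second regime
    sigma_r = u_r.T @ u_r / n
    sigma_u = (u_1.T @ u_1 + u_2.T @ u_2) / n
    return n * (np.linalg.slogdet(sigma_r)[1] - np.linalg.slogdet(sigma_u)[1])


def bootstrap_chow_pvalue(y, break_idx, n_boot=199, seed=0):
    """Residual-bootstrap p-value for chow_stat under the no-break null."""
    rng = np.random.default_rng(seed)
    X, Y = _design(y)
    B, u = _ols(X, Y)
    u = u - u.mean(axis=0)                         # centre residuals before resampling
    stat = chow_stat(y, break_idx)
    exceed = 0
    for _ in range(n_boot):
        draws = u[rng.integers(0, len(u), size=len(u))]
        y_star = np.empty_like(y)
        y_star[0] = y[0]                           # keep the observed initial value
        for t in range(1, len(y)):
            # Rebuild the series recursively from the restricted estimates.
            y_star[t] = B[0] + y_star[t - 1] @ B[1:] + draws[t - 1]
        exceed += chow_stat(y_star, break_idx) >= stat
    return stat, (1 + exceed) / (1 + n_boot)
```

A call such as `bootstrap_chow_pvalue(simulate_var1(0.5 * np.eye(3), 60), break_idx=30)` returns the statistic together with its bootstrap p-value. The p-value is compared with the nominal level in place of an asymptotic chi-square critical value, which is the sense in which the bootstrap size-adjusts the test in this sketch.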