This work introduces an unsupervised approach to scene analysis and anomaly detection in traffic video captured by static surveillance cameras. A hybrid local-global scheme captures both local and global information: features are extracted in superpixel-generated spatiotemporal volumes, which are then merged into regions with dynamically varying boundaries. The shapes of the resulting regions vary according to the underlying motion in the scene, as captured by the superpixels. Representative descriptors are then calculated in these regions, and multiple local Hierarchical Dirichlet Process (HDP) models are deployed, one per region, for the unsupervised characterization of normal and "abnormal" events. Extracting meaningful descriptors in these regions, rather than over the whole frame, increases the resolution of the algorithm while avoiding noise-induced artifacts, and thus enables the detection of a wide range of "anomalies" at both local and global scales. Experiments on benchmark datasets covering various traffic scenarios demonstrate the method's efficacy and generality, with higher accuracy than the current State of the Art (SoA) at a lower computational cost. The systematic quantitative results and comparisons reported on these datasets also provide a baseline for future comparisons and improvements.
- Anomaly detection
- Traffic scenes