Detecting anomalous behaviour in audio-video sensor networks
This project investigates the identification and classification of
behaviours as normal or abnormal, safe or threatening, from an irregular
and often heterogeneous sensor network. Although some general principles
may be applied, there is unlikely to be a unified approach that can be
applied across different sensor domains; for example, tracking moving ‘blobs’
in radar or sonar data differs in dimensionality and in mathematical and applied
treatment from tracking human subjects in CCTV image data.
In this project we focus specifically on the problems of using
electro-optic (video, IR, LiDAR) and audio data to monitor behaviour and detect
anomalies, thereby addressing a number of specified areas of interest and challenges.
Our current work addresses the problem of statistical anomaly detection in video
pattern recognition. This can be applied to tracking people and objects in surveillance
networks; learning statistical representations of normality which evolve to produce
pattern-of-life estimates for targets; detecting deviations from normality and
using the spatial context of behaviour to filter out spurious anomalies. The same
context-driven approach can be applied to various domains, for example airport or city surveillance.
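The idea of an evolving model of normality that is conditioned on spatial context can be illustrated with a minimal sketch. It is not the project's actual representation: the choice of speed as the only feature, the grid of cells, the `CELL_SIZE` value, the z-score threshold and all names below are assumptions made for illustration. Each grid cell keeps a running mean and variance of observed speed (Welford's online algorithm), so the same speed can be normal in one place and anomalous in another.

```python
import math
from collections import defaultdict


class CellSpeedModel:
    """Running mean/variance of speed in one grid cell (Welford's algorithm)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, speed: float) -> None:
        self.n += 1
        delta = speed - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (speed - self.mean)

    def z_score(self, speed: float) -> float:
        if self.n < 2:
            return 0.0  # too little data to call anything abnormal
        var = self.m2 / (self.n - 1)
        return abs(speed - self.mean) / math.sqrt(var + 1e-9)


CELL_SIZE = 2.0  # metres; hypothetical grid resolution
models = defaultdict(CellSpeedModel)


def cell_of(x: float, y: float) -> tuple:
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))


def observe(x: float, y: float, speed: float, threshold: float = 3.0) -> bool:
    """Score an observation against its cell's model, then learn from it."""
    model = models[cell_of(x, y)]
    is_anomalous = model.z_score(speed) > threshold
    model.update(speed)  # the model keeps evolving with every observation
    return is_anomalous


# A concourse cell where people walk, and a distant cell where people run.
for s in [1.2, 1.4, 1.3, 1.5, 1.6, 1.4, 1.3]:
    observe(1.0, 1.0, s)
for s in [5.5, 6.2, 5.8, 6.5, 6.0]:
    observe(10.0, 10.0, s)

walker_flag = observe(1.0, 1.0, 6.0)    # running where people normally walk
runner_flag = observe(10.0, 10.0, 6.0)  # running where running is normal
```

In this toy setting the 6.0 m/s observation is flagged only in the walking cell, which is the sense in which spatial context can suppress spurious anomalies.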
Initial progress has focused on developing methods that incorporate contextual understanding
into low-level tracking algorithms. Specifically, we have focused on concourse protection where
the challenge is to identify a threat before it reaches the entry point of an asset (e.g.
checkpoint, forward operating base). We have developed a novel head-pose classifier for
surveillance video that is able to identify where a person is looking. Our approach has shown
encouraging results on two public video datasets (Benfold and Caviar), as shown below:
By integrating head-pose information from our classifier into a Kalman filter, our ‘Intentional Tracker’ uses this contextual prior when making predictions about target movement. We have demonstrated that using this intentional prior allows us to make better predictions of expected behaviour. This approach generalises to different types of target (e.g. vehicles using indicators) and to other spatio-temporal priors, which will become the focus of our work over the coming months.
Using contextual information, our algorithms construct better models of normal behaviour. This enables them to make more accurate predictions of expected behaviour, so that anomalous activity can be identified by its low probability under those predictions.
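The following is a minimal sketch of how a gaze direction could enter a Kalman filter and how low-probability motion could then be scored. It is not the published Intentional Tracker: it assumes a position-only state, isotropic noise (so the covariance reduces to one scalar per filter), and that head pose simply acts as a control input pointing along the expected direction of travel. The class name `GazeKalman2D` and all parameter values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GazeKalman2D:
    """Position-only Kalman filter with isotropic covariance (P = p * I)."""
    x: float          # estimated position, metres
    y: float
    p: float = 1.0    # state variance per axis
    q: float = 0.05   # process noise variance added per prediction step
    r: float = 0.25   # measurement noise variance per axis

    def predict(self, gaze_rad: float, speed: float, dt: float) -> None:
        # Assumption: the target moves in the direction it is looking,
        # so head pose is used as a control input on position.
        self.x += speed * dt * math.cos(gaze_rad)
        self.y += speed * dt * math.sin(gaze_rad)
        self.p += self.q

    def update(self, zx: float, zy: float) -> float:
        """Fuse a position measurement; return the innovation log-likelihood."""
        nx, ny = zx - self.x, zy - self.y   # innovation
        s = self.p + self.r                 # innovation variance per axis
        log_lik = -math.log(2.0 * math.pi * s) - (nx * nx + ny * ny) / (2.0 * s)
        k = self.p / s                      # Kalman gain
        self.x += k * nx
        self.y += k * ny
        self.p *= 1.0 - k
        return log_lik


def score_step(zx: float, zy: float) -> float:
    """Log-likelihood of one observed position for a target looking along +x."""
    kf = GazeKalman2D(x=0.0, y=0.0)
    kf.predict(gaze_rad=0.0, speed=1.4, dt=1.0)
    return kf.update(zx, zy)


expected = score_step(1.5, 0.1)     # target moved where it was looking
surprising = score_step(-1.5, 0.1)  # target moved against its gaze
```

Here `expected` is larger than `surprising`; thresholding the innovation log-likelihood is one simple way to turn "low probability under the prediction" into an anomaly flag.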
Publications:
- An adaptive motion model for person tracking with instantaneous head-pose features. R H Baxter, M J V Leach, S S Mukherjee, N M Robertson. IEEE Signal Processing Letters, Vol. 22, Iss. 5, pp. 578-582, 2015. DOI: 10.1109/LSP.2014.2364458
- Tracking with Intent. R H Baxter, M J V Leach, N M Robertson. Sensor Signal Processing for Defence, 2014. DOI: 10.1109/SSPD.2014.6943323
- Detecting Social Groups in Crowded Surveillance Videos Using Visual Attention. M J V Leach, R H Baxter, N M Robertson, E P Sparks. CVPR Workshop on Computational Models for Social Interaction and Behavior, 2014. DOI: 10.1109/CVPRW.2014.75