Vidient proposes a computer-aided video/audio surveillance system to detect, monitor, understand, and anticipate crowd behavior. Audio and video sensors were selected because they are low-cost, portable, and passive and, if mounted properly, are difficult to detect visually. A robust crowd detection and interpretation method is proposed that uses a Bayesian fusion framework to integrate multiple video/audio processing modules. The detection modules can adapt automatically to diverse environments and to time-varying changes such as daylight and weather conditions. Crowd behavior interpretation and prediction will be achieved by automating crowd models that are well understood in the field of psychology. Sensor inputs will be interpreted within the traditional framework of cognitive and physical realms. Psychological models that describe crowd behavior will be reconstituted and transformed for the automated video and audio surveillance domain. Crowd parameters, including crowd size, density, shape, location, mood, gender ratio, organization, and segmentation, will be extracted from video and audio signals. For example, a histogram of clothing types could be used to estimate the ratio of females to males, and the movement speed and direction of a crowd or its components may reveal the crowd's mood: panic, for instance, might trigger sudden crowd movements.
Keywords: Video Surveillance, Bayesian Fusion, Sensor Fusion, Crowd Modeling, Cognitive Model.
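
As a rough illustration of the Bayesian fusion idea named in the abstract, the sketch below combines the outputs of several detection modules into a single posterior probability that a crowd is present. It is not the proposed system's method: it assumes the modules are conditionally independent given the true state (a naive Bayes simplification), and the function name `fuse_detections`, the prior, and the likelihood values are all hypothetical.

```python
import math


def fuse_detections(prior, module_likelihoods):
    """Fuse evidence from several modules into P(crowd | all observations).

    Assumption (not from the source): modules are conditionally independent
    given the true state, so their likelihood ratios multiply.

    prior: prior probability that a crowd is present, strictly in (0, 1).
    module_likelihoods: iterable of pairs
        (P(observation | crowd), P(observation | no crowd)), both > 0.
    """
    # Work in log-odds so that many modules do not cause numeric underflow.
    log_odds = math.log(prior / (1.0 - prior))
    for p_given_crowd, p_given_no_crowd in module_likelihoods:
        log_odds += math.log(p_given_crowd / p_given_no_crowd)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical example: one video module and one audio module.
video_evidence = (0.8, 0.2)  # observation is 4x more likely if a crowd is present
audio_evidence = (0.6, 0.4)  # observation is 1.5x more likely if a crowd is present
posterior = fuse_detections(0.5, [video_evidence, audio_evidence])
print(round(posterior, 4))
```

With an even prior, the two likelihood ratios (4 and 1.5) multiply to posterior odds of 6 to 1, that is, a posterior of 6/7. A module whose observation is equally likely under both hypotheses leaves the posterior unchanged, which is one reason this form is convenient when some sensors are uninformative under poor lighting or weather.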