This study addressed the problem of detecting and recognizing, online, human interactions recorded by a network of multiple cameras.
Interactions were represented as temporal trajectories that couple the body motion of each individual with his/her proximity relationships to others, and with sound whenever available. Such trajectories were modeled with kernel state-space (KSS) models, which are well suited to online interaction detection, recognition, and the fusion of information from multiple cameras, while enabling a fast implementation based on online recursive updates. To compare interaction trajectories in the space of KSS models, the researchers designed pairwise kernels with a special symmetry. For detection, they exploited the geometry of linear operators in Hilbert space and extended to KSS models the concept of parity space, originally defined for linear models. For fusion, they combined KSS models with kernel-construction and multiview-learning techniques. The approach was evaluated on four publicly available single-view data sets, and the researchers also introduced, and will make public, a new challenging human-interaction data set collected with a network of three cameras. The results show that the approach holds promise as an effective building block for the real-time analysis of human behavior from multiple cameras. (Publisher abstract modified)
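The abstract does not specify the form of the pairwise kernels. As an illustration only of the kind of symmetry such kernels typically enforce, a common construction symmetrizes a base kernel over the two orderings of each interacting pair, so that similarity does not depend on which person in an interaction is listed first. This is a minimal sketch under that assumption, not the authors' actual kernel; the names `rbf` and `pairwise_kernel`, the RBF base kernel, and the averaging form are all illustrative choices.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    # Standard RBF base kernel between two trajectory feature vectors.
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def pairwise_kernel(pair1, pair2, base=rbf):
    # Illustrative symmetric pairwise kernel: average the base kernel
    # over both matchings of the two pairs, so swapping the roles of the
    # two individuals in either pair leaves the value unchanged.
    (a, b), (c, d) = pair1, pair2
    return 0.5 * (base(a, c) * base(b, d) + base(a, d) * base(b, c))
```

By construction, `pairwise_kernel((a, b), (c, d))` equals `pairwise_kernel((b, a), (c, d))`, which is the symmetry property needed when an interaction has no distinguished first participant.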