Clustering and Synchronizing Multi-Camera Video via Audio

Nicholas J. Bryan

Adobe Research

Paris Smaragdis

Adobe Research

Gautham Mysore

Adobe Research

Through the proliferation of smartphones and low-cost portable electronics, video and audio recording devices have become ubiquitous. As a result, tens, hundreds, or even thousands of people can simultaneously record a single moment in history, creating large collections of unorganized and unprocessed audio and video recordings. To properly play back, edit, and analyze these collections, distinct events must be identified and the recordings of each event must be time-synchronized. To address this problem, we employ landmark-based audio fingerprinting to both identify and synchronize multi-camera video recordings within a large collection of video and/or audio files. Compared to prior work, we offer improvements in event identification and a new synchronization refinement method that resolves inconsistent estimates and allows non-overlapping content to be synchronized within larger groups of recordings. The method is shown to be fast, accurate, and equivalent to an efficient and scalable time-difference-of-arrival method using cross-correlation performed on a non-linearly transformed signal.
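
As a rough illustration of the time-difference-of-arrival idea mentioned in the abstract, the sketch below estimates the offset between two recordings of the same sound by locating the peak of their cross-correlation. It is not the landmark-based method itself: it correlates raw samples directly, with no fingerprinting or non-linear transform, and the function name `estimate_offset` and the synthetic noise signal are assumptions made purely for illustration.

```python
import numpy as np


def estimate_offset(ref, other):
    """Estimate how many samples `other` lags behind `ref`.

    Hypothetical helper: plain cross-correlation on raw samples,
    standing in for the landmark-based correlation in the abstract.
    """
    # Remove the DC component so the peak reflects signal shape, not bias.
    ref = np.asarray(ref, dtype=float) - np.mean(ref)
    other = np.asarray(other, dtype=float) - np.mean(other)

    # Full cross-correlation covers every possible relative shift.
    xcorr = np.correlate(other, ref, mode="full")

    # Index i of the "full" output corresponds to lag i - (len(ref) - 1).
    lags = np.arange(-(len(ref) - 1), len(other))
    return int(lags[np.argmax(xcorr)])


# Synthetic example: the second "recording" starts 37 samples late.
rng = np.random.default_rng(0)
reference = rng.standard_normal(1000)
delayed = np.concatenate([np.zeros(37), reference])[:1000]
offset = estimate_offset(reference, delayed)
```

A positive result means the second recording started later than the first; dividing by the sample rate converts the lag to seconds. Direct correlation like this costs time quadratic in the signal length, which is one reason a compact fingerprint representation is attractive for large collections.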

Adobe MAX 2011

Project Publications

Clustering and Synchronizing Multi-camera Video via Landmark Cross-Correlation

Bryan, N., Smaragdis, P., Mysore, G. (Mar. 25, 2012)
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)