Interactive Sound Source Separation

Nicholas J. Bryan

Adobe Research

Gautham Mysore

Adobe Research

In applications such as audio denoising, music transcription, music remixing, and audio-based forensics, it is desirable to decompose a single-channel recording into its respective sources. One of the most effective classes of methods currently available for doing so is based on non-negative matrix factorization and related latent variable models. Such techniques, however, typically perform poorly when no isolated training data is given and do not allow user feedback to correct poor results. To overcome these issues, we allow a user to interactively constrain a latent variable model by painting on a time-frequency display of sound to guide the learning process. The annotations are used within the framework of posterior regularization to impose linear grouping constraints that would otherwise be difficult to achieve via standard priors. For the constraints considered, an efficient expectation-maximization algorithm is derived with closed-form multiplicative updates, drawing connections to non-negative matrix factorization methods and allowing for high-quality, interactive-rate separation without explicit training data.
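
To make the idea of annotation-guided multiplicative updates more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the ISSE implementation: the function name `separate_with_annotations`, the parameter `n_per_source`, the per-source penalty arrays, and the `exp(-penalty)` weighting of the posterior are all hypothetical choices made for this example. Each penalty array stands in for a painted layer on the time-frequency display, where a larger value means "this bin should not be explained by this source".

```python
import numpy as np


def separate_with_annotations(V, penalties, n_per_source=2, n_iter=50, seed=0):
    """Split a non-negative spectrogram V (freq x time) into len(penalties) sources.

    penalties[s] is a non-negative (freq x time) array for source s; larger
    values discourage source s from explaining that time-frequency bin.
    All names and the exp(-penalty) weighting are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    n_sources = len(penalties)
    n_comp = n_sources * n_per_source
    eps = 1e-12

    # W holds spectral basis vectors (columns sum to 1), H holds activations.
    W = rng.random((n_freq, n_comp)) + 1e-3
    W /= W.sum(axis=0, keepdims=True)
    H = rng.random((n_comp, n_time)) + 1e-3

    # Painted penalties become multiplicative weights in (0, 1].
    weights = [np.exp(-np.asarray(p, dtype=float)) for p in penalties]
    # Components are grouped by source: source s owns a contiguous slice.
    groups = [slice(s * n_per_source, (s + 1) * n_per_source)
              for s in range(n_sources)]

    def weighted_parts(W, H):
        # Penalty-weighted model of each source's contribution to every bin.
        return [m * (W[:, g] @ H[g, :]) for m, g in zip(weights, groups)]

    for _ in range(n_iter):
        # E-step folded into a ratio: V divided by the weighted mixture model.
        ratio = V / (sum(weighted_parts(W, H)) + eps)
        W_next = np.empty_like(W)
        H_next = np.empty_like(H)
        for m, g in zip(weights, groups):
            Q = m * ratio
            # M-step as multiplicative updates, both using the old W and H.
            W_next[:, g] = W[:, g] * (Q @ H[g, :].T)
            H_next[g, :] = H[g, :] * (W[:, g].T @ Q)
        W = W_next / (W_next.sum(axis=0, keepdims=True) + eps)
        H = H_next

    # Soft masks from the weighted posterior; the estimates sum back to V.
    parts = weighted_parts(W, H)
    total = sum(parts) + eps
    return [V * part / total for part in parts]


# Toy mixture: one source lives in the low bins, the other in the high bins.
toy_rng = np.random.default_rng(1)
activations = toy_rng.random((2, 12)) + 0.1
low_profile = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
high_profile = 1.0 - low_profile
V = np.outer(low_profile, activations[0]) + np.outer(high_profile, activations[1]) + 0.01

# "Paint" the high bins as not belonging to source 0.
penalties = [np.zeros_like(V), np.zeros_like(V)]
penalties[0][4:, :] = 10.0

estimates = separate_with_annotations(V, penalties)
```

In this sketch the annotations only reweight the posterior over components, so the updates keep the familiar multiplicative form of non-negative matrix factorization. A user could repaint the penalty arrays and rerun the loop to refine the result.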

Adobe MAX 2013

For more info, please see http://isse.sourceforge.net.

Project Publications

Interactive Refinement Of Supervised And Semi-Supervised Sound Source Separation Estimates

Bryan, N., Mysore, G. (May 1, 2013)
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

ISSE: An Interactive Source Separation Editor

Bryan, N., Mysore, G., Wang, G. (Apr. 26, 2014)
ACM Human Factors in Computing Systems (CHI)

Source Separation of Polyphonic Music With Interactive User-Feedback on a Piano Roll Display

Bryan, N., Mysore, G., Wang, G. (Nov. 4, 2013)
International Society for Music Information Retrieval Conference (ISMIR)