Seminar Details


Monday

Mar 26

3:30 pm

Part 1: Overview of Adaptive Algorithms Lab at the University of Toronto --- Part 2: ALGONQUIN: Fast Variational Methods for Robust Speech Recognition or Variational Methods Explained using Cartoons

Brendan Frey (Joint with Computer Science & Engine

Seminar

University of Waterloo - Department of Computer Science

Part 1: Overview of Adaptive Algorithms Lab at the University of Toronto.

I'll review the projects underway in my new group at the University of Toronto. These projects include Bayesian networks for video processing, codes on graphs and iterative algorithms, probabilistic phase unwrapping for MRI and SAR imaging, variational techniques for speech recognition in noisy environments, automated medical diagnosis and the open QMR network.


Part 2: ALGONQUIN: Fast Variational Methods for Robust Speech Recognition or Variational Methods Explained using Cartoons.

I show how variational techniques for inference in probability models of the clean speech, noise and channel can be used to denoise speech features, such as the spectrum, the log-spectrum and the cepstrum. For Wall Street Journal speech contaminated by the time-varying noise of an airplane engine shutting down, ALGONQUIN reduces the word error rate (WER) from 28.8% to 12.6%. This result is especially encouraging considering that a "standard" spectral subtraction method obtains a WER of 25%, while a "matched recognizer" (trained on the noisy data) obtains a WER of 9.7%, close to ALGONQUIN's WER. Recognition rates can be improved further by training the recognizer on data obtained by adding noise to clean speech and then denoising the data. Using this technique, for 10 dB of additive white noise, ALGONQUIN reduces the WER from 55.1% to 9.9%. This improves on the "matched recognizer", which obtains a WER of 14.0%.
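
To put the figures above on a common scale, the following minimal sketch computes the relative WER reduction for each condition. The function name and the dictionary layout are illustrative assumptions and not part of ALGONQUIN; only the WER values are taken from the abstract. Under this definition the airplane-noise case works out to a relative reduction of about 56%, and the 10 dB white-noise case to about 82%.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fraction of baseline word errors that were removed.

    Both arguments are word error rates in percent (e.g. 28.8 for 28.8%).
    This helper is a hypothetical illustration, not code from the talk.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# WER figures quoted in the abstract: (unprocessed noisy speech, ALGONQUIN).
reported = {
    "airplane engine noise": (28.8, 12.6),
    "10 dB additive white noise": (55.1, 9.9),
}

for condition, (noisy, denoised) in reported.items():
    reduction = relative_wer_reduction(noisy, denoised)
    print(f"{condition}: {noisy}% -> {denoised}% "
          f"({reduction:.1%} relative reduction)")
```

The relative measure is used here only because the two conditions start from very different baseline error rates; the abstract itself reports absolute WER values.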


Joint work with Li Deng, Alex Acero and David Heckerman; University of Waterloo, University of Illinois at Urbana and Microsoft Research.