Decoding is the task of applying speech recognition models to a speech stream (an audio file, microphone input, etc.). Two common recognizers for which models are readily available are Sphinx-4 and Pocketsphinx. English models ship with Sphinx; German models are available on request.

Your task is to understand and describe the architecture of the speech recognizer and to evaluate how decoder settings (search beam size, insertion probabilities, type of language model) affect recognition performance. No programming is required, but you need a good command of the command line and Linux/POSIX tools.
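
To compare decoder settings, recognition performance has to be expressed as a number. The task description does not name a metric; a common choice is the word error rate (WER), which is assumed here. The following is a minimal sketch: the function name `word_error_rate` and the example sentences are illustrative and not part of Sphinx-4 or Pocketsphinx. It counts the word-level substitutions, deletions and insertions needed to turn the reference transcript into the recognizer output, divided by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by
    the number of words in the reference transcript.

    Assumption: both strings are already normalized (same casing,
    no punctuation) and words are separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a six-word reference.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
```

Computing this value over the same test utterances for each beam size, insertion probability or language model gives comparable figures per setting. Note that WER can exceed 1.0 when the recognizer inserts many extra words.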

-- TimoBaumann - 06 Apr 2016
 