Decoding is the task of applying speech recognition models to a speech stream (an audio file, a microphone, ...). Sphinx-4 and Pocketsphinx are common recognizers for which we have models readily available (see here). Models for German are available on request; for English, use the models that come with Sphinx.

Your task is to understand and describe the speech recognizer architecture and to evaluate how decoder settings (search beam size, insertion probabilities, type of language model) affect recognition performance. No programming is required, but you need a good command of the command line and Linux/POSIX tools.
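
Recognition performance is commonly reported as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The task description does not prescribe a metric, so WER is an assumption here, and the function name below is illustrative only. A minimal sketch of how it could be computed for one utterance:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference,
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is missing from the hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Comparing this value across runs with different beam sizes, insertion probabilities or language models shows how each setting trades accuracy against search effort. In practice, existing scoring tools can be used instead, so writing such code is not required.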

-- TimoBaumann - 06 Apr 2016
