Pitch tracking is the task of determining the fundamental frequency of the speech sounds in a vocalisation. A speaker's pitch changes over time, and some stretches of speech are voiced (i.e., have pitch) while others are voiceless. Pitch tracking is useful for determining stress and intonation (i.e., the sentence-level prosody), as a basic feature for emotion recognition, or simply for distinguishing female from male speakers, among other things.
There are many pitch tracking algorithms (e.g. use the two algorithms that Wavesurfer provides from the snack library, or have a look at Praat), and the algorithms have various settings (e.g. expected pitch range). Your task is to evaluate the algorithms in a meaningful way. While there is a (small) database of manually tracked data (the Keele corpus, which will be made available upon request) and another corpus from the University of Graz with reference laryngogram data (for which pitch tracking can be considered very accurate), evaluation will also have to rely on a relative comparison of the algorithms' performance.
Some programming will be required, but conceptually, the project is relatively simple: compare the time-series produced by different trackers to a gold standard and/or to each other.
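As a rough illustration of such a comparison, here is a minimal Python sketch. It assumes that both tracks have already been resampled to the same frame times, that an F0 value of 0 marks an unvoiced frame, and that a deviation of more than 20% from the reference counts as a gross error. The function name compare_tracks, the tolerance parameter, and the example values are all hypothetical and not part of any tracker's API.

```python
def compare_tracks(reference, candidate, tolerance=0.2):
    """Compare two frame-aligned F0 tracks in Hz (0 = unvoiced frame).

    Assumption: both lists cover the same frame times, frame by frame.
    """
    if len(reference) != len(candidate):
        raise ValueError("tracks must have the same number of frames")

    voicing_errors = 0  # frames where only one track reports voicing
    both_voiced = 0     # frames where both tracks report a pitch
    gross_errors = 0    # both voiced, but F0 differs by more than tolerance

    for ref, cand in zip(reference, candidate):
        ref_voiced, cand_voiced = ref > 0, cand > 0
        if ref_voiced != cand_voiced:
            voicing_errors += 1
        elif ref_voiced:
            both_voiced += 1
            if abs(cand - ref) > tolerance * ref:
                gross_errors += 1

    n_frames = len(reference)
    return {
        "voicing_error_rate": voicing_errors / n_frames if n_frames else 0.0,
        "gross_pitch_error_rate": gross_errors / both_voiced if both_voiced else 0.0,
    }


# Made-up example values, one F0 value per frame.
reference = [0.0, 120.0, 122.0, 125.0, 0.0, 200.0]
candidate = [0.0, 121.0, 244.0, 0.0, 0.0, 198.0]
result = compare_tracks(reference, candidate)
print(result)
```

The same function can be used to compare two trackers to each other by passing one tracker's output as the reference. In that case the result depends on which track is treated as the reference, so it is only a relative measure of agreement.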
--
TimoBaumann - 06 Apr 2016