Pitch tracking is the task of determining the fundamental frequency (F0) of voiced speech sounds over time. A speaker's pitch changes over time, and some stretches of speech are voiced and therefore have pitch, while others are voiceless. Pitch tracking is useful for determining stress and intonation (i.e., sentence-level prosody), as a basic feature for emotion recognition, or simply for distinguishing female from male speakers, among other things.

There are many pitch tracking algorithms (e.g. the two algorithms from the Snack library that Wavesurfer provides, or those in Praat), and each algorithm has various settings (e.g. the expected pitch range). Your task is to evaluate these algorithms in a meaningful way. Reference data exists: a (small) database of manually tracked data (the Keele corpus, which will be made available upon request) and another corpus from the University of Graz with reference laryngogram data (for which pitch tracking can be considered very accurate). Even so, evaluation will also have to rely on relative comparison of the algorithms' performance.

Some programming will be required, but conceptually, the project is relatively simple: compare the time-series produced by different trackers to a gold standard and/or to each other.
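As a rough illustration of such a comparison, the following Python sketch scores one pitch track against another. It is only a starting point under several assumptions that are not part of the project description: both tracks are already sampled at the same frame times, unvoiced frames are encoded as 0 Hz, and a deviation of more than 20% counts as a gross error. The function name `compare_tracks` and the `tolerance` parameter are made up for this example. The second track can be a gold standard or simply the output of another tracker.

```python
import math


def compare_tracks(reference, candidate, tolerance=0.2):
    """Compare two frame-aligned F0 tracks given in Hz.

    Assumption: a value of 0 marks an unvoiced frame, and both tracks
    have one value per frame at identical time points.
    """
    if len(reference) != len(candidate):
        raise ValueError("tracks must have the same number of frames")

    voicing_errors = 0   # frames where only one track is voiced
    both_voiced = 0      # frames where both tracks report a pitch
    gross_errors = 0     # both voiced, but deviation exceeds the tolerance
    fine_cents = []      # absolute deviation (in cents) of the remaining frames

    for ref, cand in zip(reference, candidate):
        ref_voiced = ref > 0
        cand_voiced = cand > 0
        if ref_voiced != cand_voiced:
            voicing_errors += 1
        elif ref_voiced:
            both_voiced += 1
            if abs(cand - ref) > tolerance * ref:
                gross_errors += 1
            else:
                # 1200 cents correspond to one octave
                fine_cents.append(abs(1200.0 * math.log2(cand / ref)))

    total = len(reference)
    return {
        "voicing_error_rate": voicing_errors / total if total else 0.0,
        "gross_error_rate": gross_errors / both_voiced if both_voiced else 0.0,
        "mean_fine_error_cents": sum(fine_cents) / len(fine_cents) if fine_cents else 0.0,
    }


# Toy data, invented purely for illustration (not taken from any corpus).
reference = [0, 0, 100, 110, 120, 0]
candidate = [0, 100, 102, 220, 118, 0]
result = compare_tracks(reference, candidate)
print(result)
```

In the toy data, the second frame is a voicing disagreement and the fourth frame is an octave error, which is a typical failure mode of pitch trackers. Real tracker output will usually need to be resampled to a common frame rate before a frame-by-frame comparison like this makes sense.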

-- TimoBaumann - 06 Apr 2016
 