Sub-word based Language Modeling for Amharic

NatS Oberseminar, May 24, 2007, 14:15, F-235

A language model is a probability distribution q(s) over word sequences s that models how often each sequence s occurs as a sentence. Language models are widely used in speech and natural language processing, so good language models are important for improving the performance of many such systems. Our aim is to develop language models appropriate for morphologically rich languages such as Amharic, which suffer greatly from out-of-vocabulary words and data sparseness.
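To make the definition and the out-of-vocabulary problem concrete, here is a minimal sketch. It is not the setup used in the talk: the add-one smoothed bigram model, the English toy sentences, and the "+" notation for sub-word units are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Return log q(s) under an add-one smoothed bigram model."""
    vocab_size = len(unigrams) + 1  # one extra slot reserved for unseen tokens
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def oov_rate(train, test):
    """Fraction of test tokens that never occur in the training data."""
    vocab = {tok for sent in train for tok in sent}
    test_tokens = [tok for sent in test for tok in sent]
    return sum(tok not in vocab for tok in test_tokens) / len(test_tokens)


# Toy data (assumed, English for readability): the same sentences as whole
# words and as hypothetical sub-word units, with "+" marking a suffix.
train_words = [["the", "boys", "walked"], ["a", "girl", "walks"]]
test_words = [["the", "girls", "walked"]]

train_subwords = [["the", "boy", "+s", "walk", "+ed"], ["a", "girl", "walk", "+s"]]
test_subwords = [["the", "girl", "+s", "walk", "+ed"]]

word_oov = oov_rate(train_words, test_words)
subword_oov = oov_rate(train_subwords, test_subwords)

unigrams, bigrams = train_bigram(train_subwords)
test_log_prob = log_prob(test_subwords[0], unigrams, bigrams)
```

In this toy setting the unseen word form "girls" is out of vocabulary at the word level, while its sub-word units have all been seen in training. That is the intuition behind modeling sub-word units for a morphologically rich language.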

In this presentation, I will describe the experiment conducted on sub-word based language modeling for Amharic and discuss the results.

