Domain Adaptation in Machine Translation

since June 2013

Current state-of-the-art techniques for domain adaptation in statistical machine translation include mixture models that assign a lower weight to the training (out-of-domain) data and a higher weight to the test (in-domain) data. In addition, mining translations for unknown words by building dictionaries from various resources helps improve the translation system. A related problem in machine learning is transductive transfer learning, which learns a scoring function given two different domains and a single task. However, few techniques from the machine learning community have been carried over: methods such as bootstrapping and structural correspondence learning have been applied to tasks like parser adaptation and opinion mining adaptation, but not to translation adaptation. One reason lies in the mismatch between domain adaptation of statistical machine translation models and the transductive transfer learning setting, which presupposes a feature space and a label space.
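To illustrate the mixture-model idea mentioned above, the following is a minimal sketch (not the project's actual implementation; the function name and all probability values are hypothetical): two model scores for the same translation unit are linearly interpolated, with the interpolation weight favouring the in-domain (test-domain-like) model over the out-of-domain (training) model.

```python
def mixture_score(p_in_domain: float, p_out_domain: float, lam: float = 0.8) -> float:
    """Linearly interpolate two model probabilities.

    lam is the weight given to the in-domain model; the out-of-domain
    model receives the complementary weight (1 - lam).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_in_domain + (1.0 - lam) * p_out_domain


# Hypothetical example: a phrase pair scored by an in-domain and an
# out-of-domain model; the in-domain score dominates the mixture.
score = mixture_score(p_in_domain=0.30, p_out_domain=0.05, lam=0.8)
print(round(score, 3))  # 0.8 * 0.30 + 0.2 * 0.05 = 0.25
```

In practice the weight lam would be tuned on a held-out in-domain development set rather than fixed by hand.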

The main focus of this research is on fulfilling the theoretical conditions needed to apply transfer learning algorithms to domain adaptation in SMT, and on successfully implementing and applying these algorithms in a setting with a low-resource language (Romanian) and divergent domains (Biology and Geography).

Persons involved: Mirela-Stefania Duma (Ph.D. work), Cristina Vertan, Walther v. Hahn, Wolfgang Menzel