The evaluation of NLP systems is moving toward detailed standardization. Methods and techniques have been developed, but only for evaluating ready-to-sell products; the evaluation of a system that is still under development is not standardized, and not even ad hoc tools are available for this purpose. Evaluating VERBMOBIL required an adequate evaluation technique and a tool that could serve two needs: validating the system as a quasi-product and giving the developers useful feedback for further improvement of the system. This paper explains the methodological and technical choices that led to the implementation of a graphic evaluation tool (GET), discusses the GET, and presents the results gathered with it. The paper also discusses the complex problem of evaluating translations.