VerbmobilTreebank

Description

We can help you with treebanks for English and German (and, to some degree, for Japanese). They were developed in Tuebingen within the framework of Verbmobil, a speech-to-speech translation project. For this reason, the treebanks contain spontaneous speech data from three domains: scheduling of business appointments, travel scheduling, and hotel reservations.

The English treebank contains ca. 30,000 sentences and the German treebank ca. 38,000 sentences. The Japanese treebank is somewhat smaller, with ca. 18,000 sentences. The annotations for all treebanks cover the levels of morpho-syntax, syntactic phrase structure, and function-argument structure. The annotation schemes are purely context-free, i.e., they do not contain crossing branches or traces.
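To illustrate what "no crossing branches" means in practice, here is a minimal Python sketch. It does not read the actual treebank files: the `Node` class, the category and function labels, and the example sentence are all hypothetical and are not taken from the Verbmobil stylebooks. Each node carries a category (morpho-syntactic tag or phrase label) and a function label, and a tree counts as context-free here if every node covers a contiguous stretch of tokens.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One tree node; all labels used below are illustrative only."""
    label: str                       # POS tag for leaves, phrase category otherwise
    function: str                    # function label (hypothetical names)
    position: Optional[int] = None   # token index, set for leaves only
    word: Optional[str] = None       # surface form, set for leaves only
    children: List["Node"] = field(default_factory=list)


def token_positions(node: Node) -> List[int]:
    """Collect the token indices covered by a node."""
    if not node.children:
        return [node.position]
    positions: List[int] = []
    for child in node.children:
        positions.extend(token_positions(child))
    return positions


def has_crossing_branches(node: Node) -> bool:
    """True if some node covers a non-contiguous set of tokens."""
    positions = sorted(token_positions(node))
    start = positions[0]
    if positions != list(range(start, start + len(positions))):
        return True
    return any(has_crossing_branches(child) for child in node.children)


# "we meet Monday": every phrase covers adjacent tokens.
tree = Node("S", "ROOT", children=[
    Node("NP", "SUBJ", children=[Node("PRON", "HD", 0, "we")]),
    Node("VP", "HD", children=[
        Node("VERB", "HD", 1, "meet"),
        Node("NP", "MOD", children=[Node("NOUN", "HD", 2, "Monday")]),
    ]),
])

# A deliberately discontinuous phrase: it covers tokens 0 and 2 but not 1.
crossing = Node("S", "ROOT", children=[
    Node("XP", "MOD", children=[
        Node("PRON", "HD", 0, "we"),
        Node("NOUN", "HD", 2, "Monday"),
    ]),
    Node("VERB", "HD", 1, "meet"),
])

print(has_crossing_branches(tree))
print(has_crossing_branches(crossing))
```

In a treebank annotated as described above, a check of this kind should never find a discontinuous node, which is what makes the trees usable with standard context-free tools.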

Additionally, each treebank comes with an extensive stylebook that describes how different phenomena are annotated.

As the treebanks are only now becoming available (due to project restrictions), I am not sure what the license conditions for commercial use will be.

Contact

-- MichaelDaum - 04 Apr 2002
 