VerbmobilTreebank
Description
We can help you with treebanks for English and German (and, to some
degree, for Japanese). They were developed in Tuebingen in the framework
of Verbmobil, a speech-to-speech translation project. For this reason,
the treebanks contain spontaneous speech data from three domains:
scheduling of business appointments, travel scheduling, and hotel reservations.
The English treebank contains ca. 30,000 sentences, and the German treebank
ca. 38,000 sentences. The Japanese treebank is somewhat smaller; it
contains ca. 18,000 sentences. The annotations for all treebanks cover
the levels of morpho-syntax, syntactic phrase structure, and
function-argument structure. The annotation schemes are purely
context-free, i.e. they do not contain crossing branches or traces.
Additionally, for each treebank, there exists an extensive stylebook,
which describes how different phenomena are annotated.
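
To illustrate what "no crossing branches" means in practice, here is a
minimal sketch in Python. It does not read the treebank files; the tree
encoding (a dictionary from node names to children, where an integer is a
token position and a string is a child node), the function names, and the
toy trees are all assumptions made for this example. A tree passes the check
if every node dominates a contiguous span of tokens.

```python
from typing import Dict, List, Union

# Hypothetical encoding: an int is a token position, a str names a child node.
Child = Union[int, str]


def covered_positions(node: str, children: Dict[str, List[Child]]) -> List[int]:
    """Return the sorted token positions dominated by `node`."""
    positions: List[int] = []
    for child in children[node]:
        if isinstance(child, int):
            positions.append(child)
        else:
            positions.extend(covered_positions(child, children))
    return sorted(positions)


def is_context_free(children: Dict[str, List[Child]]) -> bool:
    """True if every node dominates a contiguous span of tokens.

    Assumes each token is attached to exactly one parent node.
    """
    for node in children:
        positions = covered_positions(node, children)
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


# Toy tree over four tokens (positions 0..3); every span is contiguous.
projective = {"S": [0, "VP"], "VP": [1, "PP"], "PP": [2, 3]}

# Same tokens, but VP dominates tokens 1 and 3 while token 2 hangs from S,
# so the branch to token 2 crosses the VP.
crossing = {"S": [0, "VP", 2], "VP": [1, 3]}

print(is_context_free(projective))
print(is_context_free(crossing))
```

A scheme that allows crossing branches would need an encoding like the
second toy tree; a purely context-free scheme only ever produces trees of
the first kind, which is why they can be written in plain bracket notation.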
As the treebanks are only becoming available now (due to project
restrictions), I am not sure what the license conditions for commercial
use will be.
Contact
-- MichaelDaum - 04 Apr 2002