Among many other things, the grammar is supposed to cover the entire lexicon
of modern German. This is obviously impossible for open word classes,
so templates must be used for unknown words. Nevertheless, it is
advantageous to treat as many words as possible as "known". Here are
some sources of freely available lexical information.
GermaNet: "GermaNet relates German nouns, verbs, and adjectives semantically by grouping words belonging to the same concept and by defining semantic relations between concepts. It has much in common with the English WordNet and might be viewed as an on-line thesaurus defining an explicit ontology."
Multext: "A series of projects whose goals are to develop standards and specifications for the encoding and processing of linguistic corpora". The project makes available a very complete annotated lexicon for German. Most valuable to us is a list of 23,000 German nouns, fully classified by declension. The project data are no longer hosted at the MULTEXT site, but can be downloaded here.
FBI-HH-B-243/02: Automatic Recognition and Morphological Classification of Unknown German Nouns (Preslav Nakov, Galia Angelova, Walther von Hahn). This report contains a list of about 2,000 patterns for guessing the declension class of German nouns.
GEONet Names Server: The National Imagery and Mapping Agency publishes data about every conceivable geographic feature on Earth, listing about 5,900,000 geographic names. This is the ultimate online source of place names.
Still missing: a good comprehensive source of first and last names.
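
The split between "known" words and template-based handling of unknown words can be illustrated with a minimal Python sketch. It looks a noun up in a small lexicon first and falls back to suffix patterns only if the lookup fails. Everything in it is an assumption made for illustration: the names KNOWN_NOUNS, SUFFIX_RULES, and classify_noun, the two lexicon entries, and the six suffix rules are hypothetical and are not taken from the MULTEXT lexicon or from the patterns in the report listed above.

```python
# Illustrative sketch only: the lexicon entries, suffix rules, and labels
# below are hypothetical and not taken from any of the listed resources.

# "Known" nouns: word -> (gender, plural formation)
KNOWN_NOUNS = {
    "Haus": ("neuter", "-er with umlaut"),
    "Frau": ("feminine", "-en"),
}

# Fallback patterns for unknown nouns: (suffix, gender, plural formation)
SUFFIX_RULES = [
    ("ung", "feminine", "-en"),
    ("heit", "feminine", "-en"),
    ("keit", "feminine", "-en"),
    ("schaft", "feminine", "-en"),
    ("chen", "neuter", "unchanged"),
    ("lein", "neuter", "unchanged"),
]


def classify_noun(word):
    """Return (status, gender, plural) for a German noun.

    status is "known" for lexicon hits, "guessed" when a suffix rule
    matched, and "unknown" when neither applies.
    """
    if word in KNOWN_NOUNS:
        gender, plural = KNOWN_NOUNS[word]
        return ("known", gender, plural)

    # Try longer suffixes first so the most specific rule wins.
    for suffix, gender, plural in sorted(SUFFIX_RULES, key=lambda r: -len(r[0])):
        if len(word) > len(suffix) and word.endswith(suffix):
            return ("guessed", gender, plural)

    return ("unknown", None, None)


if __name__ == "__main__":
    for noun in ["Haus", "Zeitung", "Freundschaft", "Xyz"]:
        print(noun, classify_noun(noun))
```

The lexicon is consulted before the patterns because a suffix guess can be wrong for individual words, which is one reason to treat as many words as possible as "known".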