CHRISTINE

Description

The new project aims to do for spoken English what SUSANNE did for written English. This includes the detailed annotation of grammar in the ordinary sense. It is clear that there are (at least) statistical differences between the ways in which speech and writing exploit the range of grammatical constructions provided by the language, and at present we have little hard evidence on the precise nature of these differences. But spoken language also has additional types of structural phenomena, not usually found in writing, for which new annotation standards are needed.

Probably most significant are speech management phenomena, whereby wording is edited 'on the fly': computer speech processing needs ways of distinguishing between the wording made obsolete by later edits and the wording which replaces it. Other structurally significant issues more or less peculiar to the speech mode are discourse items used to mark pragmatic force, and hesitation phenomena, whose incidence relative to surrounding structure is potentially an important cue for automatic analysis. Roger Moore of DRA Malvern has written of the 'overwhelming need for agreed standards of ... annotation ... [for] normal, everyday, non-prepared speech [which] is replete with repetition, false-starts, repairs, partial utterances, "uhms" and "errs" etc.'

Features

  • spontaneous language treebank
  • successor of SusanneCorpus
  • 20488 sentences, 89617 tokens
  • no crossing edges

Contact

-- MichaelDaum - 04 Apr 2002
 