Natural Language Systems, Department of Informatics, University of Hamburg

XML-based Integration of Natural Language Processing Components

Ulrich Schäfer, DFKI, Language Technology Lab, Saarbrücken, Germany

Abstract

While XML and its predecessor SGML have been used extensively for (offline) corpus annotation, a growing number of natural language processors now produce XML output online. This course focuses on the XML-based integration of NLP components, which can help to increase robustness and reduce ambiguity in natural language processing systems.

After a brief introduction to XML, Unicode, DTD and XML Schema, we will focus on technologies and applications for integrating the XML output of multiple NLP processors. We will study XML formats and integration issues for part-of-speech tagging, morpho-syntax, named entities, chunking, parsing, semantics and ontologies, including current standardization efforts (e.g. ISO, W3C), as well as general concepts such as standoff annotation and multi-dimensional markup.

We will then introduce XML integration and query languages such as XPath, XSLT and XQuery, and, as a practical exercise, use them to integrate real NLP markup. Finally, existing tools and architecture frameworks for XML NLP markup integration will be presented.
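
As a rough illustration of what such an integration exercise might look like, the following sketch merges the output of two components through shared character offsets, in the spirit of standoff annotation. Everything in it is an assumption made for illustration: the element and attribute names (`w`, `ne`, `start`, `end`, `pos`, `type`), the sample sentence and the function names are hypothetical and are not taken from the course material. It uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath, rather than XSLT or XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical POS tagger output; start/end are character offsets into
# the raw text "Ulrich Schäfer works at DFKI".
POS_XML = """<tokens>
  <w id="t1" start="0" end="6" pos="NNP">Ulrich</w>
  <w id="t2" start="7" end="14" pos="NNP">Schäfer</w>
  <w id="t3" start="15" end="20" pos="VBZ">works</w>
  <w id="t4" start="21" end="23" pos="IN">at</w>
  <w id="t5" start="24" end="28" pos="NNP">DFKI</w>
</tokens>"""

# Hypothetical named-entity recognizer output as standoff markup:
# it refers to the same text by offsets only and contains no text itself.
NE_XML = """<entities>
  <ne type="person" start="0" end="14"/>
  <ne type="organization" start="24" end="28"/>
</entities>"""


def proper_nouns(pos_xml):
    """Select tokens tagged NNP with an XPath attribute predicate."""
    root = ET.fromstring(pos_xml)
    return [w.text for w in root.findall("./w[@pos='NNP']")]


def merge_annotations(pos_xml, ne_xml):
    """Attach to each token the type of the entity span that covers it."""
    tokens = ET.fromstring(pos_xml)
    entities = ET.fromstring(ne_xml)
    merged = []
    for w in tokens.findall("./w"):
        start, end = int(w.get("start")), int(w.get("end"))
        ne_type = None
        for ne in entities.findall("./ne"):
            # A token belongs to an entity if its span lies inside the entity span.
            if int(ne.get("start")) <= start and end <= int(ne.get("end")):
                ne_type = ne.get("type")
                break
        merged.append((w.text, w.get("pos"), ne_type))
    return merged


if __name__ == "__main__":
    print(proper_nouns(POS_XML))
    for token, pos, ne_type in merge_annotations(POS_XML, NE_XML):
        print(token, pos, ne_type)
```

Because the two annotation layers share only character offsets, neither component needs to know the other's markup; this is one reason standoff annotation is attractive when combining independently developed processors.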

-- last modified CristinaVertan -- 05 Mar 2006
