|
|
University of Hamburg -
Computer Science -
NATS -
Ingo's home page
Programming stuff of Ingo Schröder |
||||||||||||||||||
|
Overview
This page is about some programming I did. Please note that the
status of the individual projects vary to a large degree. Some
are just hacked together while others are carefully
designed. Use at your own risk.
Constraint parserLast updated 2000/12/14 This is by far my largest software project; it's the most ambigous and the most challenging. It's also the most complicated. I started, designed and implemented major parts of the constraint parsing system of our project DAWAI. However, the system grew over time and a lot of people contributed to the code. Right now, we have more than 55,000 lines of C code and more than 27,000 lines of Tcl/Tk code (which I didn't write). Find more information on the software page of the DAWAI project.ICOPOST: POS taggersLast updated 2001/08/21 ICOPOST is a collection of free implementations of POS taggers. It has its own page.A container libraryLast updated 2000/12/14 A Container Library (ACL) is an off-spring of the CDG system. It is a C library for different data containers and consists of code covering the following items:
Non deterministic tokenizer for natural language parsingLast updated 1999/03/09 Nothing yet...An experimental travelling sales person problem solverLast updated 2001/02/21
tsp.c is a small C program to solve
euclidic travelling sales person problems. It has the following
search methods implemented: local search with node exchange,
local search with edge exchange, dpeth first systematic search,
nearest neighbor heuristics and successive extension
heuristics. It's not well commented or tested. Be warned.
Two small Perl scripts may help you while experimenting:
compute-distance.pl
computes the distance of a given path and
generate-euclidic-tsp.pl
can generate random instances of the euclidic TSP problem.
PP-attachment learnerLast updated 2000/12/14
This is something I did in 1998 in order to better understand
two approaches to the PP attachment problem:
transformation-based learning
and
back-off models.
In case you don't know the PP attachment problem it deals with
question how one can automatically decide the question where a
prepositional phrase should be attached to. For instance, in a
sentence like
He knocked the burglar with the telescope.
the prepositional phrase
with the telescope
can belong to the noun burglar or to verb
knocked.
The first program
pp-tbl.pl
re-implements the transformation-based learning method
described in the article
Eric Brill and Philip Resnik: A Rule-Based Approach To Prepositional Phrase Attachment Disambiguation, COLING 94It achieves an accuracy of 80.2%/79.2% (210/1780 rules) on the test set (500 instances) and 99.55% on the training set (12266 instances). The second program pp-backed-off.pl is simpler and is described in the article Michael John Collins and James Brooks, Prepositional Phrase Attachment through a Backed-Off Model, Proceedings of the 3rd Workshop on Very Large Corpora, 1995It achieves an accuracy of 81.2% on the test set and 99.55% on the training set. Both programs have been implemented in Perl with the emphasis on rapid prototyping and simplicity. TeX hyphenation algorithmLast updated 2002/03/08 (added archive only) This program (hyphenation-0.0.8) re-implements (in C) the (La)TeX hyphenation algorithm which is described in The TeXbook by Donald Knuth, Appendix H. Although simple, it works quite well and processes more than 15,000 words per second. I wrote this program just for myself. Don't expect well-documented source code or a user manual. Sorry. I'll try to answer mails though. Feel free to fix bugs ;-) Here're some difficult German examples:
Here's a link to a
page
where I describe an experiment in transformation-based learning
(see PP attachment)
of rules for hyphenation that we did with some students some
time ago.
Creating images from tree descriptionsLast updated 2001/08/24
Phrase structure trees for natural language utterances can
be visualized on demand of WWW requests.
The logical description of the trees takes the following
form (given in EBNF):
This script
tree.pl
can be used as a CGI program or from the shell
command line. It uses the Perl module
GD.pm
which in turn capsulates the
gd library.
The following call created the image
(PNG, GIF) to the right:
Creating images from NEGRA tree descriptionsLast updated 2001/08/24
The
NEGRA corpus
is a corpus of GERMAN newspaper text with a discontinuous phrase
structure annotation.
This script
negratree.pl
uses the Perl module
GD.pm
which in turn capsulates the
gd library
and creates small PNG images (such as the one to the right)
of the tree.
The following description belongs to the image
(PNG, GIF) to the right:
Wege NN -- NK 502 , $, -- -- 0 die PRELS -- SB 501 im APPRART -- AC 500 Blauen NN -- NK 500 enden VVFIN -- HD 501 #500 PP -- MO 501 #501 S -- RC 502 #502 NP -- -- 0 Converter from GIF to HTMLLast updated 2000/12/14
The program
gif2html.c
takes a GIF image as input and outputs an HTML page that looks like the
image. This
page (100KB)
was made from the image to the right.
Here's the usage information:
gif2html GIFFILE produces an HTML page with a giant table
that mimics the image and
gif2html GIFFILE TEXTFILE
uses colored characters from TEXTFILE instead of a table. Here's
an example for the
second form (500KB).
Fractal images toolkitLast updated 2000/12/14
I wrote a set of small programs that can generate different
fractal images.
The
following page
shows a zoom into the mandelbrot set as an animated GIF (ca. 1.1MB).
In order to create this I wrote a small C program
fractal-zoom.c
that computed a sequence of 600 images; you'll need the
GD
library. I then used
gifmerge
to combine the single images into an animated GIF.
Here's a link to my
Java fractal image generator.
Nine puzzle as CGI scriptLast updated 1996/09/21
Something wrong with these web pages? Please mail me. last revised 14/12/2000 Ingo Schröder http://nats-www.informatik.uni-hamburg.de/~ingo/prg/ |