This module offers an interface for an external chunker.
|
different modes the chunker can operate in.
|
|
chunk types.
Referenced by chunkerChunkTypeOfString(), and getFakeChunkType(). |
|
compute the chunks.
References CDG_ERROR, CDG_INFO, cdgPrintf(), GraphemNodeStruct::chunk, Chunk, Chunker, chunkerChunkDelete(), chunkerCloneChunk(), chunkerReplaceGraphemes(), LexemGraphStruct::chunks, ChunkerStruct::chunks, evalChunker(), EvalChunker, FakeChunker, getChunks(), getFakeChunks(), GraphemNode, ChunkerStruct::lg, ChunkerStruct::mode, ChunkStruct::nodes, NULL, RealChunker, and ChunkStruct::subChunks. Referenced by cmdChunk(), and cnTag(). |
|
chunkerChunkDelete: destroy a chunk and all its subchunks. Parameters: chunk = the chunk to be deallocated. Definition at line 504 of file chunker.c. References Chunk, chunkerChunkDelete(), ChunkStruct::nodes, and ChunkStruct::subChunks. Referenced by chunkerChunk(), chunkerChunkDelete(), getFakeChunksAt(), lgDelete(), and resetChunker().
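The recursive destruction described above can be sketched as a post-order traversal. This is a minimal sketch, not the code from chunker.c: `SketchChunk`, `sketchChunkNew()`, `sketchChunkDelete()` and the `freedChunks` counter are hypothetical names, and a plain array stands in for whatever list type ChunkStruct::subChunks really uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified chunk; the real ChunkStruct has more members. */
typedef struct SketchChunk {
    struct SketchChunk **subChunks; /* owned array of owned children */
    size_t nrSubChunks;
} SketchChunk;

static size_t freedChunks = 0; /* instrumentation for this sketch only */

/* Allocate a chunk with room for n children, all initially NULL. */
SketchChunk *sketchChunkNew(size_t n) {
    SketchChunk *c = malloc(sizeof *c);
    assert(c != NULL); /* a sketch; real code should report the failure */
    c->nrSubChunks = n;
    c->subChunks = n ? calloc(n, sizeof *c->subChunks) : NULL;
    assert(n == 0 || c->subChunks != NULL);
    return c;
}

/* Free the children first, then the container, then the chunk itself. */
void sketchChunkDelete(SketchChunk *chunk) {
    if (chunk == NULL) return;
    for (size_t i = 0; i < chunk->nrSubChunks; i++)
        sketchChunkDelete(chunk->subChunks[i]);
    free(chunk->subChunks);
    free(chunk);
    freedChunks++;
}
```

Deleting the root of a tree releases every chunk below it exactly once, so callers should only ever delete top-level chunks.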
|
return the chunk type denoted by a string. Definition at line 1513 of file chunker.c. References ChunkType, NChunk, NoChunk, PChunk, and VChunk. Referenced by getChunks().
|
chunkerCloneChunk: construct a copy of a given chunk including clones of subChunks. parameters: chunk = the original returns: the copy. Definition at line 477 of file chunker.c. References Chunk, chunkerCloneChunk(), ChunkStruct::from, ChunkStruct::head, newChunk(), ChunkStruct::nodes, NULL, ChunkStruct::parent, ChunkStruct::subChunks, ChunkStruct::to, and ChunkStruct::type. Referenced by chunkerChunk(), chunkerCloneChunk(), lgCopyTagScores(), and mergeChunk(). |
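A deep copy of this kind might look as follows. This is a sketch under assumptions: `SketchChunk` and `sketchCloneChunk()` are hypothetical, `from` and `to` are plain word indices instead of grapheme node references, and the clone's parent is left NULL for the caller to set, which the source does not confirm.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified chunk with a subset of the ChunkStruct members. */
typedef struct SketchChunk {
    int type;
    int from, to;                   /* word indices in this sketch */
    struct SketchChunk *parent;
    struct SketchChunk **subChunks; /* array of children */
    size_t nrSubChunks;
} SketchChunk;

/* Copy a chunk and, recursively, all of its subchunks. */
SketchChunk *sketchCloneChunk(const SketchChunk *chunk) {
    if (chunk == NULL) return NULL;
    SketchChunk *copy = malloc(sizeof *copy);
    assert(copy != NULL); /* a sketch; real code should report the failure */
    copy->type = chunk->type;
    copy->from = chunk->from;
    copy->to = chunk->to;
    copy->parent = NULL; /* assumption: the caller attaches the top-level clone */
    copy->nrSubChunks = chunk->nrSubChunks;
    copy->subChunks = NULL;
    if (chunk->nrSubChunks > 0) {
        copy->subChunks = malloc(chunk->nrSubChunks * sizeof *copy->subChunks);
        assert(copy->subChunks != NULL);
        for (size_t i = 0; i < chunk->nrSubChunks; i++) {
            copy->subChunks[i] = sketchCloneChunk(chunk->subChunks[i]);
            if (copy->subChunks[i] != NULL)
                copy->subChunks[i]->parent = copy; /* point at the clone, not the original */
        }
    }
    return copy;
}
```

The point of the sketch is that cloned children must refer to the cloned parent; copying the parent pointer verbatim would tie the copy to the original tree.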
|
validation command for chunkerCommand.
References CDG_ERROR, cdgFreeString(), cdgPrintf(), chunkerArgs, FALSE, NULL, and TRUE. Referenced by chunkerInitialize(). |
|
chunkerDelete: destroy the chunker representation. parameters: chunker = the object to be destructed Definition at line 439 of file chunker.c. References Chunker, and resetChunker(). Referenced by cmdChunk(), and cnTag(). |
|
finalize the chunker module. This is called by cdgFinalize. (No good module without a finalizer and an initializer.)
|
|
initialize the chunker module. This is called only once by cdgInitialize when the application starts up.
References chunkerCommand, chunkerCommandValidate(), chunkerMode, chunkerUseChunker, EvalChunker, FakeChunker, NULL, and RealChunker. Referenced by cdgInitialize(). |
|
construct a new chunker.
References ChunkerStruct::args, CDG_ERROR, cdgPrintf(), Chunker, chunkerArgs, chunkerCommand, chunkerMode, chunkerUseChunker, ChunkerStruct::chunks, DefaultChunker, FakeChunker, FALSE, initChunker(), ChunkerStruct::lg, ChunkerStruct::mode, ChunkerStruct::nrLevels, ChunkerStruct::nrWords, NULL, ChunkerStruct::parse, and ChunkerStruct::pid. Referenced by cmdChunk(), and cnTag(). |
|
print the chunks of a lattice.
References cdgPrintf(), and printChunk(). Referenced by cmdChunk(), cnPrint(), cnTag(), and lgPrint(). |
|
chunkerReplaceGraphemes: replace all grapheme references in a chunk with those given in a lexemgraph. parameters: chunk = the structure using the arcs lg = the lexemgraph using equivalent arcs Definition at line 1771 of file chunker.c. References Chunk, findGrapheme(), ChunkStruct::from, ChunkStruct::head, ChunkStruct::nodes, ChunkStruct::subChunks, and ChunkStruct::to. Referenced by chunkerChunk(), and lgCopyTagScores(). |
|
return the string representation of a chunk type. Definition at line 1530 of file chunker.c. References Chunk, NChunk, PChunk, ChunkStruct::type, and VChunk. Referenced by embedChunk(), evalTerm(), getChunks(), getFakeChunksAt(), mergeChunk(), postProcessChunks(), and printChunk(). |
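A mapping between chunk types and strings can be sketched as below, together with the reverse lookup that the name chunkerChunkTypeOfString() suggests. The enumerators mirror NChunk, PChunk, VChunk and NoChunk from the reference lists, but the `Sketch` names and the labels "N", "P" and "V" are assumptions; this page does not show the strings chunker.c uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the chunk type enumeration. */
typedef enum {
    SketchNChunk, SketchPChunk, SketchVChunk, SketchNoChunk
} SketchChunkType;

/* Type to string; the labels are assumed, not taken from chunker.c. */
const char *sketchStringOfChunkType(SketchChunkType type) {
    switch (type) {
    case SketchNChunk: return "N";
    case SketchPChunk: return "P";
    case SketchVChunk: return "V";
    default:           return "?";
    }
}

/* String to type; unknown strings map to SketchNoChunk. */
SketchChunkType sketchChunkTypeOfString(const char *s) {
    assert(s != NULL);
    if (strcmp(s, "N") == 0) return SketchNChunk;
    if (strcmp(s, "P") == 0) return SketchPChunk;
    if (strcmp(s, "V") == 0) return SketchVChunk;
    return SketchNoChunk;
}
```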
|
cmpArcs: return true if arc1 starts before arc2. Definition at line 525 of file chunker.c. Referenced by embedChunk().
|
cmpChunks: return true if c1 starts before c2. Definition at line 533 of file chunker.c. References GraphemNodeStruct::arc, Chunk, Chunker, and ChunkStruct::from. Referenced by embedChunk(), getChunks(), getFakeChunks(), getFakeChunksAt(), and mergeChunk(). |
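A boolean "starts before" predicate like this is typically handed to a sort routine. The following sketch assumes a hypothetical `SketchChunk` with an integer start position and uses a small insertion sort; the list sorting facility that chunker.c actually uses is not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chunk reduced to its span. */
typedef struct { int start; int end; } SketchChunk;

/* True if c1 starts strictly before c2. */
bool sketchCmpChunks(const SketchChunk *c1, const SketchChunk *c2) {
    assert(c1 != NULL && c2 != NULL);
    return c1->start < c2->start;
}

/* Insertion sort driven by the predicate; stable because the test is strict. */
void sketchSortChunks(SketchChunk *chunks, size_t n) {
    for (size_t i = 1; i < n; i++) {
        SketchChunk key = chunks[i];
        size_t j = i;
        while (j > 0 && sketchCmpChunks(&key, &chunks[j - 1])) {
            chunks[j] = chunks[j - 1];
            j--;
        }
        chunks[j] = key;
    }
}
```

Note that a predicate of this shape cannot be passed to the standard `qsort()`, which expects a three-way comparator returning a negative, zero or positive value.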
|
cmpGraphemes: return true if g1 starts before g2. Definition at line 517 of file chunker.c. References GraphemNodeStruct::arc, and GraphemNode. Referenced by mergeChunk().
|
check whether two chunks are isomorphic.
References GraphemNodeStruct::arc, Chunk, FALSE, ChunkStruct::from, ChunkStruct::subChunks, ChunkStruct::to, TRUE, and ChunkStruct::type. Referenced by evalChunker(). |
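Judging from the members referenced here (type, from, to and subChunks), an isomorphism test could be sketched as a recursive structural comparison. `SketchChunk` and `sketchCompareChunks()` are hypothetical, and comparing subchunks pairwise in order is an assumption about what "isomorphic" means in chunker.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified chunk. */
typedef struct SketchChunk {
    int type;
    int from, to;                   /* word indices in this sketch */
    struct SketchChunk **subChunks;
    size_t nrSubChunks;
} SketchChunk;

/* True if both chunks have the same type, span and subchunk structure. */
bool sketchCompareChunks(const SketchChunk *a, const SketchChunk *b) {
    assert(a != NULL && b != NULL);
    if (a->type != b->type || a->from != b->from || a->to != b->to)
        return false;
    if (a->nrSubChunks != b->nrSubChunks)
        return false;
    for (size_t i = 0; i < a->nrSubChunks; i++)
        if (!sketchCompareChunks(a->subChunks[i], b->subChunks[i]))
            return false;
    return true;
}
```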
|
count the number of chunks.
References Chunk, NChunk, PChunk, ChunkStruct::subChunks, ChunkStruct::type, and VChunk. Referenced by evalChunker(). |
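Counting chunks in a nested structure is a short recursion. The sketch below counts every chunk that carries a real type and descends into all subchunks; the names are hypothetical, and whether chunker.c counts per type or in total is not stated on this page.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SketchNChunk, SketchPChunk, SketchVChunk, SketchNoChunk
} SketchChunkType;

/* Hypothetical, simplified chunk. */
typedef struct SketchChunk {
    SketchChunkType type;
    struct SketchChunk **subChunks;
    size_t nrSubChunks;
} SketchChunk;

/* Count the N, P and V chunks in a list, including nested ones. */
size_t sketchCountChunks(SketchChunk *const *chunks, size_t n) {
    size_t total = 0;
    assert(n == 0 || chunks != NULL);
    for (size_t i = 0; i < n; i++) {
        const SketchChunk *c = chunks[i];
        if (c == NULL) continue;
        if (c->type != SketchNoChunk) total++;
        total += sketchCountChunks(c->subChunks, c->nrSubChunks);
    }
    return total;
}
```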
|
embedChunk: embed a chunk as a subchunk into the target chunk. Parameters: target = the resulting chunk; source = the chunk to be embedded. Returns: the target chunk. Definition at line 846 of file chunker.c. References GraphemNodeStruct::arc, CDG_DEBUG, cdgPrintf(), Chunk, Chunker, chunkerStringOfChunkType(), cmpArcs(), cmpChunks(), ChunkStruct::from, ChunkStruct::nodes, ChunkStruct::subChunks, and ChunkStruct::to. Referenced by getFakeChunksAt().
|
evaluate computed against annotated chunks.
References GraphemNodeStruct::arc, CDG_INFO, CDG_WARNING, cdgPrintf(), Chunk, Chunker, ChunkerStruct::chunks, compareChunks(), countChunks(), findChunk(), ChunkStruct::from, NChunk, NoChunk, NULL, PChunk, printChunk(), ChunkStruct::to, ChunkStruct::type, and VChunk. Referenced by chunkerChunk(). |
|
search for the chunk that spans the given indices.
References GraphemNodeStruct::arc, CDG_DEBUG, cdgPrintf(), Chunk, ChunkStruct::from, NULL, printChunk(), ChunkStruct::subChunks, and ChunkStruct::to. Referenced by evalChunker(), and getChunks(). |
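A span lookup of this kind can be sketched as a depth-first search that only descends into chunks able to contain the requested span. `SketchChunk` and `sketchFindChunk()` are hypothetical, spans are plain word indices, and requiring an exact match on both ends is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified chunk. */
typedef struct SketchChunk {
    int from, to;                   /* word indices in this sketch */
    struct SketchChunk **subChunks;
    size_t nrSubChunks;
} SketchChunk;

/* Return the chunk spanning exactly [from, to], or NULL if there is none. */
SketchChunk *sketchFindChunk(SketchChunk *const *chunks, size_t n, int from, int to) {
    assert(from <= to);
    for (size_t i = 0; i < n; i++) {
        SketchChunk *c = chunks[i];
        if (c == NULL) continue;
        if (c->from == from && c->to == to) return c;
        /* Descend only where the requested span fits inside this chunk. */
        if (c->from <= from && to <= c->to) {
            SketchChunk *found = sketchFindChunk(c->subChunks, c->nrSubChunks, from, to);
            if (found != NULL) return found;
        }
    }
    return NULL;
}
```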
|
find an equivalent grapheme in a given lexemgraph. parameters: lg = the lexemgraph arc = the arc (possibly not used in the lexemgraph) returns: an equivalent arc. Definition at line 1745 of file chunker.c. References GraphemNodeStruct::arc, GraphemNode, LexemGraphStruct::graphemnodes, and NULL. Referenced by chunkerReplaceGraphemes(). |
|
getCategories: get all POS-tags of undeleted lexem nodes parameter: gn = a lexem node returns: a list of POS tags. Definition at line 646 of file chunker.c. References CDG_DEBUG, cdgPrintf(), GraphemNode, LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, GraphemNodeStruct::lexemes, GraphemNodeStruct::lexemgraph, LexemNode, LexemNodeStruct::no, NULL, and LexemNodeStruct::tagscore. Referenced by getCategory(), and printChunk(). |
|
getCategory: get one POS-tag, warn if there is more than one. Parameter: gn = a grapheme. Returns: the first POS-tag available. Definition at line 614 of file chunker.c. References CDG_WARNING, cdgPrintf(), getCategories(), GraphemNode, and NULL. Referenced by getChunks(), postProcessChunks(), and printChunk().
|
this is the entry function to the real chunker.
References GraphemNodeStruct::arc, CDG_DEBUG, CDG_ERROR, CDG_WARNING, cdgFreeString(), cdgPrintf(), Chunk, Chunker, chunkerChunkTypeOfString(), chunkerStringOfChunkType(), cmpChunks(), findChunk(), ChunkStruct::from, getCategory(), GraphemNode, LexemGraphStruct::graphemnodes, ChunkStruct::head, ChunkerStruct::lg, newChunk(), ChunkStruct::nodes, NULL, ChunkerStruct::pipe1, ChunkerStruct::pipe2, ChunkStruct::subChunks, and ChunkStruct::to. Referenced by chunkerChunk(). |
|
this is the entry function to the fake chunker.
References Chunker, cmpChunks(), getFakeChunksAt(), NULL, parseGetRoots(), and postProcessChunks(). Referenced by chunkerChunk(). |
|
get the chunks under the given root node.
References GraphemNodeStruct::arc, CDG_DEBUG, cdgPrintf(), Chunk, Chunker, chunkerChunkDelete(), chunkerStringOfChunkType(), cmpChunks(), embedChunk(), ChunkStruct::from, getFakeChunkType(), GraphemNode, ChunkStruct::head, ChunkerStruct::mainlevel, mergeChunk(), NChunk, newChunk(), ChunkStruct::nodes, NULL, ChunkStruct::parent, ChunkerStruct::parse, parseGetGrapheme(), parseGetLabel(), PChunk, ChunkStruct::to, ChunkStruct::type, UnknownChunk, and VChunk. Referenced by getFakeChunks(). |
|
getFakeChunkType Definition at line 741 of file chunker.c. References Chunker, ChunkType, NChunk, NoChunk, parseGetCategory(), PChunk, UnknownChunk, and VChunk. Referenced by getFakeChunksAt(). |
|
initialize the chunker with the given data.
References CDG_ERROR, cdgPrintf(), Chunker, EvalChunker, FakeChunker, FALSE, initFakeChunker(), initRealChunker(), ChunkerStruct::mode, and RealChunker. Referenced by chunkerNew(). |
|
initialize a fake chunker with the given data.
References CDG_ERROR, cdgPrintf(), Chunker, FALSE, LexemGraphStruct::lattice, ChunkerStruct::lg, ChunkerStruct::mainlevel, ChunkerStruct::nrLevels, ChunkerStruct::nrWords, ChunkerStruct::parse, resetChunker(), and TRUE. Referenced by initChunker(). |
|
initialize a real chunker with the given data.
References ChunkerStruct::args, CDG_DEBUG, CDG_ERROR, cdgPrintf(), Chunker, FALSE, ChunkerStruct::mainlevel, ChunkerStruct::pid, ChunkerStruct::pipe1, ChunkerStruct::pipe2, and TRUE. Referenced by initChunker(). |
|
mergeChunk: add the source to the target chunk. The target chunk then spans the words of both chunks. Parameters: chunker = the current chunker; target = the resulting chunk; source = the chunk to be added to the target. Returns: the target chunk. Definition at line 805 of file chunker.c. References GraphemNodeStruct::arc, CDG_DEBUG, cdgPrintf(), Chunk, Chunker, chunkerCloneChunk(), chunkerStringOfChunkType(), cmpChunks(), cmpGraphemes(), ChunkStruct::from, ChunkStruct::nodes, ChunkStruct::subChunks, and ChunkStruct::to. Referenced by getFakeChunksAt().
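The span arithmetic behind such a merge is small enough to sketch. Only the widening of the target span is shown; the real function also references nodes and subChunks, which are left out here. `SketchChunk` and `sketchMergeChunk()` are hypothetical, and spans are plain word indices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk reduced to its span. */
typedef struct { int from; int to; } SketchChunk;

/* Widen target so that it covers the words of both chunks; return target. */
SketchChunk *sketchMergeChunk(SketchChunk *target, const SketchChunk *source) {
    assert(target != NULL && source != NULL);
    if (source->from < target->from) target->from = source->from;
    if (source->to > target->to) target->to = source->to;
    return target;
}
```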
|
construct a new chunk and initialize it.
References Chunk, ChunkStruct::from, ChunkStruct::head, ChunkStruct::nodes, NULL, ChunkStruct::parent, ChunkStruct::subChunks, ChunkStruct::to, and ChunkStruct::type. Referenced by chunkerCloneChunk(), getChunks(), and getFakeChunksAt(). |
|
parseGetCategory: get the POS-tag of a given word index parameters: chunker = the current chunker index = index of a word in the parse. returns: the POS-tag string or NULL if not defined Definition at line 694 of file chunker.c. References Chunker, LexemNodeStruct::lexem, LexemNode, NULL, and parseGetLevelValue(). Referenced by getFakeChunkType(). |
|
parseGetGrapheme: get the grapheme node of a given word index parameters: chunker = the current chunker index = index of a word in the parse. returns: the arc. Definition at line 720 of file chunker.c. References CDG_ERROR, cdgPrintf(), Chunker, GraphemNode, NULL, and parseGetLevelValue(). Referenced by getFakeChunksAt(). |
|
parseGetLabel: get the label of the dependency of a word (on the main level) parameters: chunker = the current chunker index = index of a word in the parse returns: the label of that dependency Definition at line 585 of file chunker.c. References Chunker, ChunkerStruct::mainlevel, ChunkerStruct::nrLevels, and ChunkerStruct::parse. Referenced by getFakeChunksAt(). |
|
parseGetLevelValue: get the dependency arc of a word. parameters: chunker = the current chunker index = index of a word in the parse. returns: the level value of this word Definition at line 600 of file chunker.c. References Chunker, ChunkerStruct::mainlevel, NULL, and ChunkerStruct::parse. Referenced by parseGetCategory(), and parseGetGrapheme(). |
|
parseGetModifiee: get the word this one is modifying (on the main level) parameters: chunker = the current chunker index = the modifier index in the word vector of the current parse returns: the modifiee index Definition at line 571 of file chunker.c. References Chunker, ChunkerStruct::mainlevel, ChunkerStruct::nrLevels, and ChunkerStruct::parse. Referenced by parseGetRoots(). |
|
parseGetRoots: get all unbound words (on the main level) parameters: chunker = the current chunker returns: a list of word indices or NULL if there are no root bindings (?) Note: you become the owner of the returned list container, so deallocate it after you've consumed the result. Definition at line 548 of file chunker.c. References Chunker, ChunkerStruct::nrWords, NULL, and parseGetModifiee(). Referenced by getFakeChunks(). |
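The ownership note above is the important part of this contract. The sketch below shows one way to honour it, assuming a hypothetical `modifiee` array in which a negative entry marks an unbound word, and a heap-allocated index array in place of the list container that chunker.c returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Collect the indices of all unbound words.
 * Assumption: modifiee[i] < 0 marks word i as a root.
 * Returns a malloc'ed array that the caller must free(), or NULL if
 * there are no roots (or allocation failed); *nrRoots receives the count. */
int *sketchGetRoots(const int *modifiee, size_t nrWords, size_t *nrRoots) {
    assert(nrRoots != NULL);
    assert(nrWords == 0 || modifiee != NULL);
    *nrRoots = 0;
    for (size_t i = 0; i < nrWords; i++)
        if (modifiee[i] < 0) (*nrRoots)++;
    if (*nrRoots == 0) return NULL;

    int *roots = malloc(*nrRoots * sizeof *roots);
    if (roots == NULL) { *nrRoots = 0; return NULL; }
    size_t k = 0;
    for (size_t i = 0; i < nrWords; i++)
        if (modifiee[i] < 0) roots[k++] = (int)i;
    return roots;
}
```

A caller consumes the result and then releases only the container, for example `free(roots)`, since the indices themselves own nothing.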
|
postProcessChunks: get rid of unwanted chunks. parameters: inputList = items to be filtered Definition at line 1091 of file chunker.c. References GraphemNodeStruct::arc, CDG_DEBUG, cdgPrintf(), Chunk, Chunker, chunkerStringOfChunkType(), ChunkStruct::from, getCategory(), NoChunk, ChunkStruct::to, ChunkStruct::type, and UnknownChunk. Referenced by getFakeChunks(). |
|
printChunk: print a single chunk and all its subchunks parameters: mode = print mode, e.g. CDG_INFO chunk = the chunk to be printed Definition at line 1423 of file chunker.c. References GraphemNodeStruct::arc, cdgPrintf(), Chunk, chunkerStringOfChunkType(), getCategories(), getCategory(), GraphemNode, ChunkStruct::head, NoChunk, ChunkStruct::nodes, NULL, ChunkStruct::subChunks, ChunkStruct::to, and ChunkStruct::type. Referenced by chunkerPrintChunks(), evalChunker(), and findChunk(). |
|
resetChunker: set the chunker in a state of innocence. parameters: chunker = the object of desire Definition at line 404 of file chunker.c. References ChunkerStruct::args, cdgFreeString(), Chunker, chunkerChunkDelete(), ChunkerStruct::chunks, ChunkerStruct::lg, ChunkerStruct::nrLevels, ChunkerStruct::nrWords, NULL, ChunkerStruct::parse, ChunkerStruct::pid, ChunkerStruct::pipe1, ChunkerStruct::pipe2, and terminateChild(). Referenced by chunkerDelete(), and initFakeChunker(). |
|
terminateChild: wait for the child with the specified pid to terminate. Return values: -1 = error; 0 = child died already; 1 = child died after SIGTERM; 2 = child died after SIGKILL. Definition at line 137 of file chunker.c. References CDG_ERROR, CDG_WARNING, cdgPrintf(), and NULL. Referenced by resetChunker().
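The return values listed above correspond to a common escalation pattern on POSIX systems: check whether the child is already gone, ask politely with SIGTERM, then force the issue with SIGKILL. The sketch below is one possible shape, not the code from chunker.c; `sketchTerminateChild()`, `sketchPollChild()` and the one-second grace period are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll the child up to `tries` times, 10 ms apart.
 * Returns 1 if it was reaped, 0 if it is still running, -1 on error. */
static int sketchPollChild(pid_t pid, int tries) {
    const struct timespec pause10ms = {0, 10 * 1000 * 1000};
    for (int i = 0; i < tries; i++) {
        pid_t r = waitpid(pid, NULL, WNOHANG);
        if (r == pid) return 1;
        if (r < 0) return -1;
        if (i + 1 < tries) nanosleep(&pause10ms, NULL);
    }
    return 0;
}

/* Returns -1 on error, 0 if the child had died already,
 * 1 if it died after SIGTERM, 2 if it died after SIGKILL. */
int sketchTerminateChild(pid_t pid) {
    assert(pid > 0);
    int r = sketchPollChild(pid, 1);
    if (r != 0) return r < 0 ? -1 : 0;

    if (kill(pid, SIGTERM) < 0) return -1;
    r = sketchPollChild(pid, 100); /* assumed grace period of about one second */
    if (r != 0) return r < 0 ? -1 : 1;

    if (kill(pid, SIGKILL) < 0) return -1;
    return waitpid(pid, NULL, 0) == pid ? 2 : -1; /* SIGKILL cannot be ignored */
}
```

Reaping the child with `waitpid()` in every branch matters as much as the signals: without it the terminated process would linger as a zombie.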
|
NULL terminated array of command arguments used for real chunking Definition at line 81 of file chunker.c. Referenced by chunkerCommandValidate(), and chunkerNew(). |
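A NULL-terminated argument array is the form that the POSIX `execvp()` call expects, and the pid, pipe1 and pipe2 members referenced elsewhere on this page suggest that the external chunker runs as a child process connected through two pipes. The sketch below shows that general pattern under those assumptions; `SketchChild` and `sketchSpawn()` are hypothetical and not taken from chunker.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for an external process. */
typedef struct {
    pid_t pid;
    int toChild;   /* write end, connected to the child's stdin */
    int fromChild; /* read end, connected to the child's stdout */
} SketchChild;

/* Start args[0] with the given NULL-terminated argument array.
 * Returns 0 on success, -1 on failure. */
int sketchSpawn(SketchChild *child, char *const args[]) {
    int in[2], out[2];
    assert(child != NULL && args != NULL && args[0] != NULL);
    if (pipe(in) < 0) return -1;
    if (pipe(out) < 0) { close(in[0]); close(in[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) {
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin and stdout, then run the command. */
        dup2(in[0], STDIN_FILENO);
        dup2(out[1], STDOUT_FILENO);
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        execvp(args[0], args);
        _exit(127); /* only reached if execvp() failed */
    }
    /* Parent: keep only the ends it uses. */
    close(in[0]);
    close(out[1]);
    child->pid = pid;
    child->toChild = in[1];
    child->fromChild = out[0];
    return 0;
}
```

A caller would write the input to `toChild`, close it to signal end of input, read the result from `fromChild`, and finally reap the process with `waitpid()`.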
|
string representation of the current command used for real chunking Definition at line 78 of file chunker.c. Referenced by chunkerInitialize(), and chunkerNew(). |
|
set the default chunker mode.
Referenced by chunkerInitialize(), and chunkerNew(). |
|
indicates whether the chunker is used or not. Definition at line 72 of file chunker.c. Referenced by chunkerInitialize(), and chunkerNew().