#include <lexemgraph.h>
Collaboration diagram for LexemGraphStruct:
The Vectors graphemnodes and nodes contain all grapheme nodes and all lexeme nodes, respectively. Grapheme nodes are an intermediary data structure between word arcs and lexeme nodes; they are not strictly necessary.
The fields min and max correspond to the fields with the same names in the Lattice.
The field distance holds an Array of the distances between any two lexeme nodes. The distance is measured in words, hence any two adjacent lexeme nodes have distance 1. The distance array is also used to check whether two lexeme nodes are compatible with each other, i.e. whether there is a path through the lexeme graph that includes them both.
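As a rough illustration of how a distance table can double as a compatibility check, the sketch below uses a plain two-dimensional C array instead of the library's Array type. The names TOY_NODES, NO_PATH, toy_distance and toy_compatible are hypothetical, and the convention of storing a sentinel for pairs that share no path is an assumption made for this example, not something lexemgraph.h is known to guarantee.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NODES 4
#define NO_PATH (-1) /* assumed sentinel: no path contains both nodes */

/* Hypothetical distances (in words) for a toy lexeme graph:
 * node 0 covers word 0, nodes 1 and 2 are alternative readings of word 1,
 * node 3 covers word 2. Adjacent nodes have distance 1. */
static const int toy_distance[TOY_NODES][TOY_NODES] = {
    {0, 1,       1,       2},
    {1, 0,       NO_PATH, 1},
    {1, NO_PATH, 0,       1},
    {2, 1,       1,       0},
};

/* Two nodes are compatible if some path through the graph includes both. */
static bool toy_compatible(int a, int b)
{
    assert(a >= 0 && a < TOY_NODES && b >= 0 && b < TOY_NODES);
    return toy_distance[a][b] != NO_PATH;
}
```

In this toy graph, toy_compatible(0, 3) holds, while the two competing readings of word 1 (nodes 1 and 2) are incompatible.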
The fields noOfPathsFromStart and noOfPathsToEnd hold arrays that map each grapheme node to the number of complete paths from the start or to the end of the entire graph that go through it. These counts are used to determine whether a lexeme node can be deleted.
The field noOfPaths holds the total number of paths from start to end possible in the word graph. This should reflect the state of the Vector isDeletedNode.
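The path counts can be pictured with a small dynamic-programming sketch. It is not the library's implementation: ToyNode, toy_nodes, toy_count_paths and toy_deletable are hypothetical names, nodes are reduced to a start and end position, and the rule that a node may be deleted only if at least one complete path survives is an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_N 5

typedef struct { int start, end; } ToyNode; /* hypothetical stand-in for a grapheme node */

/* Sorted by start position, so every predecessor precedes its successors. */
static const ToyNode toy_nodes[TOY_N] = {
    {0, 1}, {0, 1}, /* two readings of word 0 */
    {1, 2}, {1, 2}, /* two readings of word 1 */
    {2, 3},         /* single reading of word 2 */
};
static const int toy_min = 0, toy_max = 3;

static long long from_start[TOY_N], to_end[TOY_N];

/* Fills from_start[] and to_end[] and returns the total number of paths. */
static long long toy_count_paths(void)
{
    long long total = 0;
    for (int i = 0; i < TOY_N; i++) {           /* forward pass */
        from_start[i] = (toy_nodes[i].start == toy_min) ? 1 : 0;
        for (int p = 0; p < i; p++)
            if (toy_nodes[p].end == toy_nodes[i].start)
                from_start[i] += from_start[p];
    }
    for (int i = TOY_N - 1; i >= 0; i--) {      /* backward pass */
        to_end[i] = (toy_nodes[i].end == toy_max) ? 1 : 0;
        for (int s = i + 1; s < TOY_N; s++)
            if (toy_nodes[s].start == toy_nodes[i].end)
                to_end[i] += to_end[s];
        if (toy_nodes[i].start == toy_min)
            total += to_end[i];
    }
    return total;
}

/* Assumed rule: deleting node i is allowed only if some complete path remains. */
static bool toy_deletable(int i, long long total)
{
    assert(i >= 0 && i < TOY_N);
    return total - from_start[i] * to_end[i] > 0;
}
```

Here the product from_start[i] * to_end[i] is the number of complete paths through node i. The single reading of word 2 lies on every path, so under the assumed rule it cannot be deleted, whereas either reading of word 0 can.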
WARNING: The last three fields use GNU's long long int type as a cheap way to get 64-bit integers because the number of paths in realistic word graphs really does need that. However, the tool SWIG cannot deal with this type, and as a result XCDG cannot access the LexemGraph structure at all. Several functions in this library exist only to work around this restriction. Ultimately these fields should be converted to a proper Bigint type.
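To give a feel for why 32 bits are not enough, the following sketch counts the paths through a hypothetical lattice in which every word position has the same number of competing readings. The function toy_paths and its parameters are illustrative assumptions and not part of the library; the overflow check also shows that long long only postpones the problem that a Bigint type would remove.

```c
#include <assert.h>
#include <limits.h>

/* Number of paths through a hypothetical lattice with `positions` word slots
 * and `alternatives` competing readings in each slot. */
static long long toy_paths(int positions, int alternatives)
{
    assert(positions >= 0 && alternatives >= 0);
    long long paths = 1;
    for (int i = 0; i < positions; i++) {
        /* Refuse to continue if the next multiplication would overflow. */
        assert(alternatives == 0 || paths <= LLONG_MAX / alternatives);
        paths *= alternatives;
    }
    return paths;
}
```

With 20 positions and 4 readings each, toy_paths(20, 4) yields 4^20 = 1099511627776, which already exceeds the largest 32-bit signed value of 2147483647.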
The Vector isDeletedNode marks those lexeme nodes that have been deleted and should be ignored. For instance, when an LV is added to a partial solution, all lexeme nodes that are not compatible with its lexeme nodes, as determined by lgCompatibleNodes(), should be marked as deleted.
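The marking step described above might look roughly like the sketch below. It uses a plain byte array in place of the library's vector type and a hard-coded compatibility table in place of lgCompatibleNodes(); toy_compat, toy_deleted and toy_delete_incompatible are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NODES 4

/* Hypothetical compatibility table: nodes 1 and 2 are competing readings
 * of the same word and therefore exclude each other. */
static const bool toy_compat[TOY_NODES][TOY_NODES] = {
    {true, true,  true,  true},
    {true, true,  false, true},
    {true, false, true,  true},
    {true, true,  true,  true},
};

/* Stand-in for isDeletedNode: one flag per lexeme node, initially all 0. */
static unsigned char toy_deleted[TOY_NODES];

/* Marks every live node that is incompatible with `chosen` as deleted
 * and returns the number of nodes newly marked. */
static int toy_delete_incompatible(int chosen)
{
    assert(chosen >= 0 && chosen < TOY_NODES);
    int marked = 0;
    for (int n = 0; n < TOY_NODES; n++) {
        if (!toy_deleted[n] && !toy_compat[chosen][n]) {
            toy_deleted[n] = 1;
            marked++;
        }
    }
    return marked;
}
```

Committing to node 1 marks only node 2 as deleted; repeating the call marks nothing further, because already deleted nodes are skipped.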
Definition at line 72 of file lexemgraph.h.
Data Fields | |
List | chunks |
Array | distance |
Vector | graphemnodes |
ByteVector | isDeletedNode |
Lattice | lattice |
int | max |
int | min |
Vector | nodes |
long long | noOfPaths |
long long * | noOfPathsFromStart |
long long * | noOfPathsToEnd |
Vector | tags |
List LexemGraphStruct::chunks
set of all chunks of the lattice. Definition at line 87 of file lexemgraph.h. Referenced by chunkerChunk(), cnPrint(), lgClone(), lgCopyTagScores(), lgDelete(), lgNewInit(), and lgPrint().

Array LexemGraphStruct::distance
matrix of distances. Definition at line 81 of file lexemgraph.h. Referenced by lgClone(), lgComputeDistances(), lgDelete(), lgDistanceOfNodes(), lgNewInit(), and lgUpdateArcs().

Vector LexemGraphStruct::graphemnodes
vector of grapheme nodes. Definition at line 74 of file lexemgraph.h. Referenced by cmdDistance(), cnBuildIter(), cnBuildNodes(), cnBuildTriple(), cnGetGraphemNodeFromArc(), cnTag(), computeNoOfPathsFromStart(), computeNoOfPathsToEnd(), findGrapheme(), getChunks(), lgClone(), lgComputeDistances(), lgComputeNoOfPaths(), lgContains(), lgDelete(), lgMostProbablePath(), lgNewFinal(), lgNewInit(), and lgNewIter().

ByteVector LexemGraphStruct::isDeletedNode
vector of boolean flags. Definition at line 85 of file lexemgraph.h. Referenced by cnPrint(), cnPrintActiveLVs(), cnRenew(), getCategories(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgCopySelection(), lgDelete(), lgDeleteNode(), lgDeleteNodes(), lgIsDeletedNode(), lgNewFinal(), lgNewInit(), lgNewIter(), lgPrint(), and lgPrintNode().

Lattice LexemGraphStruct::lattice
underlying word graph. Definition at line 73 of file lexemgraph.h. Referenced by cnGetLattice(), cnPrintInfo(), initFakeChunker(), lgClone(), lgMostProbablePath(), lgNew(), lgNewFinal(), lgNewInit(), lgPrint(), lgSpuriousUppercase(), and lgUpdateArcs().

int LexemGraphStruct::max
maximum end position. Definition at line 77 of file lexemgraph.h. Referenced by cnBuildFinal(), cnBuildNodes(), cnBuildTriple(), cnTag(), computeNoOfPathsToEnd(), evalTerm(), lgClone(), lgIsEndNode(), lgMakePath(), lgNewFinal(), lgNewInit(), lgNewIter(), and lgWidth().

int LexemGraphStruct::min
minimum start position. Definition at line 76 of file lexemgraph.h. Referenced by cnBuildFinal(), cnTag(), computeNoOfPathsFromStart(), lgClone(), lgComputeNoOfPaths(), lgIsStartNode(), lgMakePath(), lgNewInit(), and lgNewIter().

Vector LexemGraphStruct::nodes
vector of lexeme nodes. Definition at line 75 of file lexemgraph.h. Referenced by cnBuildEdges(), cnBuildIter(), cnBuildNodes(), cnOptimizeNode(), cnRenew(), cnTag(), lgClone(), lgComputeNoOfPaths(), lgCopySelection(), lgCopyTagScores(), lgDelete(), lgMakePath(), lgNewInit(), lgNewIter(), lgPrint(), lgRequireLexeme(), lgRequireLexemes(), and lgWidth().

long long LexemGraphStruct::noOfPaths
total number of paths through graph. Definition at line 84 of file lexemgraph.h. Referenced by cnPrint(), cnTag(), lgAreDeletableNodes(), lgComputeNoOfPaths(), lgDeleteNode(), lgDeleteNodes(), and lgNewFinal().

long long * LexemGraphStruct::noOfPathsFromStart
vector of numbers of paths. Definition at line 82 of file lexemgraph.h. Referenced by cnOptimizeNode(), computeNoOfPathsFromStart(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgDelete(), and lgNewInit().

long long * LexemGraphStruct::noOfPathsToEnd
vector of numbers of paths. Definition at line 83 of file lexemgraph.h. Referenced by cnOptimizeNode(), computeNoOfPathsToEnd(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgDelete(), and lgNewInit().

Vector LexemGraphStruct::tags
set of POS tags. Definition at line 86 of file lexemgraph.h. Referenced by lgCopyTagScores(), lgDelete(), lgNewFinal(), and lgNewInit().