LexemGraphStruct Struct Reference

#include <lexemgraph.h>

Detailed Description

A LexemGraph is a word graph enriched with lexical information. In particular, it contains several lexemes for each arc of the Lattice that is lexically ambiguous. Each of these pairs of lexical entry and time span is called a lexeme node. The field lattice points to the underlying word graph.

The Vectors graphemnodes and nodes contain all grapheme nodes and all lexeme nodes, respectively. Grapheme nodes are an intermediary data structure between word arcs and lexeme nodes that is not strictly necessary.

The fields min and max correspond to the fields with the same names in the Lattice.

The field distance holds an Array of the distances between any two lexeme nodes. The distance is measured in words, hence any two adjacent lexeme nodes have distance 1. The distance array is also used to check whether two lexeme nodes are compatible with each other, i.e. whether there is a path through the lexeme graph that includes them both.
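The distance and compatibility checks described above can be illustrated with a minimal sketch. It is not the library's implementation: ToyNode, toy_compute_distances() and toy_compatible() are hypothetical names, a plain int matrix stands in for the Array, and the sentinel 0 for "no common path" is an assumption. The sketch also assumes that every node lies on some complete path, so that reachability between two nodes is enough to make them compatible.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 16
#define TOY_NO_PATH 0 /* assumed sentinel: no path contains both nodes */

/* Hypothetical stand-in for a lexeme node: one reading plus its time span. */
typedef struct {
    const char *reading;
    int start; /* position where the word begins */
    int end;   /* position where the word ends, start < end */
} ToyNode;

/* Fill dist[a][b] with the number of words from a to a later node b. */
void toy_compute_distances(const ToyNode *nodes, size_t n,
                           int dist[TOY_MAX_NODES][TOY_MAX_NODES])
{
    size_t a, b, k;
    assert(n <= TOY_MAX_NODES);

    /* Adjacent nodes (a ends where b starts) have distance 1. */
    for (a = 0; a < n; a++)
        for (b = 0; b < n; b++)
            dist[a][b] = (nodes[a].end == nodes[b].start) ? 1 : TOY_NO_PATH;

    /* Floyd-Warshall closure: keep the shortest chain of adjacent nodes. */
    for (k = 0; k < n; k++)
        for (a = 0; a < n; a++)
            for (b = 0; b < n; b++) {
                int via;
                if (dist[a][k] == TOY_NO_PATH || dist[k][b] == TOY_NO_PATH)
                    continue;
                via = dist[a][k] + dist[k][b];
                if (dist[a][b] == TOY_NO_PATH || via < dist[a][b])
                    dist[a][b] = via;
            }
}

/* Two distinct nodes are compatible if one can be reached from the other. */
int toy_compatible(int dist[TOY_MAX_NODES][TOY_MAX_NODES], size_t a, size_t b)
{
    if (a == b)
        return 1;
    return dist[a][b] != TOY_NO_PATH || dist[b][a] != TOY_NO_PATH;
}
```

In this sketch, two readings of the same word share a time span, so neither can be reached from the other and they come out as incompatible.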

The fields noOfPathsFromStart and noOfPathsToEnd hold arrays that map each grapheme node to the number of complete paths from the start or to the end of the entire graph that go through it. This is used to determine whether a lexeme node can be deleted or not.

The field noOfPaths holds the total number of possible paths from start to end in the word graph. This count should reflect the state of the ByteVector isDeletedNode.
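One way such path counts can be obtained is a forward and a backward pass over the positions of the graph. The following is a minimal sketch under stated assumptions, not the code of lgComputeNoOfPaths(): ToyArc, toy_count_paths() and toy_is_deletable() are hypothetical names, deleted nodes are not modelled, and the deletion criterion (at least one complete path must survive) is an assumption about how the counts are used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a grapheme node: only its time span matters here. */
typedef struct {
    int start;
    int end;
} ToyArc;

/* Fill from_start[i] and to_end[i] for every arc and return the total
 * number of complete paths from position min to position max. */
long long toy_count_paths(const ToyArc *arcs, size_t n, int min, int max,
                          long long *from_start, long long *to_end)
{
    size_t i, j;
    int pos;
    long long total = 0;

    for (i = 0; i < n; i++) {
        assert(min <= arcs[i].start && arcs[i].start < arcs[i].end
               && arcs[i].end <= max);
        from_start[i] = (arcs[i].start == min) ? 1 : 0;
        to_end[i] = (arcs[i].end == max) ? 1 : 0;
    }

    /* Forward pass: an arc inherits the counts of all arcs ending where it starts. */
    for (pos = min + 1; pos < max; pos++)
        for (i = 0; i < n; i++)
            if (arcs[i].start == pos)
                for (j = 0; j < n; j++)
                    if (arcs[j].end == pos)
                        from_start[i] += from_start[j];

    /* Backward pass: an arc inherits the counts of all arcs starting where it ends. */
    for (pos = max - 1; pos > min; pos--)
        for (i = 0; i < n; i++)
            if (arcs[i].end == pos)
                for (j = 0; j < n; j++)
                    if (arcs[j].start == pos)
                        to_end[i] += to_end[j];

    /* Every complete path begins with exactly one arc starting at min. */
    for (i = 0; i < n; i++)
        if (arcs[i].start == min)
            total += to_end[i];
    return total;
}

/* Assumed criterion: an arc may go if at least one complete path avoids it. */
int toy_is_deletable(long long total, long long from_start, long long to_end)
{
    return total - from_start * to_end > 0;
}
```

The product from_start[i] * to_end[i] is the number of complete paths through arc i, so an arc that every path must use is never deletable in this sketch.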

WARNING: The last three fields use GNU's long long int type as a cheap way to get 64-bit integers because the number of paths in realistic word graphs really does need that. However, the tool SWIG cannot deal with this type, and as a result XCDG cannot access the LexemGraph structure at all. Several functions in this library exist only to work around this restriction. Ultimately these fields should be converted to a proper Bigint type.
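A rough illustration of why a 32-bit counter is too small: in a graph where every position offers the same number of alternative lexemes, the number of paths is that number raised to the power of the sentence length. The function toy_chain_paths() below is a hypothetical helper written for this illustration only; it is not part of the library.

```c
#include <assert.h>
#include <limits.h>

/* Number of paths through a chain of `positions` words when every word
 * has `alternatives` competing readings. */
long long toy_chain_paths(int positions, int alternatives)
{
    long long paths = 1;
    int i;

    assert(positions >= 0 && alternatives >= 1);
    for (i = 0; i < positions; i++) {
        /* Stop rather than overflow the 64-bit counter. */
        assert(paths <= LLONG_MAX / alternatives);
        paths *= alternatives;
    }
    return paths;
}
```

With only two readings per word, a 40-word chain already has 2^40 paths, which is far beyond the range of a 32-bit int.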

The ByteVector isDeletedNode marks those lexeme nodes that have been deleted and should be ignored. For instance, when an LV is added to a partial solution, all lexeme nodes that are not compatible with its lexeme nodes, as determined by lgCompatibleNodes(), should be marked as deleted.
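The marking step can be sketched as follows. This is an illustration under assumptions, not the library's lgDeleteNodes(): toy_mark_incompatible() is a hypothetical name, a plain unsigned char array stands in for the ByteVector, and compatibility is supplied as a precomputed flat n-by-n matrix instead of a call to lgCompatibleNodes().

```c
#include <assert.h>
#include <stddef.h>

/* Mark every node that is incompatible with at least one selected node.
 * compat[v * n + s] is nonzero if nodes v and s can share a path.
 * Returns the number of nodes newly marked as deleted. */
size_t toy_mark_incompatible(const unsigned char *compat, size_t n,
                             const size_t *selected, size_t n_selected,
                             unsigned char *is_deleted)
{
    size_t v, s;
    size_t newly_deleted = 0;

    for (v = 0; v < n; v++) {
        if (is_deleted[v])
            continue; /* already ignored, nothing to do */
        for (s = 0; s < n_selected; s++) {
            assert(selected[s] < n);
            if (!compat[v * n + selected[s]]) {
                is_deleted[v] = 1;
                newly_deleted++;
                break;
            }
        }
    }
    return newly_deleted;
}
```

Nodes that are already marked keep their flag, so the function can be applied repeatedly as a partial solution grows.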

Definition at line 72 of file lexemgraph.h.

Data Fields

List chunks
Array distance
Vector graphemnodes
ByteVector isDeletedNode
Lattice lattice
int max
int min
Vector nodes
long long noOfPaths
long long * noOfPathsFromStart
long long * noOfPathsToEnd
Vector tags
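For orientation, the fields can be arranged in the order suggested by the definition line numbers reported in the field documentation (lines 73 to 87 of lexemgraph.h). The sketch below is a reconstruction, not a copy of the header: the Toy typedefs are placeholders for the library's own Lattice, Vector, Array, ByteVector and List types, LexemGraphSketch and sketch_reset() are hypothetical names, and whatever stands on header lines 78 to 80 is not documented here and is therefore omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder types (assumptions): the real ones come from the library. */
typedef void *ToyLattice;
typedef void *ToyVector;
typedef void *ToyArray;
typedef void *ToyByteVector;
typedef void *ToyList;

typedef struct {
    ToyLattice lattice;            /* underlying word graph */
    ToyVector graphemnodes;        /* vector of grapheme nodes */
    ToyVector nodes;               /* vector of lexeme nodes */
    int min;                       /* minimum start position */
    int max;                       /* maximum end position */
    ToyArray distance;             /* matrix of distances */
    long long *noOfPathsFromStart; /* path counts from the start */
    long long *noOfPathsToEnd;     /* path counts to the end */
    long long noOfPaths;           /* total number of paths through graph */
    ToyByteVector isDeletedNode;   /* flags for deleted lexeme nodes */
    ToyVector tags;                /* set of POS tags */
    ToyList chunks;                /* set of all chunks of the lattice */
} LexemGraphSketch;

/* Put a sketch structure into a well-defined empty state. */
void sketch_reset(LexemGraphSketch *lg)
{
    assert(lg != NULL);
    *lg = (LexemGraphSketch){0}; /* null pointers, zero counts */
}
```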


Field Documentation

List LexemGraphStruct::chunks

Set of all chunks of the lattice.

Definition at line 87 of file lexemgraph.h.

Referenced by chunkerChunk(), cnPrint(), lgClone(), lgCopyTagScores(), lgDelete(), lgNewInit(), and lgPrint().

Array LexemGraphStruct::distance

Matrix of distances between lexeme nodes.

Definition at line 81 of file lexemgraph.h.

Referenced by lgClone(), lgComputeDistances(), lgDelete(), lgDistanceOfNodes(), lgNewInit(), and lgUpdateArcs().

Vector LexemGraphStruct::graphemnodes

Vector of grapheme nodes.

Definition at line 74 of file lexemgraph.h.

Referenced by cmdDistance(), cnBuildIter(), cnBuildNodes(), cnBuildTriple(), cnGetGraphemNodeFromArc(), cnTag(), computeNoOfPathsFromStart(), computeNoOfPathsToEnd(), findGrapheme(), getChunks(), lgClone(), lgComputeDistances(), lgComputeNoOfPaths(), lgContains(), lgDelete(), lgMostProbablePath(), lgNewFinal(), lgNewInit(), and lgNewIter().

ByteVector LexemGraphStruct::isDeletedNode

Vector of boolean flags marking deleted lexeme nodes.

Definition at line 85 of file lexemgraph.h.

Referenced by cnPrint(), cnPrintActiveLVs(), cnRenew(), getCategories(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgCopySelection(), lgDelete(), lgDeleteNode(), lgDeleteNodes(), lgIsDeletedNode(), lgNewFinal(), lgNewInit(), lgNewIter(), lgPrint(), and lgPrintNode().

Lattice LexemGraphStruct::lattice

Underlying word graph.

Definition at line 73 of file lexemgraph.h.

Referenced by cnGetLattice(), cnPrintInfo(), initFakeChunker(), lgClone(), lgMostProbablePath(), lgNew(), lgNewFinal(), lgNewInit(), lgPrint(), lgSpuriousUppercase(), and lgUpdateArcs().

int LexemGraphStruct::max

Maximum end position.

Definition at line 77 of file lexemgraph.h.

Referenced by cnBuildFinal(), cnBuildNodes(), cnBuildTriple(), cnTag(), computeNoOfPathsToEnd(), evalTerm(), lgClone(), lgIsEndNode(), lgMakePath(), lgNewFinal(), lgNewInit(), lgNewIter(), and lgWidth().

int LexemGraphStruct::min

Minimum start position.

Definition at line 76 of file lexemgraph.h.

Referenced by cnBuildFinal(), cnTag(), computeNoOfPathsFromStart(), lgClone(), lgComputeNoOfPaths(), lgIsStartNode(), lgMakePath(), lgNewInit(), and lgNewIter().

Vector LexemGraphStruct::nodes

Vector of lexeme nodes.

Definition at line 75 of file lexemgraph.h.

Referenced by cnBuildEdges(), cnBuildIter(), cnBuildNodes(), cnOptimizeNode(), cnRenew(), cnTag(), lgClone(), lgComputeNoOfPaths(), lgCopySelection(), lgCopyTagScores(), lgDelete(), lgMakePath(), lgNewInit(), lgNewIter(), lgPrint(), lgRequireLexeme(), lgRequireLexemes(), and lgWidth().

long long LexemGraphStruct::noOfPaths

Total number of paths through the graph.

Definition at line 84 of file lexemgraph.h.

Referenced by cnPrint(), cnTag(), lgAreDeletableNodes(), lgComputeNoOfPaths(), lgDeleteNode(), lgDeleteNodes(), and lgNewFinal().

long long* LexemGraphStruct::noOfPathsFromStart

Array of path counts from the start of the graph, one entry per grapheme node.

Definition at line 82 of file lexemgraph.h.

Referenced by cnOptimizeNode(), computeNoOfPathsFromStart(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgDelete(), and lgNewInit().

long long* LexemGraphStruct::noOfPathsToEnd

Array of path counts to the end of the graph, one entry per grapheme node.

Definition at line 83 of file lexemgraph.h.

Referenced by cnOptimizeNode(), computeNoOfPathsToEnd(), lgAreDeletableNodes(), lgClone(), lgComputeNoOfPaths(), lgDelete(), and lgNewInit().

Vector LexemGraphStruct::tags

Set of POS tags.

Definition at line 86 of file lexemgraph.h.

Referenced by lgCopyTagScores(), lgDelete(), lgNewFinal(), and lgNewInit().


The documentation for this struct was generated from the following file:

lexemgraph.h

CDG 0.95 (20 Oct 2004)