|
computes LexemGraph::noOfPathsFromStart. This function computes the number of paths leading to gn from the start of lg. If gn corresponds to a start node, this is simply the number of lexeme nodes sprung from gn. Otherwise it is that number multiplied by the sum of the numbers of paths leading from the start to the immediately preceding grapheme nodes. If gn is deleted, the number is always zero. Definition at line 128 of file lexemgraph.c. References GraphemNodeStruct::arc, CDG_ERROR, cdgPrintf(), GraphemNode, LexemGraphStruct::graphemnodes, GraphemNodeStruct::live, LexemGraphStruct::min, GraphemNodeStruct::no, and LexemGraphStruct::noOfPathsFromStart. Referenced by lgComputeNoOfPaths(). |
|
computes LexemGraph::noOfPathsToEnd. This function computes the number of paths leading from gn to the end of lg. If gn corresponds to an end node, this is simply the number of lexeme nodes sprung from gn. Otherwise it is that number multiplied by the sum of the numbers of paths leading to the end from the immediately following grapheme nodes. If gn is deleted, the number is always zero. Definition at line 180 of file lexemgraph.c. References GraphemNodeStruct::arc, CDG_ERROR, cdgPrintf(), GraphemNode, LexemGraphStruct::graphemnodes, GraphemNodeStruct::live, LexemGraphStruct::max, GraphemNodeStruct::no, and LexemGraphStruct::noOfPathsToEnd. Referenced by lgComputeNoOfPaths(). |
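The two recursions described for noOfPathsFromStart and noOfPathsToEnd can be illustrated with a minimal sketch. The type ToyGrapheme and both function names are hypothetical and not part of lexemgraph.c. The sketch assumes that every node satisfies from < to and that a node immediately precedes another when its end time equals the other's start time. It recomputes values recursively, whereas the documented functions store their results in the LexemGraph.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified grapheme node: a word hypothesis spanning the
 * time points from..to, with `live` undeleted lexeme readings. */
typedef struct {
    int from;
    int to;
    int live;
} ToyGrapheme;

/* Number of paths from the start of the graph up to and including node i. */
static long paths_from_start(const ToyGrapheme *g, size_t n, int min, size_t i)
{
    assert(i < n && g[i].from < g[i].to);
    if (g[i].live == 0)
        return 0;                      /* a deleted node carries no paths */
    if (g[i].from == min)
        return g[i].live;              /* start node: one path per reading */
    long sum = 0;
    for (size_t j = 0; j < n; j++)
        if (g[j].to == g[i].from)      /* j immediately precedes i */
            sum += paths_from_start(g, n, min, j);
    return g[i].live * sum;
}

/* Number of paths from node i (inclusive) to the end of the graph. */
static long paths_to_end(const ToyGrapheme *g, size_t n, int max, size_t i)
{
    assert(i < n && g[i].from < g[i].to);
    if (g[i].live == 0)
        return 0;
    if (g[i].to == max)
        return g[i].live;              /* end node: one path per reading */
    long sum = 0;
    for (size_t j = 0; j < n; j++)
        if (g[j].from == g[i].to)      /* j immediately follows i */
            sum += paths_to_end(g, n, max, j);
    return g[i].live * sum;
}
```

For a lattice with a two-reading word over 0..1, a three-reading word over 1..2 and a one-reading word over 0..2, the second word is reached by 2 * 3 = 6 paths from the start.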
|
Clone a grapheme node. The field GraphemNode::lexemes is not set; the caller has to do that. (The two-way links between grapheme nodes and lexeme nodes can be set more easily when all nodes are known.) Definition at line 1713 of file lexemgraph.c. References GraphemNodeStruct::arc, GraphemNodeStruct::chunk, GraphemNode, GraphemNodeStruct::lexemes, GraphemNodeStruct::lexemgraph, GraphemNodeStruct::no, and NULL. Referenced by lgClone(). |
|
This function checks whether all lexeme nodes passed in lexemes can be deleted at the same time. This is the case if doing so will leave at least one complete path through the lexeme graph, according to the current state of deletions. To this end, the function checks whether the sum of the numbers of paths through each lexeme node is smaller than the total number of paths in lg.
References CDG_ERROR, cdgPrintf(), GraphemNode, LexemGraphStruct::isDeletedNode, GraphemNodeStruct::lexemes, LexemNode, LexemNodeStruct::no, GraphemNodeStruct::no, LexemGraphStruct::noOfPaths, LexemGraphStruct::noOfPathsFromStart, LexemGraphStruct::noOfPathsToEnd, NULL, and TRUE. Referenced by cnOptimizeNode(). |
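The criterion above can be sketched as follows. The function name and the paths_through array are hypothetical. How the per-node counts are derived from noOfPathsFromStart and noOfPathsToEnd is not stated here, so the sketch simply takes them as input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* paths_through[i] is the number of complete paths that use the i-th
 * deletion candidate; total_paths is the number of paths in the graph. */
static bool can_delete_all(const long *paths_through, size_t n, long total_paths)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        assert(paths_through[i] >= 0);
        sum += paths_through[i];
    }
    /* If fewer paths are destroyed than exist, at least one must survive. */
    return sum < total_paths;
}
```

A path that runs through two candidates is counted twice in the sum, so this test is conservative: it can reject a set whose deletion would be safe, but it never accepts a set that removes the last path.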
|
This checks whether the lexemes have been deleted. If so, TRUE is returned; if not, FALSE. Definition at line 957 of file lexemgraph.c. References FALSE, LexemNode, lgIsDeletedNode(), NULL, and TRUE. Referenced by cnBuildIter(), cnOptimizeNode(), cnPrint(), and lgComputeDistances(). |
|
Clone a lexeme graph. This performs a totally deep copy; even the underlying lattice, lexicon items etc. are cloned. Definition at line 1735 of file lexemgraph.c. References LexemNodeStruct::arc, LexemGraphStruct::chunks, LexemGraphStruct::distance, gnClone(), LexemNodeStruct::grapheme, GraphemNode, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, LexemGraphStruct::lattice, LexemNodeStruct::lexem, LexemNode, lgComputeDistances(), lgComputeNoOfPaths(), lgCopyTagScores(), LexemNodeStruct::limit, LexemGraphStruct::max, LexemGraphStruct::min, LexemNodeStruct::no, GraphemNodeStruct::no, LexemGraphStruct::nodes, LexemGraphStruct::noOfPathsFromStart, LexemGraphStruct::noOfPathsToEnd, and NULL. |
|
returns TRUE if lexem nodes a and b exist on one path. This function checks whether, in principle, a complete path can exist through lg that includes both a and b. This is independent of the current state of deletions. In fact, the function merely checks whether the distance between the nodes is not 0 by using lgDistanceOfNodes(). Note that two nodes are not automatically compatible merely because they do not overlap in time. Also, a lexeme node is not compatible with itself by this definition. Definition at line 847 of file lexemgraph.c. References CDG_ERROR, cdgPrintf(), LexemNode, lgDistanceOfNodes(), and NULL. Referenced by cnOptimizeNode(), lgCompatibleSets(), lgForbiddenBy(), lgMakePath(), lgRequireLexeme(), and lgRequireLexemes(). |
|
checks if these sets of lexemes are compatible, i.e. either unrelated or intersecting.
References LexemNodeStruct::arc, FALSE, LexemNode, lgCompatibleNodes(), NULL, and TRUE. |
|
(re-)computes the distance matrix LexemGraph::distance This function computes the distance between any two lexeme nodes in lg and stores the result in lg->distance. Definition at line 56 of file lexemgraph.c. References GraphemNodeStruct::arc, LexemGraphStruct::distance, GraphemNode, LexemGraphStruct::graphemnodes, GraphemNodeStruct::lexemes, and lgAreDeletedNodes(). Referenced by cnRenew(), lgClone(), lgDeleteNode(), lgDeleteNodes(), and lgNewFinal(). |
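One way to picture such a distance matrix is the sketch below. It is not the algorithm used in lexemgraph.c. ToySpan, compute_distances() and compatible() are hypothetical names. The sketch assumes that a distance of 0 means "no path", that the distance between directly adjacent words is 1, that the shortest chain counts when several exist, and that deletions are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define N_MAX 8

/* Hypothetical node: only the time span matters for this sketch. */
typedef struct { int from; int to; } ToySpan;

/* Fills dist[a][b] with the smallest number of word steps leading from a to b,
 * or 0 when b cannot follow a on any path (this includes a == b). */
static void compute_distances(const ToySpan *s, size_t n, int dist[N_MAX][N_MAX])
{
    assert(n <= N_MAX);
    for (size_t a = 0; a < n; a++)
        for (size_t b = 0; b < n; b++)
            dist[a][b] = (s[a].to == s[b].from) ? 1 : 0;  /* direct neighbours */

    /* Floyd-Warshall style relaxation, treating 0 as "unreachable". */
    for (size_t k = 0; k < n; k++)
        for (size_t a = 0; a < n; a++)
            for (size_t b = 0; b < n; b++) {
                if (a == b || dist[a][k] == 0 || dist[k][b] == 0)
                    continue;
                int via = dist[a][k] + dist[k][b];
                if (dist[a][b] == 0 || via < dist[a][b])
                    dist[a][b] = via;
            }
}

/* Two nodes can share a path if one can be reached from the other. */
static int compatible(int dist[N_MAX][N_MAX], size_t a, size_t b)
{
    return dist[a][b] != 0 || dist[b][a] != 0;
}
```

With this convention a node is never compatible with itself, and two overlapping alternatives such as 0..2 and 1..3 have distance 0 in both directions.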
|
computes # of paths possible in the graph, given the current state of deletions. This function computes the number of paths possible in lg, according to the state of its Vector LexemGraph::isDeletedNode. It calls computeNoOfPathsToEnd() and computeNoOfPathsFromStart() for each lexeme node. The total number of all paths is the sum of all numbers of paths leading to grapheme nodes that are end nodes. Definition at line 232 of file lexemgraph.c. References GraphemNodeStruct::arc, computeNoOfPathsFromStart(), computeNoOfPathsToEnd(), FALSE, GraphemNode, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, GraphemNodeStruct::lexemes, LexemNode, GraphemNodeStruct::live, LexemGraphStruct::min, LexemNodeStruct::no, GraphemNodeStruct::no, LexemGraphStruct::nodes, LexemGraphStruct::noOfPaths, LexemGraphStruct::noOfPathsFromStart, LexemGraphStruct::noOfPathsToEnd, NULL, and TRUE. Referenced by cnRenew(), lgClone(), lgDeleteNode(), lgDeleteNodes(), and lgNewFinal(). |
|
Does a lexemgraph contain at least one instance of a given form? This function checks whether lg contains at least one instance of the form form. Capitalized versions of form are permissible if they are spurious (cf. lgSpuriousUppercase()). Definition at line 1657 of file lexemgraph.c. References GraphemNodeStruct::arc, FALSE, GraphemNode, LexemGraphStruct::graphemnodes, lgSpuriousUppercase(), and TRUE. |
|
Select the path in DST whose parts most closely match SRC. This function inspects the undeleted words in source and undeletes those words in destination that most closely correspond to them. (This is necessary because two lexeme graphs built from the same lattice may have their nodes in different order, so you cannot simply re-use a LexemGraph::isDeletedNode vector across lexeme graphs.) Definition at line 1569 of file lexemgraph.c. References LexemNodeStruct::arc, FALSE, LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, LexemNode, LexemNodeStruct::no, LexemGraphStruct::nodes, NULL, and TRUE. |
|
This function simply transfers the field LexemGraph::tagscore from each node in source to the corresponding node in destination. (This is only useful to save repeated invocation of taggerTag() for two graphs produced from the same lattice.) Definition at line 1498 of file lexemgraph.c. References Chunk, GraphemNodeStruct::chunk, chunkerCloneChunk(), chunkerReplaceGraphemes(), LexemGraphStruct::chunks, GraphemNode, LexemNode, ChunkStruct::nodes, LexemGraphStruct::nodes, NULL, ChunkStruct::subChunks, LexemGraphStruct::tags, and LexemNodeStruct::tagscore. Referenced by lgClone(). |
|
deletes LexemGraph This function deallocates a lexeme graph. This deallocates all parts of the structure, even the lexeme nodes and lexical entries themselves. The lexicon remains unchanged as the LexicalEntry structures are merely clones of the structures in inputCurrentGrammar. Definition at line 1313 of file lexemgraph.c. References chunkerChunkDelete(), LexemGraphStruct::chunks, LexemGraphStruct::distance, GraphemNode, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, GraphemNodeStruct::lexemes, LexemNode, LexemGraphStruct::nodes, LexemGraphStruct::noOfPathsFromStart, LexemGraphStruct::noOfPathsToEnd, NULL, and LexemGraphStruct::tags. Referenced by cmdAnno2Parse(), cmdChunk(), cnDelete(), and lgNew(). |
|
deletes a node from the lexeme graph itself. This function marks a lexeme node as deleted. It does this by setting the cell ln->no in the Vector lg->isDeletedNode. If this destroys the last possible path through lg, a warning is displayed. This function always re-computes the number of remaining paths in lg. Definition at line 1045 of file lexemgraph.c. References CDG_WARNING, cdgPrintf(), LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, LexemNode, lgComputeDistances(), lgComputeNoOfPaths(), LexemNodeStruct::no, LexemGraphStruct::noOfPaths, and TRUE. Referenced by cnOptimizeNode(). |
|
delete a list of lexeme nodes. This function behaves as if lgDeleteNode() were called on each element of nodes, but it is more efficient since it only re-computes the number of remaining paths once. Definition at line 1075 of file lexemgraph.c. References CDG_WARNING, cdgPrintf(), LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, LexemNode, lgComputeDistances(), lgComputeNoOfPaths(), LexemNodeStruct::no, LexemGraphStruct::noOfPaths, NULL, and TRUE. Referenced by cnOptimizeNode(). |
|
returns a distance measure for two lexeme nodes. This function computes the logical distance between a and b, measured in words. Usually this is just the corresponding element of LexemGraph::distance. If either of the nodes is underspecified, it is treated as if it followed the latest specified lexeme node directly. Hence, the return value may be greater than the value in LexemGraph::distance. Two underspecified lexeme nodes are considered to have distance zero. Definition at line 776 of file lexemgraph.c. References CDG_ERROR, cdgPrintf(), LexemGraphStruct::distance, LexemNodeStruct::grapheme, LexemNode, GraphemNodeStruct::no, and NULL. Referenced by cmdDistance(), lgCompatibleNodes(), and lgMayModify(). |
|
does existence of these lexemes exclude that lexeme.
In these cases FALSE is returned (ln is not forbidden). Otherwise TRUE is returned. Definition at line 872 of file lexemgraph.c. References FALSE, LexemNode, lgCompatibleNodes(), and TRUE. |
|
Initialize the input module. This function initializes the module Lexemgraph and registers the variable compactlevelvalues. Definition at line 1615 of file lexemgraph.c. References lgCompactLVs, and NULL. Referenced by cdgInitialize(). |
|
Do two lexeme lists intersect.
References FALSE, LexemNode, lgSimultaneous(), NULL, and TRUE. |
|
This checks whether a lexeme node has been deleted. If so, TRUE is returned; if not, FALSE. Definition at line 947 of file lexemgraph.c. References LexemGraphStruct::isDeletedNode, LexemNode, and LexemNodeStruct::no. Referenced by cnOptimizeNode(), lgAreDeletedNodes(), and lgMakePath(). |
|
returns TRUE if node is an end node
This function checks whether
References GraphemNodeStruct::arc, FALSE, GraphemNode, GraphemNodeStruct::lexemgraph, LexemGraphStruct::max, and NULL. Referenced by cnIsEndNode(). |
|
returns TRUE if node is a start node
This function checks whether References GraphemNodeStruct::arc, FALSE, GraphemNode, GraphemNodeStruct::lexemgraph, LexemGraphStruct::min, and NULL. Referenced by cnIsStartNode(). |
|
This function checks whether at least one of the LexemNode structures in list points to a lexicon element le Definition at line 1549 of file lexemgraph.c. References FALSE, LexemNodeStruct::lexem, LexemNode, NULL, and TRUE. |
|
Takes a set of lexeme nodes, and extends it to a complete path through the graph, composed of undeleted LexemNodes. Returns NULL if this is impossible. It returns a List of lexeme nodes that
If this is not possible, NULL is returned. We do this by simply appending arbitrary non-contradictory nodes until we have bound all time points. Note that for this approach to be correct, there must not be any undeleted dangling nodes in the graph. This condition must have been ensured by cnOptimizeNet(). Definition at line 1250 of file lexemgraph.c. References LexemNodeStruct::arc, FALSE, LexemNode, lgCompatibleNodes(), lgIsDeletedNode(), LexemGraphStruct::max, LexemGraphStruct::min, LexemGraphStruct::nodes, NULL, and TRUE. |
|
may these words modify each other?
References GraphemNode, GraphemNodeStruct::lexemes, LexemNode, lgDistanceOfNodes(), and TRUE. Referenced by cnBuildLevelValues(). |
|
is this lexeme a member of this set?
In the following cases ln is not a member (return FALSE);
Otherwise TRUE is returned. Definition at line 1152 of file lexemgraph.c. References FALSE, LexemNode, lgSimultaneous(), and NULL. Referenced by lgSubset(). |
|
Returns the most probable path, as defined by tagging scores. Definition at line 1676 of file lexemgraph.c. References CDG_WARNING, cdgPrintf(), GraphemNode, LexemGraphStruct::graphemnodes, LexemGraphStruct::lattice, GraphemNodeStruct::lexemes, LexemNode, NULL, and LexemNodeStruct::tagscore. |
|
This function creates a lexeme graph from a Lattice lat and a cdg lexicon. For each arc of the lattice a grapheme node is allocated and annotated with all possible lexical entries. (If there is no lexical entry for an arc, a warning is given, but processing continues.) For each grapheme node, as many lexeme nodes are created as there are lexical alternatives in the lexicon. Furthermore:
References LexemGraphStruct::lattice, lgDelete(), lgNewFinal(), lgNewInit(), lgNewIter(), and NULL. Referenced by cmdAnno2Parse(), cmdChunk(), and cnTag(). |
|
does the final computations for the lexemgraph This function sets those fields of lg that can only be computed after all lexeme nodes are present:
The function can fail returning FALSE if there is no valid path through the lexeme graph. Definition at line 618 of file lexemgraph.c. References CDG_INFO, CDG_WARNING, cdgPrintf(), FALSE, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, LexemGraphStruct::lattice, lgComputeDistances(), lgComputeNoOfPaths(), lgPrint(), LexemGraphStruct::max, LexemGraphStruct::noOfPaths, NULL, LexemGraphStruct::tags, and TRUE. Referenced by lgNew(), and lgUpdateArcs(). |
|
initializes the lexemgraph This function returns a new LexemGraph structure with all fields initialized to meaningless values. In particular, it contains no nodes whatsoever. Definition at line 427 of file lexemgraph.c. References LexemGraphStruct::chunks, LexemGraphStruct::distance, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, LexemGraphStruct::lattice, LexemGraphStruct::max, LexemGraphStruct::min, LexemGraphStruct::nodes, LexemGraphStruct::noOfPathsFromStart, LexemGraphStruct::noOfPathsToEnd, NULL, and LexemGraphStruct::tags. Referenced by lgNew(). |
|
Insert lexeme nodes into the LexemGraph that correspond to the Arc. This function builds all possible lexeme nodes for the specific arc and adds them to lg. It fails if there is no matching entry in the lexicon. Maybe undo capitalisation introduced by orthographic convention. If the written word is uppercase, but that uppercase-ness is suspect because it is at the start of a phrase and might be mere orthographic convention, we have to decide which version we use for lexicon lookup. If our lexicon contains items for the lower-case version but none for the upper-case version, we use only those; if it contains only items for the upper-case version, we use those; and if it contains neither, we allow both and hope that there is a lexical template which will catch this word. We do not use the obvious solution - look up both versions whenever a word is spurious - because it has the following defect: If a sentence starts with `Der', some naive lexical template could introduce a noun reading, and if POS tagging allows, it might actually survive even though it is exceedingly unlikely. Since we do not want this, we effectively force the reading to be `der'. Moral: If you really need to have open-class items in your lexicon that are near-homonymous with closed-class items, you can bloody well write proper lexicon items for them and not templates. Much the same goes for words in ALL UPPER CAPS, except that those can occur anywhere in a sentence, not only at the start, and we have to check three different spellings instead of two. In one-letter words, the intermediate version is indistinguishable from the third one, so we suppress it. Definition at line 454 of file lexemgraph.c.
References GraphemNodeStruct::arc, LexemNodeStruct::arc, CDG_WARNING, cdgPrintf(), GraphemNodeStruct::chunk, FALSE, LexemNodeStruct::grapheme, GraphemNode, LexemGraphStruct::graphemnodes, LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, GraphemNodeStruct::lexemes, GraphemNodeStruct::lexemgraph, LexemNodeStruct::lexemgraph, LexemNode, lgSpuriousUppercase(), LexemNodeStruct::limit, LexemGraphStruct::max, max, LexemGraphStruct::min, min, GraphemNodeStruct::no, LexemNodeStruct::no, LexemGraphStruct::nodes, NULL, LexemNodeStruct::tagscore, and TRUE. Referenced by lgNew(), and lgUpdateArcs(). |
|
Do these lexeme nodes overlap? Returns TRUE if the two lexeme nodes have at least one time point in common. This is subtly different from the more common question, "Can the two nodes coexist on one path?": two nodes can be compatible although they overlap if they are identical. Conversely, a and b may be incompatible even if they do not overlap if there is no path between them. Definition at line 1416 of file lexemgraph.c. References LexemNodeStruct::arc, and LexemNode. |
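A minimal sketch of such an overlap test is given below. ToySpan and spans_overlap() are hypothetical names. The sketch assumes that time points denote word boundaries, so two words that merely meet at a boundary do not count as overlapping. The source does not state how the real function treats that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical time span of a lexeme node, with from < to. */
typedef struct { int from; int to; } ToySpan;

/* Two spans overlap if each one starts before the other one ends. */
static bool spans_overlap(ToySpan a, ToySpan b)
{
    assert(a.from < a.to && b.from < b.to);
    return a.from < b.to && b.from < a.to;
}
```

Under this definition identical spans overlap, which matches the remark that two identical nodes can be compatible although they overlap.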
|
partitions a set of lexeme nodes into equivalence classes This function partitions the set of lexeme nodes of gn into equivalence classes. The equivalence relation used is the function inputCompareLeByAtts() with the argument features. The function returns a new List of new lists of lexemes. (The latter are re-used in ConstraintNode structures, the former are deallocated by cnBuildNodes().)
result = [];
FOR each lexeme l:
  IF l fits into one of the known classes,
    insert l there;
  ELSE
    create new class [l];
    insert the new class into result;
  FI
ROF
return result.
References CDG_DEBUG, cdgPrintf(), GraphemNode, LexemNodeStruct::lexem, GraphemNodeStruct::lexemes, LexemNode, lgCompactLVs, and NULL. Referenced by cnBuildLevelValues(). |
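The partitioning loop above can be sketched in C as follows. ToyEntry, ToyClass, same_class() and partition_entries() are hypothetical names. same_class() stands in for inputCompareLeByAtts() and compares a single feature only. Fixed-size arrays replace the List structures of the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 16

/* Hypothetical lexical entry; `cat` is the only feature compared here. */
typedef struct {
    const char *form;
    const char *cat;
} ToyEntry;

/* One equivalence class, stored as indices into the entry array. */
typedef struct {
    size_t members[MAX_ITEMS];
    size_t count;
} ToyClass;

/* Stand-in for the equivalence relation. */
static int same_class(const ToyEntry *a, const ToyEntry *b)
{
    return strcmp(a->cat, b->cat) == 0;
}

/* Partitions e[0..n-1]; returns the number of classes written to classes. */
static size_t partition_entries(const ToyEntry *e, size_t n, ToyClass *classes)
{
    assert(n <= MAX_ITEMS);
    size_t n_classes = 0;
    for (size_t i = 0; i < n; i++) {
        size_t c;
        /* Compare against the first member of each known class. */
        for (c = 0; c < n_classes; c++)
            if (same_class(&e[classes[c].members[0]], &e[i]))
                break;
        if (c == n_classes) {          /* no class fits: open a new one */
            classes[c].count = 0;
            n_classes++;
        }
        classes[c].members[classes[c].count++] = i;
    }
    return n_classes;
}
```

Comparing against one representative per class is sufficient only because the relation is assumed to be a true equivalence relation.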
|
print lexem graph This function displays a textual representation of the lexeme graph lg. Definition at line 708 of file lexemgraph.c. References LexemNodeStruct::arc, cdgPrintf(), chunkerPrintChunks(), LexemGraphStruct::chunks, LexemGraphStruct::isDeletedNode, LexemGraphStruct::lattice, LexemNodeStruct::lexem, LexemNode, lgPrint(), LexemNodeStruct::no, LexemGraphStruct::nodes, and LexemNodeStruct::tagscore. Referenced by lgNewFinal(), and lgPrint(). |
|
prints out a lexeme node
This function displays the identifier and the time span of ln in the format References LexemNodeStruct::arc, cdgPrintf(), LexemGraphStruct::isDeletedNode, LexemNodeStruct::lexem, LexemNodeStruct::lexemgraph, LexemNode, and LexemNodeStruct::no. Referenced by cnOptimizeNode(). |
|
What categories can this node represent? (Needed while tagging.). This function queries the lexicon about what syntactical categories gn can represent. (The syntactical category is that feature whose index is taggerCategoryIndex.) This function is used to check whether an assignment by the tagger can be honored by the lexicon. Definition at line 1629 of file lexemgraph.c. References GraphemNode, LexemNodeStruct::lexem, GraphemNodeStruct::lexemes, LexemNode, and NULL. |
|
Takes a Vector of Boolean, and sets all cells that correspond to the numbers of nodes incompatible with ln. This function can be used in combination with lvVectorCompatible() to decide whether an LV is compatible with a set of other LVs. Definition at line 1428 of file lexemgraph.c. References LexemNode, lgCompatibleNodes(), LexemNodeStruct::no, LexemGraphStruct::nodes, and TRUE. |
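The marking step can be sketched as below. mark_incompatible() and the flat compat matrix are hypothetical and stand in for calls to lgCompatibleNodes(). Skipping the cell of ln itself is an assumption of this sketch, because a node is never compatible with itself by the definition used in this module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* compat is an n*n row-major matrix; compat[a * n + b] tells whether nodes
 * a and b can share a path. Cells of forbidden are only ever set, never
 * cleared, so several calls accumulate their marks. */
static void mark_incompatible(size_t n, const bool *compat, size_t ln,
                              bool *forbidden)
{
    assert(ln < n);
    for (size_t i = 0; i < n; i++) {
        if (i == ln)
            continue;                  /* assumption: ln does not forbid itself */
        if (!compat[ln * n + i])
            forbidden[i] = true;
    }
}
```

After calling this for every node of a candidate set, a node whose cell is still unset is compatible with all of them.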
|
This function is similar to lgRequireLexeme(), but takes a List of lexeme nodes. It marks all those lexeme nodes that are incompatible with all lexemnodes of which. Definition at line 1449 of file lexemgraph.c. References LexemNode, lgCompatibleNodes(), LexemNodeStruct::no, LexemGraphStruct::nodes, and TRUE. |
|
do the lexemes span the same time interval? This function checks whether a and b cover the same time span. An argument of NONSPEC always causes TRUE to be returned. However, the NULL node is not simultaneous to any lexeme node, not even to another root node. Definition at line 1108 of file lexemgraph.c. References LexemNodeStruct::arc, CDG_WARNING, cdgPrintf(), FALSE, LexemNodeStruct::lexem, LexemNode, NULL, and TRUE. Referenced by lgIntersectingSets(), lgMember(), and lgSubset(). |
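The special cases described above can be sketched like this. ToyNode, TOY_NONSPEC and simultaneous() are hypothetical. The real NONSPEC value is defined elsewhere in the library and is only imitated here by a sentinel object. Checking NONSPEC before NULL is an assumption about the order of the tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lexeme node reduced to its time span. */
typedef struct { int from; int to; } ToyNode;

/* Sentinel standing in for the underspecified node. */
static const ToyNode toy_nonspec = { -1, -1 };
#define TOY_NONSPEC (&toy_nonspec)

static bool simultaneous(const ToyNode *a, const ToyNode *b)
{
    if (a == TOY_NONSPEC || b == TOY_NONSPEC)
        return true;                   /* underspecified matches anything */
    if (a == NULL || b == NULL)
        return false;                  /* the NULL node matches nothing */
    assert(a->from < a->to && b->from < b->to);
    return a->from == b->from && a->to == b->to;
}
```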
|
Might this be a lowercase word that is spelled in upper case because of orthographic convention? Spurious uppercase must be an upper case letter... ... followed by a lower case letter. This is another instance of the "wordgraphs start at 0" assumption. Ordinarily, this would be wrong, since the lexeme graph might start at some other time point. However, at this time lg->min may not be initialized, so we can't check it. Since spurious upper case only occurs in written text, and weird time points occur mainly in recognizer output for spoken text, I'm letting it pass here. Definition at line 1785 of file lexemgraph.c. References FALSE, LexemGraphStruct::lattice, and TRUE. Referenced by lgContains(), and lgNewIter(). |
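The test described above might look roughly like the following sketch. spurious_uppercase() and its parameters are hypothetical. The sketch handles ASCII letters only through the standard isupper() and islower() functions, and it hard-codes the "wordgraphs start at 0" assumption mentioned in the text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if word starts at time point 0 and consists of an upper-case
 * letter followed by a lower-case letter. */
static bool spurious_uppercase(const char *word, int start_time)
{
    assert(start_time >= 0);
    if (start_time != 0)
        return false;                  /* only the first word is suspect */
    if (word == NULL || word[0] == '\0' || word[1] == '\0')
        return false;                  /* needs at least two characters */
    return isupper((unsigned char)word[0]) && islower((unsigned char)word[1]);
}
```

A word such as DER is rejected here because its second letter is upper case. The separate treatment of words in all upper case is described under lgNewIter().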
|
This function checks whether a is a subset of b.
References FALSE, LexemNode, lgMember(), lgSimultaneous(), NULL, and TRUE. |
|
updates the partial lexemgraph with the incoming arcs. This function extends a lexeme graph by the Arc structures contained in listArcs. Definition at line 1476 of file lexemgraph.c. References LexemGraphStruct::distance, LexemGraphStruct::lattice, lgNewFinal(), lgNewIter(), and NULL. |
|
References LexemNodeStruct::arc, LexemNode, LexemGraphStruct::max, and LexemGraphStruct::nodes. |
|
This variable controls whether the levelvalues should be deflated if they are equivalent. Usually we want this switched on; a value of FALSE might be appropriate only for testing. Definition at line 48 of file lexemgraph.c. Referenced by cmdStatus(), lgInitialize(), and lgPartitions(). |