Dependency Grammars

Keywords: constraint grammars, property grammars, dependency grammars, dependency modelling, dependency parsing

Bibtex

...

Sources

Comments

Timo Järvinen and Pasi Tapanainen:

Towards an implementable dependency grammar

The authors claim both linguistic adequacy and computational tractability for their model, which they call a parsing grammar. (Interestingly, they make exactly the same claim as we do: "NP-hard in general, but there is a specific solution for the given grammar that can be run quickly.") Dependency as opposed to constituency is motivated entirely by the argument of discontinuous constituents.

They claim that their own Functional Dependency Grammar conforms to Tesnière's original, 1959 formulation of dependency syntax (which they say has been much misunderstood): the nodes in dependency trees should be `nuclei' rather than words, e.g. the entire verb chain should be represented in just one node; each node has exactly one regent; there are nonprojective edges, but no fragments.
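A minimal sketch (my own, not the authors' implementation) of what a nucleus-based tree node could look like, assuming the two constraints the paper states: a nucleus may span several words (e.g. a whole verb chain), and every nucleus except the root has exactly one regent.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Nucleus:
    tokens: List[str]                       # e.g. ["has", "been", "seen"] as one node
    regent: Optional["Nucleus"] = None      # exactly one regent, or None for the root
    dependents: List["Nucleus"] = field(default_factory=list)

    def attach_to(self, regent: "Nucleus") -> None:
        assert self.regent is None, "a nucleus may have only one regent"
        self.regent = regent
        regent.dependents.append(self)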

Tesnière's notion of structural vs. linear order is contrasted with the standard ID/LP distinction; a big difference is that Tesnière does not really formalize the linear order rules at all. The case of English adjectives is used to argue, rather unconvincingly, that "there are no simple rules or principles for linearisation", when in fact the authors effectively give the three relevant simple rules in a footnote.

Examples are given for sentences with coordinations such as "John and Bill love Mary and Joan" and "Jack painted the kitchen white and the living room blue". As usual, the actual trees are quite unreadable because the words are written into the nodes rather than at the bottom.

The authors claim to have an efficient, broad-coverage grammar for English, but give no evaluation. Note that Connexor's web site has a syntax demo, but one that works rather differently; e.g. it obviously uses words rather than nuclei.

Vincenzo Lombardo and Leonardo Lesmo: Unit Coordination and Gapping in Dependency Theory

The paper details a dependency account of coordination that follows Mel'čuk (who argues that coordination is never truly symmetrical, so that choosing one of the elements as the head is justified); their model is thus the same as ours: Dick<---and<---Jane.

Projectivity is maintained; long-distance dependencies are covered instead by using non-lexical categories (traces), something CDG can't do. "The dependency rules that license coordination result from the application of metarules to primitive rules." The entire paper is about particular metarules that allow the analysis of gapped sentences such as "I saw a unicorn and Carol a tyrannosaurus". An "Earley-type parsing algorithm" is mentioned but not explained.

Tomás Holan, Vladislav Kubon, Karel Oliva and Martin Plátek: Two Useful Measures of Word Order Complexity

Any realistic account of Czech must allow nonprojective structures; the question is to what extent they should be allowed in their Robust Free-Order Dependency Grammar. This paper develops measures for such nonprojectivity.

Despite its name, FODG uses normal binary rewrite rules with normal nonterminals. The node-gap complexity (NG) of an analysis is the highest number of gaps that appear in the yield of any nonterminal. It is shown that no fixed bound for NG can be given for correct Czech sentences.
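To make the measure concrete, here is a hedged reconstruction in Python of node-gap counting for a dependency analysis: for each node, collect the word positions in its yield and count the holes that interrupt it; NG is the maximum over all nodes. The head-index encoding is my own convention, not the paper's formalism.

def yields(heads):
    """heads[i] = index of word i's head, or -1 for the root (my own encoding)."""
    ys = [{i} for i in range(len(heads))]
    for i in range(len(heads)):
        h = heads[i]
        while h != -1:            # add word i to the yield of every ancestor
            ys[h].add(i)
            h = heads[h]
    return ys

def node_gap_complexity(heads):
    def gaps(positions):
        ps = sorted(positions)
        # a gap is a break between consecutive positions in the yield
        return sum(1 for a, b in zip(ps, ps[1:]) if b - a > 1)
    return max(gaps(y) for y in yields(heads))

print(node_gap_complexity([1, -1, 1]))     # projective tree -> 0
print(node_gap_complexity([2, 3, 3, -1]))  # word 1 interrupts the yield of word 2 -> 1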

Norbert Bröker: How to define a context-free backbone for DGs: implementing a DG in the LFG formalism

(This is an extension of the earlier paper "Separating Surface order and Syntactic Relations in a Dependency Grammar".)

Multidimensional Dependency Grammar (DG) "decouples the dependency tree from word order, such that surface ordering is not determined by traversing the dependency tree". Previous approaches to word order rules in DG are compared and criticised. Bröker's solution, which is "semantically motivated/able to restrict discontinuity to certain constructions/able to be lexicalized/declarative/formally precise", is Word Order Domains. "Word order domains loosely correspond to partial phrases." In "Den Mann hat der Junge gesehen", the verb defines the three WODs [Den Mann] [hat der Junge gesehen] [], and the nouns define one WOD each. "Order domains define scopes for precedence predicates."

The trick now is that while dependency trees may be nonprojective, the WOD structure remains projective. A modifier may be inserted not only into the WOD of its head, but also into that of a transitive head. The labels allowed on the path between the real and the transitive head can then be used to limit possible extractions. The model is implemented in LFG, with the WODs used to define a context-free backbone for DG.

Elke Teich: Types of syntagmatic grammatical relations and their representation

The question `constituency or dependency?' is replaced with `Are constituency and dependency sufficient for representing language?', and predictably answered with "no".

1. Coordination. Hudson's word grammar assumes that constituents are necessary to represent coordinations, and Teich believes him. No mention is made of alternatives.

2. Information structure. Combinatory Categorial Grammar can represent different topic structures of the same sentence well:

Who ate the beans? - Fred ate the beans.
What did Fred do? - Fred ate the beans.

Normal CFG always brackets this as [Fred [ate [the beans]]], which corresponds only to the first variant, while CCG can also produce [Fred ate] [the beans].

3. Agreement: HPSG can model German noun groups with their case, number, gender, and inflexion type well.

Teich's solution is Systemic Functional Grammar, which transcends both paradigms. It has four types of syntagmatic structure:

1. prosodic, meaning not "speech melody", but anything that spreads out over more than one constituent, e.g. agreement.

2. periodic, dealing with prominence, e.g. information structure.

3. interdependency, used both for subordination and coordination, only it's called hypotaxis and parataxis.

4. constituency, meaning what you think it means.

A syntagmatic representation of a sentence combines all four annotation types, so that Fred is simultaneously Actor, Theme, Given, Subject, alpha (to the beta that is `ate'), etc. Even Teich admits that "SFG is rarely used in parsing" and cites negative results.

Eva Hajicová: Movement rules revisited

Praguian dependency grammar contains nodes with complex labels with morphological and lexical values. Function words are not present in the tree; their contribution is also expressed by these labels. The distinction between morphemics (surface strings) and syntax is motivated by noting

- that function words cannot be said to be in the same kind of dependency relation to their head as autosemantic words, because they cannot possibly appear alone

- there can be portmanteau words in the surface structure (such as `zum')

This paper argues for a distinction between three layers of word order: surface order, which is what you actually observe, communicative dynamism, and systemic ordering.

`Dynamism' refers to information content: in `I was visited by a painter in Paris last week' (if `Paris' is stressed), `Paris' (the focus) is the more dynamic element compared to `I' (the topic). The principle is motivated by noting that sentences like `Everybody in this room knows two languages' actually change their meaning depending on whether `Everybody' or `languages' is the focus. The communicative order corresponds to the order of nodes in the corresponding tectogrammatical tree.

Systemic ordering deals with rules such as `Temporal precedes Actor'.

Movement rules are then described which mediate the differences between these orderings. Phenomena such as sentences where the focus is not rightmost, pre-modifying adjectives (which violate the dynamism principle), the ordering of function words (which, as you will remember, do not actually appear in the tree), or extraposed relative clauses are covered.

Cristina Barbero, Leonardo Lesmo, Vincenzo Lombardo and Paola Merlo: Integration of syntactic and lexical information in a hierarchical dependency grammar

The early, structuralist account of linguistics in this century relied heavily on the existence of categories, i.e. classes of units that are in some sense interchangeable. But recently the importance of lexicalization, i.e. of not treating all nouns as equivalent, has grown considerably. This paper investigates the interaction between lexical and syntactic information by organizing the entire grammar around the subcategorization principle.

The formalism, of course, is the same as the one used in the second paper. A dependency rule might look like this:

sehe:VVFIN ( # )

Of course it is wasteful to have a rule of this kind for each transitive verb. Therefore a fine-grained subcategorization hierarchy is introduced. The left-corner filtering Earley parser has three possible actions.

1. Prediction: guessing the dependents of the current node

2. Scanning: checking that the current word matches the subcategory needed by the current node (otherwise backtracking is done)

3. Completion: Move on when the current node is complete.
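The paper does not spell the parser out, so the following is only a toy top-down reconstruction of the three actions above (not the authors' left-corner filtering parser), using head-centred rules in which `#' marks the position of the head word; the grammar fragment and category names are invented for illustration.

RULES = {                       # hypothetical grammar fragment
    "V": [["N", "#", "N"]],     # transitive verb: one noun before, one after
    "N": [["D", "#"], ["#"]],   # noun with or without a determiner
    "D": [["#"]],
}

def parse(cat, tags, i):
    """Try to build one 'cat'-headed subtree starting at position i;
    yield every position right after a successful subtree (backtracking)."""
    for rule in RULES.get(cat, []):       # PREDICTION: guess the dependents
        yield from match(rule, cat, tags, i)

def match(rule, cat, tags, i):
    if not rule:                          # COMPLETION: all dependents found
        yield i
        return
    sym, rest = rule[0], rule[1:]
    if sym == "#":                        # SCANNING: the head word itself
        if i < len(tags) and tags[i] == cat:
            yield from match(rest, cat, tags, i + 1)
        return                            # mismatch -> backtrack
    for j in parse(sym, tags, i):         # recurse into a dependent subtree
        yield from match(rest, cat, tags, j)

tags = ["D", "N", "V", "N"]               # "the dog sees cats"
print(any(j == len(tags) for j in parse("V", tags, 0)))   # True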

A classification of 101 Italian verbs is then undertaken, with no discernible relation to the previous exposition.

Harri Arnola: On Parsing Binary Dependency Structures Deterministically in Linear Time

Very strong claims are made: "Deterministic, linear parsing of dependency structures is possible under certain conditions. We discuss a fully implemented parser and argue that those conditions hold for at least one natural language." The system achieves 85% correctness for Finnish.

Wolfgang Menzel and Ingo Schröder: Decision Procedures for Dependency Parsing Using Graded Constraints

This was our own contribution.

Marie Bourdon, Lyne Da Sylva, Michel Gagnon, Alma Kharrat, Sonja Knoll and Anna Maclachlan: A Case Study in Implementing Dependency-Based Grammars

This reports on a commercial grammar checker for English that uses dependency structures. A problem arose with stranded prepositions:

I was yelled at. They knew the man we talked about.

The linguistically correct way of dealing with them opens a huge search space, since every preposition is hypothetically treated as dangling. They solved the problem in different ways: for pseudo-passives, a distinctive new edge label `prep-strand' is used, and extra constraints make sure that it is only used when the preposition matches the verb (e.g. `at' matches `yell'). (This cannot be the whole truth, obviously. This kind of matching constraint would prevent the `stranded' reading of `I yelled with all my power', but in the much more common `I yelled at him' it would still be tried.)
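A sketch of the kind of matching constraint I understand them to mean; the paper gives no implementation details, and the verb/preposition list below is invented.

STRANDABLE = {                  # invented lexicon of pairs that license `prep-strand'
    ("yell", "at"),
    ("talk", "about"),
    ("look", "after"),
}

def prep_strand_allowed(verb_lemma, preposition):
    """Only attach a stranded preposition if the verb subcategorizes for it."""
    return (verb_lemma, preposition) in STRANDABLE

print(prep_strand_allowed("yell", "at"))      # True:  "I was yelled at."
print(prep_strand_allowed("yell", "with"))    # False: blocks the stranded reading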

For relative clauses, the stranded preposition is attached to the antecedent, and this is only allowed if that word really is an antecedent, i.e. carries a relative clause. However, some other constraints had to be relaxed because this makes the relative clause itself incomplete (`we talked').

The approach can deal with long-distance dependencies, but breaks down when the stranded preposition is not clause-final: `The man I talked about with Mary' is not possible.

An `overall improvement of the parser's performance' is reported, with no further details.

Jacques Courtin and Damien Genthial: Parsing with Dependency Relations and Robust Parsing

Courtin describes generative dependency grammars with rules such as V(N,*,N). As usual, this would require one rule for every conceivable type of VP: one with no adverbials, one with one adverbial, etc. For efficiency, `dependency relations' are introduced, e.g.:

  N -> A := (-14,-15)
  N -> D := (-16)
  V -> N := (-20,+20)

This means that the determiner must precede adjectives, since -16 < -15, that nothing else can come between two adjectives, since -15 + 1 = -14, and that nouns can precede or follow verbs, and need not be adjacent to them, as there is plenty of room between -20 and +20.
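A hedged sketch of how such position codes can be checked. The relation table copies the numbers above, but the side convention and the search over slot assignments are my own reading of the scheme, and the adjacency condition (-15 + 1 = -14) is not modelled.

from itertools import product

RELATIONS = {                                  # slots copied from the rules above
    ("N", "A"): {-15, -14},
    ("N", "D"): {-16},
    ("V", "N"): set(range(-20, 21)) - {0},
}

def orderable(governor, dependents):
    """dependents: (category, side) pairs in surface order;
    side = -1 if the word precedes its governor, +1 if it follows."""
    choices = [RELATIONS[(governor, cat)] for cat, _ in dependents]
    for slots in product(*choices):
        on_right_side = all(s * side > 0 for s, (_, side) in zip(slots, dependents))
        increasing = all(a < b for a, b in zip(slots, slots[1:]))
        if on_right_side and increasing:
            return True
    return False

print(orderable("N", [("D", -1), ("A", -1), ("A", -1)]))   # True:  determiner, then adjectives
print(orderable("N", [("A", -1), ("D", -1)]))              # False: determiner must come first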

The parsing algorithm works on POS-tagged input and the DRs, and enumerates all possible structures in a matrix by computing recursively all possible subtrees for each governor. Parsing is "instantaneous" for sentences with up to 20 analyses.

The parser was then to be used in an NLL system, so it could not use morphological info (since these funny French people don't know about soft constraints). Instead, a totally new parser was written which used semantic knowledge to disambiguate. This one works by combining rewriting rules of the type

N_V [
 (1:{N}, (0, $F:{pf}) 2:{V})
 =>
 ( ( 1, $F ) 2 ) ]

This means that if you find an N with preverbal particles following it, and a V after that, you may combine them, and the V will be the head (parentheses indicate dependency). This kind of context-dependent rule can restrict the number of children one regent has, or forbid multiple articles for one noun, etc. You can also write

N_C [
 (1:{N}, 2:{C}, 3:{N})
 =>
 ( (1) 2 (3) ) ]

and attach stuff at the left and at the right simultaneously.

Since this would again require one rule for every possible set of modifiers, a hierarchy of categories is used where both `cnoun' and `pnoun' can match `noun'. A tree transducer then computes all possible trees non-deterministically.

Psi-terms (typed feature structures with sharing) are used to "represent knowledge about words and trees":

UL(lex => "eats";   
   cat => verb;
   subj => UL(sem => S:ANIMATE);   
   obj => UL(sem => O:EATABLE);    
   sem => INGEST(agent => S; 
                 patient => O))

One such term is attached to each node in the tree. Strangely, the paper ends here.
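As a rough illustration only (the original uses typed feature structures with structure sharing, which the dict-based sketch below does not reproduce): such a term lets the parser reject attachments whose dependents violate the verb's selectional restrictions. The type hierarchy and lexical entry are invented.

ISA = {                          # invented semantic type hierarchy
    "HUMAN": "ANIMATE",
    "ANIMATE": "ENTITY",
    "EATABLE": "ENTITY",
}

def subsumes(general, specific):
    """True if 'specific' is 'general' or one of its subtypes."""
    while specific is not None:
        if specific == general:
            return True
        specific = ISA.get(specific)
    return False

EATS = {"cat": "verb",
        "subj": {"sem": "ANIMATE"},
        "obj":  {"sem": "EATABLE"}}

def compatible(entry, role, dependent_sem):
    """Accept an attachment only if the dependent's semantics fits the slot."""
    return subsumes(entry[role]["sem"], dependent_sem)

print(compatible(EATS, "subj", "HUMAN"))   # True
print(compatible(EATS, "obj", "HUMAN"))    # False -> reject this reading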

Tom B.Y. Lai and Huang Changning: Complements and Adjuncts in Dependency Grammar Parsing Emulated by a Constrained Context-Free Grammar

Chinese is represented in dependency structures with rules such as

X(A,B,*,D,E)
Y(*)
*(Z)

meaning that an X can occur between AB and DE, Y can occur with no modifiers (= is a leaf), and Z can occur with no regent (= is a root). (This kind of formalism is weakly equivalent to CFG.) _Functional annotation_ is the equivalent of edge labels and allows writing rules like

VVFIN(NN(SUBJ), *, NN(OBJA))

Within Gazdar's PATR parser, such rules are implemented as follows:

 % The PSG rule for a transitive VP: subject noun, verb, object noun.
 % The feature equations copy the head verb's dependency structure (ds) to the
 % phrase and fill its subj and obj slots with the nouns' ds values.
 TVP ---> [PN1, TV, PN2] :-
   TVP:cat     === tv,
   TV:cat      === tv,
   PN1:cat     === n,
   PN2:cat     === n,
   TVP:ds      === TV:ds,
   TVP:ds:subj === PN1:ds,
   TVP:ds:obj  === PN2:ds.

i.e. with the nonterminals annotated by features. The authors call this a context-free PSG constrained with grammatical function annotations.

Control verb sentences (`A wants to hit B'), where Hudson would require A to have two regents, are instead modeled by function annotations (but it isn't clear how; in the example tree their own solution simply omits the second edge with no substitute). Extraction, raising and extraposition are said to be dealt with similarly.

The authors are obviously concerned about the presence of CFG derivation rules, and explicitly question whether their creation is still a dependency grammar. They test themselves against three postulates made by Hudson:

1. There should be no transformational component - check.

2. `Dependency should be basic, not derived' - check.

3. Rules of grammar should not be formally distinct from subcategorization facts. - fail.

Apparently the treatment of adjuncts is a problem under these postulates because they are not motivated lexically (i.e. by subcategorization info). The authors justify themselves:

"Adjunct rules are not related to any subcategorization facts. We believe that their existence (in small numbers) is justified in our emulation model. Even in a DG formalism that does not have phrase-structure-like rules, there have to be some general facilities to take care of such non-lexical grammatical mechanisms. DG is lexically orientated, but it has to cope with non-lexical grammatical mechanisms, where they exist, in language."

Yves Lepage, Ando Shin-Ichi, Akamine Susumu and Iida Hitoshi: An Annotated Corpus in Japanese Using Tesnière's Structural Syntax

Tesnière (1959) explained phenomena in many languages to support his claim that dependency grammar is descriptively adequate. He cited, but did not exemplify, Japanese; this paper does just that.

Japanese has the usual classes of content and function words, concrete and abstract concepts, etc. Case-marked phrases permute freely, which fits DG well. Constituents can also be left out, even the subject. Tesnière's (and our) principle of linking the subject to the finite verb and the object to the full verb is also followed. Forests are advocated to represent sentences with elliptical conjunctions. The concept of transference is used to model na-adjectives (naturalized foreign words which behave more like nouns) simply.

6500 sentences from hotel reservation dialogues have been annotated using this model.


 