The experimental data used in this section are based on a corpus proposed by David Chalmers [66]. In that paper, Chalmers claimed that the Recursive Auto-Associative Memory (RAAM), originally proposed by Pollack [61], is a connectionist architecture capable of processing ``compositional structure.'' He demonstrated that two RAAMs (serving as the encoder and decoder of symbolic sentences), together with a feedforward network [40] between the internal layers of the RAAMs, can perform the syntactic task of transforming an active sentence into a passive one. In an initial experiment, 80 sentences (40 of each form) were used to train the connectionist architecture (both the RAAM encoder/decoder and the feedforward transformation network). A generalization rate of 65% was reported on 40 unseen sentences; that is, the error rate on the unseen test corpus was 35%. He then modified the experimental setup by training the RAAMs on all possible sentences and the feedforward transformation network on 75 of the 125 possible active/passive pairs. A 100% generalization rate was achieved on the remaining 50 active/passive pairs.
Specifically, the corpus used in his study consists of 5 nouns, 5 transitive verbs, one auxiliary verb (is), and one preposition (by). There are 125 sentences in active form and 125 in passive form. The vocabulary used in this corpus is summarized in Table 7.1. The verbs in the original corpus are not conjugated. For example, the following sentence pair,
diane kill helen -> helen is kill by diane
is used in the corpus even though it is ungrammatical in standard English. As a starting point, and to allow a more accurate comparison, we begin with the ``initial'' experimental setup of [66], using 40 randomly chosen sentences as the training set and the remaining 85 as the test set. The architecture is trained with the conjugate gradient method, starting from a small (
) random parameter set. The architecture learns all the sentences in the training set without difficulty, and its generalization accuracy on the test set is 100%.
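The corpus construction and the 40/85 split described above can be sketched as follows. This is only an illustration: the words `noun5` and `verb2` to `verb5` are hypothetical stand-ins for the entries of Table 7.1 that do not appear in the text, and the function names and the random seed are assumptions, not part of the original experiment.

```python
import itertools
import random

# Only diane, helen, chris, john and kill appear in the text; the other
# entries are hypothetical stand-ins for the vocabulary of Table 7.1.
NOUNS = ["diane", "helen", "chris", "john", "noun5"]
VERBS = ["kill", "verb2", "verb3", "verb4", "verb5"]


def build_pairs(nouns=NOUNS, verbs=VERBS):
    """Return every (active, passive) pair as tuples of unconjugated tokens."""
    pairs = []
    for subj, verb, obj in itertools.product(nouns, verbs, nouns):
        active = (subj, verb, obj)
        passive = (obj, "is", verb, "by", subj)
        pairs.append((active, passive))
    return pairs


def split_pairs(pairs, n_train=40, seed=0):
    """Shuffle a copy of the pairs and split it into training and test sets."""
    rng = random.Random(seed)  # the seed is an arbitrary choice
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


pairs = build_pairs()
train, test = split_pairs(pairs)
print(len(pairs), len(train), len(test))  # 125 40 85
```

With 5 nouns and 5 verbs there are 5 x 5 x 5 = 125 active sentences, each with exactly one passive counterpart, which matches the counts given for the corpus.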
To examine the result in more detail, the output for an example sentence from the training set,
diane kill helen -> helen is kill by diane
is shown in Figure 7.10. In this figure, the complex components are drawn as an array of complex planes; this represents the state as a superposition of the symbol-eigenstates. To allow a better comparison between the end state of affairs and the target state of affairs, the output is shown again, together with the target, in Figure 7.11. In that figure, the first two rows are the absolute values of the components of the target (upper) and output (lower) states of affairs, and the third and fourth rows are the phases (arguments of the complex numbers) of the components of the target and output, respectively. As can be seen in the figures, five eigenstates (those of helen, is, kill, by, and diane) have the most significant absolute values. The permutation of these eigenstates that is most similar to the state of affairs (i.e., that has the maximal complex inner product with the state-of-affairs vector) is taken as the orthographic form of the result of the syntax manipulation.
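The selection of the orthographic form can be illustrated with a small sketch. The text does not specify here how a word order is encoded in the state vector, so the encoding below (the word at position k receives the phase 2*pi*k/n) is purely a hypothetical assumption, as are the function names and the use of the real part of the inner product as the similarity score.

```python
import cmath
import itertools
import math


def encode(sentence, vocab):
    """Hypothetical encoding: the word at position k gets phase 2*pi*k/n."""
    n = len(sentence)
    state = {word: 0j for word in vocab}
    for k, word in enumerate(sentence):
        state[word] = cmath.rect(1 / math.sqrt(n), 2 * math.pi * k / n)
    return state


def inner(a, b):
    """Complex inner product <a|b> of two states over the same vocabulary."""
    return sum(a[word].conjugate() * b[word] for word in a)


def decode(state, vocab, length=5):
    """Pick the ordering of the strongest symbols that best matches the state."""
    # Keep the `length` symbols with the largest absolute values.
    top = sorted(vocab, key=lambda w: abs(state[w]), reverse=True)[:length]
    # Score each permutation by the real part of its inner product (assumed).
    return max(itertools.permutations(top),
               key=lambda p: inner(encode(p, vocab), state).real)


vocab = ["diane", "helen", "chris", "john", "kill", "is", "by"]
target = ("helen", "is", "kill", "by", "diane")

# Simulated output: correct magnitudes, phases perturbed by +/- 0.3 rad,
# plus a small spurious component on an unrelated symbol.
output = encode(target, vocab)
for k, word in enumerate(target):
    output[word] *= cmath.rect(1.0, 0.3 if k % 2 == 0 else -0.3)
output["chris"] = 0.05 + 0j

print(decode(output, vocab))
```

Under this assumed encoding, a phase error smaller than half the spacing between adjacent positions leaves the correct ordering as the best-scoring candidate, which is in the spirit of the observation that the target remains the best candidate despite phase variation.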
The generalization ability of the architecture is very good. For example, in the test set, an unseen sentence
chris kill john -> john is kill by chris
is visualized in Figure 7.12. As can be seen in the figure, there are hardly any differences between the absolute values of the output for the unseen sentence and those of the target. There is significant variation in the phases, however. Nevertheless, the target is still the best candidate for the orthographic output.