The corpus used in this experiment is based on the more complex corpus used in the previous section. Both active and passive forms are used and translated into German. There are 480 English-German bilingual sentence pairs in total. All the verbs are correctly conjugated. As in the previous section, the atomic symbols are orthographic words. There is a separable German verb (umbringen -- to kill) in the corpus (footnote 7.8), which introduces some additional complexity. For example, in the following bilingual sentence pairs,
helen is killed by john -> helen wird von john umgebracht
diane kills michael -> diane bringt michael um
umgebracht, bringt and um are treated as mutually exclusive symbols (eigenstates, according to the common German formulation operator). The vocabulary is summarized in Table 7.3. Specifically, there are a total of 19 (20) symbols in the English (German) vocabularies. The total number of free parameters is therefore
.
Seventy-eight sentence pairs are chosen randomly as the training set (16% of the corpus) and the remaining 402 sentence pairs are reserved as the test set. In a typical experiment, the correctness on the training set is 93.58% and the generalization rate on the test set is 88.81%. Here each incorrectly decoded sentence counts as one error. If the correctness of words rather than of sentences is counted, the correctness on the training set rises to 97.82% and the generalization accuracy on the test set rises to 95.82%. Given the small size of the training set, this generalization rate is reasonably good.
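The two accuracy criteria used here can be illustrated with a minimal sketch. The function names are hypothetical, and the position-by-position word comparison is an assumption, since the text does not specify how word-level correctness is counted. The second target sentence is inferred from the corresponding English sentence, not quoted from the corpus.

```python
def sentence_accuracy(decoded, targets):
    """Fraction of sentences whose decoded word sequence equals the target exactly."""
    correct = sum(d == t for d, t in zip(decoded, targets))
    return correct / len(targets)


def word_accuracy(decoded, targets):
    """Fraction of target word positions that are decoded correctly.

    Assumption: words are compared position by position, and positions
    missing from a shorter decoded sentence count as errors.
    """
    total = sum(len(t) for t in targets)
    correct = sum(
        d_word == t_word
        for d, t in zip(decoded, targets)
        for d_word, t_word in zip(d, t)
    )
    return correct / total


# Illustrative data only: one correctly decoded sentence and one sentence
# with two words in the wrong order (target inferred, see lead-in).
targets = [
    "diane und michael werden von helen verraten".split(),
    "helen wird von michael und diane umgebracht".split(),
]
decoded = [
    "diane und michael werden von helen verraten".split(),
    "helen wird von michael und umgebracht diane".split(),
]

print(sentence_accuracy(decoded, targets))  # 1 of 2 sentences correct
print(word_accuracy(decoded, targets))      # 12 of 14 words correct
```

A single word-order error costs the whole sentence under the first criterion but only two words under the second, which is why the word-level figures are consistently higher.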
For more detail, the output for a correctly decoded example in the training set
diane and michael are betrayed by helen ->
diane und michael werden von helen verraten
is shown in Figure 7.20. The first two rows show the absolute values of the components, with the absolute square of each component represented by the area of its disk: the first row is the target state of affairs (diane und michael werden von helen verraten) and the second is the output. The third and fourth rows are the phases of the target and the output, respectively. As the figure shows, the architecture generates the target state of affairs quite faithfully.
Nevertheless, some sentence pairs are not correctly learned, in both the training and the test sets. As an example, the output of an example in the test set
helen is killed by michael and diane ->
helen wird von michael und umgebracht* diane*
that is not correctly learned is shown in Figure 7.21. Incorrectly decoded words are marked with *. As the figure shows, the output state of affairs is largely similar to the target. The error is mostly due to a shift of phases, which makes the word order incorrect: if the two marked words were swapped, the sentence would be correct. This suggests that another accuracy criterion may reveal more about the performance. In particular, we would like to know how many incorrectly decoded sentence pairs are due to phase errors. In fact, if each decoded sequence may be permuted once (by swapping the positions of exactly two of the symbols), we achieve an accuracy of 100% on the training set. On the test set, however, the accuracy is 98.5%. A glance at the remaining errors (6 sentences) shows that they are all of the following form:
diane kills helen ->
diane bringt john* helen* um-
The state of affairs of the above example is shown in Figure 7.22. The error in this example is due to unwanted residues of eigenstates. This kind of error can be removed by raising the threshold in the combinatorial decoding process. For example, if the threshold is set to 0.1, all errors of this kind are avoided.
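The single-swap criterion described above can be sketched as a simple check on word sequences. This is an illustration under assumptions, not the procedure actually used in the experiment: the function name is hypothetical, and both target sentences in the usage lines are inferred from the English source sentences.

```python
def correct_after_one_swap(decoded, target):
    """Return True if swapping exactly two positions of `decoded` yields `target`.

    A sequence that already equals the target, or that differs in length
    (for example because of a residual extra word), returns False.
    """
    if len(decoded) != len(target):
        return False
    # Positions where the decoded word differs from the target word.
    mismatches = [i for i, (d, t) in enumerate(zip(decoded, target)) if d != t]
    if len(mismatches) != 2:
        return False
    i, j = mismatches
    return decoded[i] == target[j] and decoded[j] == target[i]


# Word-order error: recoverable by one swap.
print(correct_after_one_swap(
    "helen wird von michael und umgebracht diane".split(),
    "helen wird von michael und diane umgebracht".split(),
))

# Residual extra word: not recoverable by a swap.
print(correct_after_one_swap(
    "diane bringt john helen um".split(),
    "diane bringt helen um".split(),
))
```

Under this check the first kind of error is forgiven while the second is not, which matches the observation that only the residue errors remain once single swaps are allowed.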
This example also shows an interesting ``bias'' of the system to convict john as the killer. This comes as no surprise if we take a closer look at the training set, which contains 20 sentences about ``killing.'' In these scenarios, john kills 9 times and is killed 6 times, making him the most frequent killer (michael kills 7 times and is killed 9 times; helen kills 7 times and is killed 8 times; diane kills 7 times and is killed 3 times). Poor john seems to have become the natural ``black sheep'' of the system, owing to the unbalanced training set.
In the ideal case, a quantum mechanical translator can be reversed in time in order to translate a sentence from the target language back to the source language. However, this can be done only if the end state of affairs is not measured (that is, no symbolic sequence in the target language is formulated). If a state of affairs formulated in the target language is subjected to the time-reversed version of the reasoning operator
, there can be a significant amount of noise in the ``starting'' state of affairs. As an example, the target state of affairs of the first example in the training set above (diane und michael werden von helen verraten) is prepared and subject to
. The output (in fact, the reversed input) and the original state of affairs are shown in Figure 7.23. The first two rows are the absolute squares of the original state of affairs and the reversed input, respectively. The last two rows are the phases of the original state of affairs and the reversed input. As the figure shows, the pure states in the target language generate unwanted mixed states in the source language. These mixtures are not cancelled as effectively as they are in forward translation.
An interesting but non-trivial by-product of this experiment is that one can use the architecture to compile a bilingual dictionary of the miniature languages. This can be done by using the lexical list of English as input and looking at the end state of affairs in German. The result is shown in Figure 7.24. In the figure, the absolute square of each component is represented by the area of the little square. As can be seen in the map, the English words are largely associated with their German translations. Interestingly, personal names and auxiliary words (is, are, by) are mapped to somewhat distributed German words. The German counterparts of the English personal names are nevertheless the most activated. The auxiliary words, on the other hand, show a sort of ``template'' relationship, which is basically a many-to-many mapping. These are desirable results that correspond well to our intuitive understanding of language usage. What is puzzling is the relationship among past participles. English past participles show a tendency to associate with German past participles as a category. That is, the mapping shows a kind of ``generalization'' based on syntax (in the sense of conventional linguistics) in addition to natural associations based on content. However, this generalization proves to be incorrect for the homonym hit (which serves both as plural present tense and as past participle). These mapping errors among past participles (as far as category is concerned) seem to be due to the ambiguity of hit in English, which may have connected geschlagen to schlagen (hit as a past participle is almost totally neglected in the map). The other interesting phenomenon is the ``bias'' against john, discussed above, which can be seen in the last row of the figure.
In the same vein, a reverse dictionary map using
is shown in Figure 7.25. One should note, however, that these two maps are not simply transpositions of each other.
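The procedure for compiling a dictionary map (feeding each entry of the lexical list as input and recording the absolute squares of the end state) can be sketched as follows. The callable `translate` is a hypothetical stand-in for the trained architecture, and the 2-by-2 toy operator contains made-up amplitudes chosen only so that the example runs; it is not meant to reproduce the maps in the figures.

```python
def dictionary_map(translate, n_source):
    """Probe `translate` with each source basis state and collect absolute squares.

    `translate` (hypothetical interface) takes a list of complex amplitudes
    over the source vocabulary and returns a list of complex amplitudes over
    the target vocabulary. Row i of the result holds the absolute squares of
    the target components obtained for source word i.
    """
    rows = []
    for i in range(n_source):
        # Pure state: amplitude 1 on word i, 0 elsewhere.
        basis = [1 + 0j if k == i else 0j for k in range(n_source)]
        output = translate(basis)
        rows.append([abs(a) ** 2 for a in output])
    return rows


# Toy operator with illustrative amplitudes (not trained values).
TOY_MATRIX = [
    [0.8 + 0j, 0.6j],
    [0.6j, 0.8 + 0j],
]


def toy_translate(state):
    """Apply TOY_MATRIX to `state` as a plain matrix-vector product."""
    return [sum(m * s for m, s in zip(row, state)) for row in TOY_MATRIX]


for row in dictionary_map(toy_translate, 2):
    print([round(p, 2) for p in row])
```

Each row can then be drawn as a row of squares whose areas are proportional to the entries, as in the dictionary figures.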