next up previous contents index
Next: Syllogism in natural language Up: Application of QT to Previous: Issues of natural language   Contents   Index

Quantum mechanical NLP

In quantum mechanical terms, the state of affairs that is associated with a natural language utterance is a superposition of eigenstates of an eigenbasis pertaining to a specific vocabulary $V$. The vocabulary is a set consisting of all symbols found in a language. Moreover, all these symbols are eigenstates corresponding to a language formulation operator $F$. That is,

\begin{displaymath}V=\left\{ {s_i\left\vert {\ F\left\vert {s_i} \right\rangle =\gamma _i} \right.\left\vert {s_i} \right\rangle } \right\},\end{displaymath}

where $\gamma _i \in \mathbb{R}$ is an eigenvalue of $F$. According to Corollary 2, any state of affairs $\vert \phi \rangle$ can be treated as a superposition of components in $V$,


\begin{displaymath}\left\vert \phi \right\rangle =\sum\limits_n {c_n\left\vert {s_n} \right\rangle },\end{displaymath}

where $c_n \in \mathbb{C}$ and $c_n=\left\langle s_n \mathrel\vert \phi \right\rangle$ is the projection of $\left\vert \phi \right\rangle$ on $\left\vert s_n \right\rangle$.
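
As a small illustration (the three-word vocabulary and the coefficients below are chosen here purely as an example, and the eigenstates are assumed to be orthonormal), take a vocabulary containing only $john$, $loves$ and $mary$, and the state

\begin{displaymath}\left\vert \phi \right\rangle ={1 \over \sqrt 2}\left\vert john \right\rangle +{i \over \sqrt 2}\left\vert mary \right\rangle .\end{displaymath}

Projecting onto each eigenstate gives $c_{john}=\langle john \vert \phi \rangle =1/\sqrt 2$, $c_{mary}=\langle mary \vert \phi \rangle =i/\sqrt 2$ and $c_{loves}=\langle loves \vert \phi \rangle =0$, so that $\sum_n \vert c_n \vert^2=1$.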

In practice, a natural language utterance is usually written as an orthographic string. More generally, it could equally be a string of phonetic transcriptions. In a sense, we are free to choose our ``atomic'' symbol set (letters of an alphabet, phonetic symbols, or ideographs). In the problems tackled in this chapter, however, orthographic words are used as the building blocks (symbols or eigenstates) of the string. For example, the eigenstate corresponding to the word ``loves'' can be denoted by


\begin{displaymath}\vert loves \rangle.\end{displaymath}

Our first question is then: how can we put together a string of symbols to refer to a state of affairs? Since we are taking a physicalist account, the answer is to be found in physics. We need a particular unitary operator (called the preparation operator $P(t)$, which is a function of time $t$) to place a particular symbol in its particular position in an utterance. In general, the unitary operator $P$ can be written as,


\begin{displaymath}{P(t)=e^{i{{H'} \over \hbar }t}},\end{displaymath}

where $H'$ is a Hermitian operator. If the string is constructed incrementally, we have


\begin{displaymath}\left\vert \phi \right\rangle =\sum\limits_{k=1}^m {P(t_k)e^{i\theta _k}\left\vert {s_k} \right\rangle },\end{displaymath}

where $s_1,s_2,\ldots,s_m$ is the string of symbols in the orthographic natural language utterance; $m$ is the length of the string; $t_k$ is the time of utterance of the $k$-th symbol; and $\theta_k$ is the phase (the argument of a complex number) of the $k$-th symbol. Generally speaking, the preparation operator $P$ may ``mix'' one symbol with others if $H'$ is not a diagonal matrix. Indeed, this could occur quite often in natural language$^{7.5}$. However, for simplicity, we assume that the symbols in the miniature languages discussed in this chapter do not mix with each other. That is, $H'$ is a diagonal matrix. In this case, we have


\begin{displaymath}P(t)=\left( {\matrix{{e^{i\lambda _1t/\hbar }}&0&0\cr
0&\ddots &0\cr
0&0&{e^{i\lambda _nt/\hbar }}\cr
}} \right),\end{displaymath}

where $\lambda_k \in \mathbb{R}$ is the $k$-th diagonal component of $H'$ and $n$ is the size of the vocabulary. To make the model even simpler, we assume that all $\lambda_k$ are equal, so that $P(t)$ reduces to a single phase factor. Furthermore, we assume that the symbols in a string are uttered at uniform intervals, so that consecutive symbols differ by a constant phase step $\theta_0=2\pi/(m+2)$, and that the argument $\theta_k$ of each eigenstate $\left\vert {s_k} \right\rangle$ is zero. Thus we have, after all these simplifications,

\begin{displaymath}
\left\vert \phi \right\rangle =\sum\limits_{k=1}^m {e^{i(k-1)\theta _0}\left\vert {s_k} \right\rangle }. \qquad (7.1)
\end{displaymath}
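
The preparation step of Equation 7.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this study: the function name `prepare_state` and the dictionary representation of a state are assumptions made here, and the phase step is read as $\theta_0=2\pi/(m+2)$.

```python
import cmath
import math


def prepare_state(words):
    """Superpose the words of a string following Equation 7.1.

    Returns a dict mapping each distinct word (eigenstate) to its
    complex amplitude. The state is deliberately left unnormalized.
    """
    m = len(words)
    theta0 = 2 * math.pi / (m + 2)  # assumed reading of theta_0
    state = {}
    for k, word in enumerate(words):  # k = 0 is the first symbol
        # A word that occurs twice accumulates both phase factors.
        state[word] = state.get(word, 0j) + cmath.exp(1j * k * theta0)
    return state


phi = prepare_state(["john", "loves", "mary"])
```

Here the first word receives phase $0$, the second $\theta_0$ and the third $2\theta_0$, with $\theta_0=2\pi/5$ for a three-word string.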

A state of affairs $\left\vert \phi \right\rangle$ thus prepared is subject to a unitary operator $U$ (the reasoning operator). That is,

\begin{displaymath}\vert \phi' \rangle = U \vert \phi \rangle = e^{-iHt\over {\hbar}} \vert \phi \rangle,\end{displaymath}

where $\vert \phi' \rangle$ is the end state of affairs and $H$ is the Hamiltonian of the reasoning process. The Hamiltonian is obtained by training, that is, by minimizing an error function. Specifically, the error function is defined as

\begin{displaymath}err(H)=\sum\limits_{(\phi _t,\phi _j)\in T} {\left\vert {\left\langle {\,\ldots\,} \mathrel{\vert} {\phi _o^k} \right\rangle } \right\vert}^2,\end{displaymath}

where $H$ is a Hermitian matrix that is the target of the training process; $T$ is a set of training pairs $(\phi _t,\phi _j)$; and $\vert \phi _t \rangle$ and $\vert \phi _j \rangle$ are the target and input states of affairs respectively. Moreover, $\vert \phi _o \rangle$ is related to $\vert \phi _j \rangle$ as follows:


\begin{displaymath}\vert \phi_o \rangle = U \vert \phi_j \rangle = e^{-iHt\over {\hbar}} \vert \phi_j \rangle.\end{displaymath}
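
As a reminder of how such an exponential can be evaluated in practice (this is a standard linear-algebra identity rather than a procedure stated in this chapter, and the symbols $W$ and $E_j$ are notation introduced here for illustration), suppose the Hermitian matrix $H$ has the spectral decomposition shown on the left. Then

\begin{displaymath}H=W\,\mathrm{diag}(E_1,\ldots ,E_n)\,W^{\dagger }\quad \Longrightarrow \quad U=e^{-iHt/\hbar }=W\,\mathrm{diag}\left( e^{-iE_1t/\hbar },\ldots ,e^{-iE_nt/\hbar } \right) W^{\dagger }.\end{displaymath}

Because every $E_j$ is real, each diagonal entry has unit modulus, so $U^{\dagger }U=I$ and the norm of $\vert \phi _o \rangle$ equals that of $\vert \phi _j \rangle$.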

We can then use the conjugate gradient method [41], starting from a small random initial vector, to calculate $H$.

Once $H$ has been calculated, an unseen state of affairs can be subjected to the same reasoning operator $U$. The end state of affairs should then be measured to generate the result of the natural language processing task. Note that since the input state of affairs is not normalized, the end state of affairs is not normalized either. This does not matter, because what we are interested in is an orthographic result, for which only the relative probabilities are crucial. Here, one needs another operator to generate the orthographic string. This should be a time-varying quantum state associated with the resulting utterance. This can be quite tricky and is very time-consuming to train$^{7.6}$. Therefore, in this preliminary study a classical combinatorial optimizer is employed instead. Specifically, this is done by backward superposing possible orthographic strings and comparing them with the end state of affairs. Each candidate is given a score, which is calculated by preparing a candidate state according to Equation 7.1 and taking the absolute value of the complex inner product of the normalized candidate state with the normalized end state of affairs. That is,


\begin{displaymath}score(\psi )=\left\vert {\left\langle {\psi } \mathrel{\vert} {\varphi } \right\rangle } \right\vert ,\end{displaymath}

where $\psi$ is a candidate state of affairs and $\varphi$ is the end state. In the ideal case, the score is unity (1) for a perfect candidate. Since the vocabulary can be quite large, a ``brute force'' (complete search) method would suffer a combinatorial explosion. We therefore need heuristics to avoid such a disaster. This is done according to the following algorithm:

 0. Normalize the end state of affairs; set the initial threshold
     Theta := 0.01;
 1. Build the set S of all symbols whose amplitude has absolute value
     greater than or equal to Theta;
 2. Calculate the score of each permutation of S; record the one with
     the best score so far;
 3. Theta := Theta+0.01;
 4. If Theta <= 0.4 goto step 1;
 5. Output the permutation with the best score.
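
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used in this study: states are stored as dictionaries from words to complex amplitudes, the names `prepare_state`, `normalize`, `score` and `decode` are hypothetical, the phase step is read as $\theta_0=2\pi/(m+2)$, and symbol sets that have already been searched are skipped, which does not change the result.

```python
import cmath
import math
from itertools import permutations


def prepare_state(words):
    """Equation 7.1: unnormalized superposition of a word string."""
    theta0 = 2 * math.pi / (len(words) + 2)  # assumed reading of theta_0
    state = {}
    for k, word in enumerate(words):
        state[word] = state.get(word, 0j) + cmath.exp(1j * k * theta0)
    return state


def normalize(state):
    """Scale a state so that its amplitudes have unit 2-norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in state.values()))
    return {w: a / norm for w, a in state.items()}


def score(candidate, end_state):
    """|<psi|varphi>| for a candidate string and a normalized end state."""
    psi = normalize(prepare_state(candidate))
    inner = sum(a.conjugate() * end_state.get(w, 0j) for w, a in psi.items())
    return abs(inner)


def decode(end_state):
    """Threshold sweep from 0.01 to 0.40 with a permutation search."""
    end_state = normalize(end_state)                      # step 0
    best, best_score = (), 0.0
    seen = set()                                          # symbol sets already searched
    for step in range(1, 41):                             # steps 3 and 4
        theta = step / 100
        symbols = tuple(sorted(w for w, a in end_state.items()
                               if abs(a) >= theta))       # step 1
        if not symbols or symbols in seen:
            continue
        seen.add(symbols)
        for perm in permutations(symbols):                # step 2
            s = score(perm, end_state)
            if s > best_score:
                best, best_score = perm, s
    return list(best), best_score                         # step 5


end = prepare_state(["mary", "loves", "john"])
result, result_score = decode(end)
```

In this toy run the end state is itself a prepared string, so the search recovers the original word order with a score of 1 up to rounding. The cost of step 2 is factorial in the size of S, which is why the threshold is used to keep S small.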

The string that yields the best score is taken as the orthographic result. The scheme described above is illustrated in Figure 7.3. We are now ready to apply this framework to NLP tasks.

Figure 7.3: Quantum theoretical NLP.
\begin{figure}\centering\indent{\epsfig{figure=qnet-arch.epsi,scale=0.7}}
\end{figure}


Joseph Chen 2002-09-05