Companion page to "Predictive Incremental Parsing Helps Language Modeling"
This page contains reference material and will be expanded to allow replication of the findings reported in our
COLING 2016 paper. If you have questions about our work, contact
Arne Köhn or
Timo Baumann.
Software
The parser we used is described in
"Incremental Predictive Parsing with TurboParser";
the code is available on its companion page. We also used the same models for English.
For the N-gram modeling reported in Section 5.1, we used
SRILM. We happily provide you with the scripts used to create the sub-models.
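Until the scripts are online, the following is a minimal sketch of building and evaluating a single n-gram model with SRILM's standard ngram-count/ngram tools; the file names, model order, and smoothing options are placeholders and do not reflect the exact sub-model setup used in the paper:

    # train a Kneser-Ney smoothed 5-gram model (placeholder file names)
    ngram-count -order 5 -kndiscount -interpolate -text train.txt -lm model.arpa
    # compute perplexity on held-out text
    ngram -order 5 -lm model.arpa -ppl test.txt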
We adapted
faster-rnnlm to carry out the experiments in Sections 5.2 and 5.3. We happily provide you with the patch.
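For orientation, a typical invocation of the unpatched faster-rnnlm tool looks roughly like the following; the file names and hyperparameters are placeholders taken from the tool's documentation, not the settings used in the paper, so please check the README of your version:

    # train an RNN language model with noise-contrastive estimation (placeholder settings)
    ./rnnlm -rnnlm model -train train.txt -valid valid.txt -hidden 256 -nce 20 -alpha 0.01 -threads 8
    # evaluate the trained model on a test set
    ./rnnlm -rnnlm model -test test.txt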
Data
All experiments were carried out on the
billion word corpus.
We happily provide you with the parsed version of the corpus. (We'll have to check whether we can make it available online, given its size. Mailing you a flash drive will probably be the better choice.)