Session: InproTK
Participants: Okko, Nina, Kilian, Miroslav, Timo; part of the time: Jana, David; anybody else?
Notes taken by: Timo
- InproTK's data handling:
- information storage in the IU network: same-level-links (SLL), grounded-in-links (GRIN);
- distributed data storage, mostly normalized data (example: =getStartTime()=)
- single-memory architecture
- bridging processors would be possible (though they would limit access to SLL/GRIN information)
- =IU=s may be added, revoked, and finally committed;
- the IU network is not necessarily bound to 1-best-processing (but see below)
- unified handling of input data (that mostly relates to the past) and output data (that relates to the future or the past)
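The data-handling points above can be illustrated with a small sketch. It is written in Python for brevity and is not InproTK's Java API: the class names =IU= and =SegmentIU=, the fields =same_level_link= and =grounded_in=, and the methods =start_time()= / =end_time()= are all made up for this example. The sketch assumes that "normalized" means timing is stored only once, at the lowest level, and is derived through grounded-in links everywhere else.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class IU:
    """Hypothetical incremental unit (illustration only, not InproTK's class)."""
    payload: str
    same_level_link: Optional[IU] = None                  # SLL: predecessor on the same level
    grounded_in: List[IU] = field(default_factory=list)   # GRIN: lower-level IUs this one rests on

    def start_time(self) -> float:
        # Normalized storage: no time is kept here, it is looked up via GRIN.
        return self.grounded_in[0].start_time()

    def end_time(self) -> float:
        return self.grounded_in[-1].end_time()


@dataclass(eq=False)
class SegmentIU(IU):
    """Lowest level in this sketch: the only place where times are stored."""
    start: float = 0.0
    end: float = 0.0

    def start_time(self) -> float:
        return self.start

    def end_time(self) -> float:
        return self.end


# Two segments linked on the same level, and one word grounded in both.
s1 = SegmentIU("h", start=0.00, end=0.08)
s2 = SegmentIU("i", same_level_link=s1, start=0.08, end=0.20)
word = IU("hi", grounded_in=[s1, s2])
print(word.start_time(), word.end_time())  # 0.0 0.2

# If a segment boundary is corrected later, the word's timing follows
# without any copy having to be updated.
s1.start = 0.02
print(word.start_time())  # 0.02
```

The design choice shown here is the trade-off behind normalized data: lookups walk the network each time, but there is no second copy of a timestamp that could go stale when a lower-level hypothesis changes.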
- InproTK's processing methods:
- conceptually: IU processors with a left buffer and a right buffer: a processor receives update messages about changes to its left buffer, then notifies the next processor (via its right buffer) about its own changes
- in fact this is currently limited to pure left-to-right processing; top-down/right-to-left processing is not yet implemented
- (some ad-hoc top-down stuff using =Signal=s)
- processing modules currently do not support n-best-processing (at least not for input processing)
- threading issues: InproTK has improved a lot but you may still face threading issues; it's possible to have all modules in separate threads (though this is currently not done/not needed for most processors)
- potential processing with "active" =IU=s (example: a =WordIU= may actively determine prosodic stress by querying syllables→phonemes→pitch-track)
- =UpdateListener=s may register with =IU=s to be notified about certain changes that happen to an =IU= (e.g. gradual change from =UPCOMING= via =ONGOING= to =COMPLETED= in synthesis)
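The left-buffer/right-buffer idea and the listener mechanism from the list above can be sketched as follows. This is a Python illustration under stated assumptions, not InproTK code: =Processor=, =Uppercaser=, =left_buffer_update()=, =add_update_listener()= and =set_progress()= are hypothetical names, and edits are modelled as plain (type, IU) pairs. Only left-to-right flow is shown, matching the current limitation noted above.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class EditType(Enum):
    ADD = "add"
    REVOKE = "revoke"
    COMMIT = "commit"


class Progress(Enum):
    UPCOMING = 1
    ONGOING = 2
    COMPLETED = 3


class IU:
    """Hypothetical IU that notifies registered listeners about progress changes."""

    def __init__(self, payload: str) -> None:
        self.payload = payload
        self.progress = Progress.UPCOMING
        self._listeners: List[Callable[["IU"], None]] = []

    def add_update_listener(self, listener: Callable[["IU"], None]) -> None:
        self._listeners.append(listener)

    def set_progress(self, progress: Progress) -> None:
        self.progress = progress
        for listener in self._listeners:
            listener(self)


Edit = Tuple[EditType, IU]


class Processor:
    """Consumes edits to its left buffer, applies its own edits to its
    right buffer, and passes those edits on to the next processor."""

    def __init__(self) -> None:
        self.right_buffer: List[IU] = []
        self.next: Optional["Processor"] = None

    def process(self, edits: List[Edit]) -> List[Edit]:
        return edits  # default: pass everything through unchanged

    def left_buffer_update(self, edits: List[Edit]) -> None:
        out_edits = self.process(edits)
        for kind, iu in out_edits:
            if kind is EditType.ADD:
                self.right_buffer.append(iu)
            elif kind is EditType.REVOKE:
                self.right_buffer.remove(iu)
            # COMMIT leaves buffer membership unchanged
        if self.next is not None and out_edits:
            self.next.left_buffer_update(out_edits)


class Uppercaser(Processor):
    """Toy processor: one output IU per input IU, remembered for later revokes."""

    def __init__(self) -> None:
        super().__init__()
        self._produced: Dict[IU, IU] = {}

    def process(self, edits: List[Edit]) -> List[Edit]:
        out: List[Edit] = []
        for kind, iu in edits:
            if kind is EditType.ADD:
                self._produced[iu] = IU(iu.payload.upper())
                out.append((EditType.ADD, self._produced[iu]))
            elif kind is EditType.REVOKE:
                out.append((EditType.REVOKE, self._produced.pop(iu)))
            else:
                out.append((EditType.COMMIT, self._produced[iu]))
        return out


source, sink = Uppercaser(), Processor()
source.next = sink
a, b = IU("four"), IU("forty")
source.left_buffer_update([(EditType.ADD, a)])
source.left_buffer_update([(EditType.REVOKE, a), (EditType.ADD, b)])
print([iu.payload for iu in sink.right_buffer])  # ['FORTY']

seen: List[str] = []
out_iu = sink.right_buffer[0]
out_iu.add_update_listener(lambda iu: seen.append(iu.progress.name))
out_iu.set_progress(Progress.ONGOING)
out_iu.set_progress(Progress.COMPLETED)
print(seen)  # ['ONGOING', 'COMPLETED']
```

Note that the toy processor has to remember which output IU it produced for which input IU; without that bookkeeping it could not turn an incoming revoke into the matching outgoing revoke.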
- InproTK's incremental speech recognition module
- based on Sphinx, 1-best, no confidence measure but stability measure, highly effective filters for reducing incremental jitter of hypotheses
- final result is always as good as non-incremental result
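To give an idea of what a jitter-reducing filter can look like, here is a minimal Python sketch of one possible approach: a word is only passed on once it has survived a number of consecutive 1-best hypotheses. This is an assumption made for illustration and not necessarily how InproTK's filters work; =SmoothingFilter=, =min_age= and =update()= are hypothetical names.

```python
from typing import List


class SmoothingFilter:
    """Passes a word on only after it has been part of an unchanged
    hypothesis prefix for `min_age` consecutive updates."""

    def __init__(self, min_age: int) -> None:
        self.min_age = min_age
        self._prev: List[str] = []
        self._ages: List[int] = []

    def update(self, hypothesis: List[str]) -> List[str]:
        ages: List[int] = []
        prefix_intact = True
        for i, word in enumerate(hypothesis):
            # A word keeps ageing only if it and all words before it are unchanged.
            prefix_intact = (
                prefix_intact and i < len(self._prev) and self._prev[i] == word
            )
            ages.append(self._ages[i] + 1 if prefix_intact else 1)
        self._prev, self._ages = list(hypothesis), ages

        # Output the longest prefix in which every word is old enough.
        stable: List[str] = []
        for word, age in zip(hypothesis, ages):
            if age < self.min_age:
                break
            stable.append(word)
        return stable


f = SmoothingFilter(min_age=2)
outputs = [
    f.update(h)
    for h in (["four"], ["forty"], ["forty", "two"], ["forty", "two"])
]
print(outputs)  # [[], [], ['forty'], ['forty', 'two']]
```

In this run the short-lived hypothesis "four" never reaches the consumer, at the price of one update of extra delay per word. A filter of this kind reduces revokes but cannot rule them out, and the full final hypothesis would still have to be passed through at the end of the utterance to keep the final result unchanged.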
- InproTK's incremental speech synthesis module
- works in real-time
- enables previously unseen system behaviours
- taking decisions
- is a serious problem (better wait or better decide?)
- Okko's work on dealing with revokes in dialogue management (public information can't be revoked but has to be undone instead)
- alternative: top-down commit (which InproTK's ASR doesn't yet support) to avoid revokes.
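The distinction between revoking and undoing can be made concrete with a small Python sketch. It is only an illustration of the parenthetical remark above, not Okko's actual approach; the class =DialogueManager= and its methods =on_add()=, =perform()= and =on_revoke()= are hypothetical, and actions are plain strings.

```python
from typing import Dict, List


class DialogueManager:
    """Toy DM that distinguishes planned actions from public ones."""

    def __init__(self) -> None:
        self.pending: Dict[str, str] = {}  # IU id -> planned action, not yet public
        self.public: Dict[str, str] = {}   # IU id -> action already performed
        self.log: List[str] = []           # everything the outside world has seen

    def on_add(self, iu_id: str, action: str) -> None:
        self.pending[iu_id] = action

    def perform(self, iu_id: str) -> None:
        action = self.pending.pop(iu_id)
        self.public[iu_id] = action
        self.log.append(action)

    def on_revoke(self, iu_id: str) -> None:
        if iu_id in self.pending:
            # Never became public: the plan can simply be dropped.
            del self.pending[iu_id]
        elif iu_id in self.public:
            # Already public: it cannot be taken back, only repaired.
            action = self.public.pop(iu_id)
            self.log.append(f"undo({action})")


dm = DialogueManager()
dm.on_add("w1", "move_left")
dm.on_revoke("w1")            # revoked before anything was visible
dm.on_add("w2", "move_right")
dm.perform("w2")
dm.on_revoke("w2")            # revoked after the action went public
print(dm.log)  # ['move_right', 'undo(move_right)']
```

The wait-or-decide question shows up here as the timing of =perform()=: acting early makes the system responsive but risks a visible undo, while waiting for a commit avoids the undo at the cost of latency.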
- how to build your incremental module
- Okko proposes to sit down and to think about the possible edits that may happen;
- do not forget revoke messages while thinking about edits (and their consequences)
- if your code is a "module", then integration will be easy; if you have a partial implementation of a "system", then integrating with InproTK is harder (especially input, easier for incremental output)
- think about your data types (you will likely want to sub-class =IU= for your data)
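As a rough illustration of this advice, the following Python sketch enumerates the three edit types explicitly and uses a custom IU sub-class for its output. It is not InproTK code: =ConceptIU=, =ConceptSpotter=, =handle()= and the lexicon entries are hypothetical, chosen only to show where add, revoke and commit each need an answer.

```python
from enum import Enum
from typing import Dict, List, Optional


class EditType(Enum):
    ADD = "add"
    REVOKE = "revoke"
    COMMIT = "commit"


class IU:
    """Hypothetical base class for incremental units."""

    def __init__(self, payload: str, grounded_in: Optional[List["IU"]] = None) -> None:
        self.payload = payload
        self.grounded_in = grounded_in or []
        self.committed = False


class ConceptIU(IU):
    """Custom data type: a concept grounded in the word IU that triggered it."""

    def __init__(self, concept: str, word: IU) -> None:
        super().__init__(concept, grounded_in=[word])
        self.concept = concept


class ConceptSpotter:
    """Toy module: maps word IUs to concept IUs via a fixed lexicon."""

    def __init__(self, lexicon: Dict[str, str]) -> None:
        self.lexicon = lexicon
        self.concepts: Dict[IU, ConceptIU] = {}

    def handle(self, kind: EditType, word: IU) -> None:
        if kind is EditType.ADD:
            concept = self.lexicon.get(word.payload)
            if concept is not None:
                self.concepts[word] = ConceptIU(concept, word)
        elif kind is EditType.REVOKE:
            # The word is gone, so whatever was grounded in it must go too.
            self.concepts.pop(word, None)
        elif kind is EditType.COMMIT:
            if word in self.concepts:
                self.concepts[word].committed = True

    def current(self) -> List[str]:
        return [c.concept for c in self.concepts.values()]


spotter = ConceptSpotter({"left": "DIRECTION_LEFT", "right": "DIRECTION_RIGHT"})
w1, w2 = IU("left"), IU("right")
spotter.handle(EditType.ADD, w1)
spotter.handle(EditType.REVOKE, w1)   # easy to forget, see the point above
spotter.handle(EditType.ADD, w2)
spotter.handle(EditType.COMMIT, w2)
print(spotter.current())  # ['DIRECTION_RIGHT']
```

Without the revoke branch the spotter would report both directions, which is the kind of consequence worth thinking through before writing the module.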
- discussion of Nina's use-case (integrate an incremental NLG component):
- InproTK works well for English and German
- any task needs its own statistical language model; build one and pass it with the =-lm= switch
- a relatively dumb incremental DM (e.g. keyword/keyconcept spotting) is easy to achieve
- integrating incremental NLG has been achieved by Hendrik & Timo (SigDial 2012)
- looked at many demos
- open-source release available at http://inprotk.sourceforge.net; more information on the project at http://inpro.tk
- alternatives to InproTK:
-- TimoBaumann -- 06 Oct 2012