
Session: InproTK

Participants: Okko, Nina, Kilian, Miroslav, Timo, some of the time: Jana, David; anybody else?

Notes taken by Timo

InproTK's data handling:
- information storage in the IU network: same-level-links (SLL), grounded-in-links (GRIN);
- distributed data storage, mostly normalized data (example: getStartTime())
- single-memory architecture
- bridging processors would be possible (though with limited access to SLL/GRIN information)
- =IU=s may be added, revoked, and finally committed;
- the IU network is not necessarily bound to 1-best-processing (but see below)
- unified handling of input data (that mostly relates to the past) and output data (that relates to the future or the past)
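The data-handling points above can be made concrete with a minimal sketch. It is written in Python for brevity; the class, its fields and its methods are hypothetical and are not InproTK's actual API. It assumes that timing is stored only on the lowest level and derived via grounded-in links elsewhere (the "normalized data" idea), and that a committed unit can no longer be revoked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class IU:
    """Hypothetical incremental unit (illustration only, not InproTK's class)."""
    payload: str
    start: Optional[float] = None          # only set on the lowest level
    end: Optional[float] = None
    same_level_link: Optional["IU"] = None  # SLL: predecessor on the same level
    grounded_in: List["IU"] = field(default_factory=list)  # GRIN: units this one builds on
    committed: bool = False
    revoked: bool = False

    def start_time(self) -> float:
        # Normalized storage: higher-level units do not copy timing,
        # they derive it from the units they are grounded in.
        if self.start is not None:
            return self.start
        return min(iu.start_time() for iu in self.grounded_in)

    def end_time(self) -> float:
        if self.end is not None:
            return self.end
        return max(iu.end_time() for iu in self.grounded_in)

    def commit(self) -> None:
        if self.revoked:
            raise ValueError("a revoked IU cannot be committed")
        self.committed = True

    def revoke(self) -> None:
        # Assumption: commitment is final.
        if self.committed:
            raise ValueError("a committed IU cannot be revoked")
        self.revoked = True


# Two segment-level units and one word-level unit grounded in them.
seg1 = IU("h", start=0.0, end=0.1)
seg2 = IU("i", start=0.1, end=0.3, same_level_link=seg1)
word = IU("hi", grounded_in=[seg1, seg2])
print(word.start_time(), word.end_time())
```

The word unit stores no timing of its own, so a later change to a segment's timing is visible on the word level without any copying.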
InproTK's processing methods:
- conceptually: IU processors have a left buffer and a right buffer: a processor receives update messages about changes to its left buffer, then notifies the next processor (via its right buffer) about its own changes
- in fact this is currently limited to pure left-to-right processing; top-down/right-to-left processing is not yet implemented
- (some ad-hoc top-down stuff using =Signal=s)
- processing modules currently do not support n-best-processing (at least not for input processing)
- threading issues: InproTK has improved a lot but you may still face threading issues; it's possible to have all modules in separate threads (though this is currently not done/not needed for most processors)
- potential processing with "active" =IU=s (example: a =WordIU= may actively determine prosodic stress by querying syllables→phonemes→pitch-track)
- =UpdateListener=s may register with =IU=s to be notified about certain changes that happen to an =IU= (e.g. gradual change from UPCOMING via ONGOING to COMPLETED in synthesis)
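The left-buffer/right-buffer idea can be sketched as follows. This is a simplified illustration in Python, not InproTK code: the =Processor= class, its method names and the edit format (a pair of edit type and payload) are assumptions. Only "add" and "revoke" edits are modelled, processing is purely left-to-right, and a revoke is assumed to always address the most recent unit.

```python
from typing import Callable, List, Tuple

Edit = Tuple[str, str]  # (edit type, payload); edit type is "add" or "revoke"


class Processor:
    """Hypothetical IU processor with a left and a right buffer."""

    def __init__(self, transform: Callable[[str], str]):
        self.transform = transform
        self.left_buffer: List[str] = []    # what the previous processor produced
        self.right_buffer: List[str] = []   # what this processor produced
        self.next_processors: List["Processor"] = []

    def connect(self, nxt: "Processor") -> "Processor":
        self.next_processors.append(nxt)
        return nxt

    def left_buffer_update(self, edits: List[Edit]) -> None:
        own_edits: List[Edit] = []
        for kind, payload in edits:
            if kind == "add":
                self.left_buffer.append(payload)
                result = self.transform(payload)
                self.right_buffer.append(result)
                own_edits.append(("add", result))
            elif kind == "revoke":
                if not self.left_buffer or self.left_buffer[-1] != payload:
                    raise ValueError("can only revoke the most recent unit")
                self.left_buffer.pop()
                # Input was withdrawn, so the output built on it is withdrawn too.
                own_edits.append(("revoke", self.right_buffer.pop()))
            else:
                raise ValueError("unknown edit type: " + kind)
        # Notify the next processors about this processor's own changes.
        for nxt in self.next_processors:
            nxt.left_buffer_update(own_edits)


upper = Processor(str.upper)
tagger = upper.connect(Processor(lambda w: w + "/TOK"))
upper.left_buffer_update([("add", "take"), ("add", "the")])
upper.left_buffer_update([("revoke", "the"), ("add", "that")])
print(tagger.right_buffer)
```

The revoke of "the" travels down the chain, so the second processor withdraws its own output for that word before it processes "that".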
InproTK's incremental speech recognition module:
- based on Sphinx, 1-best, no confidence measure but stability measure, highly effective filters for reducing incremental jitter of hypotheses
- the final result is always as good as the non-incremental result
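To illustrate what "incremental jitter" means and how a filter can reduce it, here is a small Python sketch. It is one conceivable filter, not necessarily what InproTK implements: =diff_edits= and =HoldBackFilter= are hypothetical names, and the filter simply passes on only those words that stayed unchanged over the last =n= hypotheses. The =finish= method emits the final hypothesis unfiltered, so the end result equals the recognizer's final result.

```python
from typing import List, Tuple

Edit = Tuple[str, str]  # ("add" | "revoke", word)


def diff_edits(old: List[str], new: List[str]) -> List[Edit]:
    """Edits that turn `old` into `new`: revoke from the end, then add."""
    common = 0
    while common < min(len(old), len(new)) and old[common] == new[common]:
        common += 1
    edits: List[Edit] = [("revoke", w) for w in reversed(old[common:])]
    edits += [("add", w) for w in new[common:]]
    return edits


class HoldBackFilter:
    """Passes on only the prefix shared by the last `n` 1-best hypotheses."""

    def __init__(self, n: int):
        self.n = n
        self.history: List[List[str]] = []
        self.output: List[str] = []

    def update(self, hypothesis: List[str]) -> List[Edit]:
        self.history = (self.history + [list(hypothesis)])[-self.n:]
        if len(self.history) < self.n:
            return []
        stable: List[str] = []
        for words in zip(*self.history):
            if all(w == words[0] for w in words):
                stable.append(words[0])
            else:
                break
        edits = diff_edits(self.output, stable)
        self.output = stable
        return edits

    def finish(self, hypothesis: List[str]) -> List[Edit]:
        """At the end of the utterance, emit the final hypothesis unfiltered."""
        edits = diff_edits(self.output, list(hypothesis))
        self.output = list(hypothesis)
        return edits


hypotheses = [["take"], ["take", "the"], ["take", "that"], ["take", "that", "red"]]
f = HoldBackFilter(n=2)
filtered = [f.update(h) for h in hypotheses]
final = f.finish(hypotheses[-1])
print(filtered, final)
```

Without the filter, the step from "take the" to "take that" would produce a revoke followed by an add; with the filter, "the" is never passed on in the first place, at the cost of some delay.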
InproTK's incremental speech synthesis module:
- works in real-time
- enables previously unseen system behaviours
taking decisions:
- is a serious problem (better wait or better decide?)
- Okko's work on dealing with revokes in dialogue management (public information can't be revoked but has to be undone instead)
- alternative: top-down commit (which InproTK's ASR doesn't yet support) to avoid revokes.
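The difference between revoking and undoing can be sketched in a few lines of Python. This is only an illustration of the idea noted above, not Okko's implementation; the =Manager= class and its methods are hypothetical. A decision that has not yet been made public is dropped silently, whereas an action that has already been performed needs an explicit, observable undo.

```python
from typing import List


class Manager:
    """Hypothetical dialogue manager distinguishing private and public decisions."""

    def __init__(self) -> None:
        self.pending: List[str] = []  # decided, but not yet performed (private)
        self.public: List[str] = []   # already performed (public)
        self.log: List[str] = []      # everything an observer could notice

    def decide(self, action: str) -> None:
        self.pending.append(action)

    def perform_pending(self) -> None:
        for action in self.pending:
            self.log.append("do " + action)
            self.public.append(action)
        self.pending = []

    def revoke(self, action: str) -> None:
        if action in self.pending:
            # Nobody has seen it yet: simply drop it.
            self.pending.remove(action)
        elif action in self.public:
            # Already public: it cannot be taken back, only undone.
            self.public.remove(action)
            self.log.append("undo " + action)
        else:
            raise ValueError("unknown action: " + action)


m = Manager()
m.decide("select red piece")
m.revoke("select red piece")      # private: leaves no trace
m.decide("select blue piece")
m.perform_pending()
m.revoke("select blue piece")     # public: requires an undo
print(m.log)
```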
how to build your incremental module:
- Okko proposes to sit down and to think about the possible edits that may happen;
- do not forget revoke messages while thinking about edits (and their consequences)
- if your code is a "module", then integration will be easy; if you have a partial implementation of a "system", then integrating with InproTK is harder (especially input, easier for incremental output)
- think about your data types (you will likely want to sub-class IU for your data)
discussion of Nina's use-case (integrate an incremental NLG component):
- InproTK works well for English and German
- any task needs its own statistical language model; build one and use the -lm switch
- a relatively dumb incremental DM (e.g. keyword/keyconcept spotting) is easy to achieve
- integrating incremental NLG has been achieved by Hendrik & Timo (SigDial 2012)
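A keyword/keyconcept spotter of the kind mentioned above might look roughly like this Python sketch. The class name, the keyword table and the edit format are made up for illustration and are not part of InproTK. The point is that the spotter must handle revoke edits as well as add edits, so that a concept disappears again when the word that triggered it is withdrawn.

```python
from typing import Dict, List, Tuple

Edit = Tuple[str, str]  # ("add" | "revoke", word)


class KeywordSpotter:
    """Hypothetical incremental keyword/keyconcept spotter."""

    def __init__(self, keywords: Dict[str, str]):
        self.keywords = keywords
        self.words: List[str] = []

    def update(self, edits: List[Edit]) -> List[str]:
        for kind, word in edits:
            if kind == "add":
                self.words.append(word)
            elif kind == "revoke":
                # Assumption: revokes always address the most recent word.
                if not self.words or self.words[-1] != word:
                    raise ValueError("can only revoke the most recent word")
                self.words.pop()
            else:
                raise ValueError("unknown edit type: " + kind)
        return self.concepts()

    def concepts(self) -> List[str]:
        # Concepts are recomputed from the current words, so revoked
        # words cannot leave stale concepts behind.
        return [self.keywords[w] for w in self.words if w in self.keywords]


spotter = KeywordSpotter({"left": "DIRECTION_LEFT", "right": "DIRECTION_RIGHT", "stop": "STOP"})
first = spotter.update([("add", "go"), ("add", "left")])
second = spotter.update([("revoke", "left"), ("add", "right")])
print(first, second)
```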
looked at many demos
Open-source available at http://inprotk.sourceforge.net, more information on the project at http://inpro.tk.
alternatives to InproTK:
- Jindigo
- IPAACA/IPAACA2

-- TimoBaumann -- 06 Oct 2012
