Dr. Timo Baumann, Diplom-Informatiker

Research Scientist at the Language Technologies Institute, Carnegie Mellon University. Former researcher and instructor in speech processing, in particular incremental processing for responsive spoken dialogue interaction at Universität Hamburg.
  • mail: tbaumann@cs.cmu.edu or baumann@informatik.uni-hamburg.de (for UHH-related issues)
  • skype: timobaumann
  • tel: +1 412 268 7755
  • office: GHC 5405, Carnegie Mellon University
  • consultation hours: please make an appointment by e-mail

Recent News

  • 2018-07: I've been awarded a Google research award for cloud credits! I, the intern working on the project, and my laptop's fan are really thrilled about this.
  • 2018-06: three papers accepted to Interspeech: syntax-prosody interface, attention in a HAN that classifies post-modern poetry, demo on DialogOS. See you in Hyderabad!
  • 2018-05: Scientific American asked me about the implications of Google Duplex, the system that supposedly can make restaurant reservations for you. Short version: why not build an engineering solution for scheduling instead of using AI?
  • 2018-05: New paper at IWSDS with Vivian Tsai, Florian Pecune and Justine Cassell, showing that faster responses are better responses. Great job, Vivian!
  • 2018-05: Paper on style detection for free verse poetry accepted for Coling 2018. See you in Santa Fe! Addendum: we also have a related paper at LaTeCH-CLfL right after Coling.
  • 2018-04: Popular Science interview about Amazon Alexa laughing out of the blue.
  • 2018-03: I'll be judging at the International Science and Engineering Fair and look forward to engaging and inspiring conversations with the young inventors who will shape our future! 2018-05: it was fun!
  • 2018-02: Two papers, on speech quality estimation and prosodic styles in post-modern poetry, have been accepted at Speech Prosody 2018. See you all in Poznań in June!
  • 2018-01: Our article on the Spoken Wikipedia Corpus collection is finally out (consider the pre-print if you don't like paywalls)!
  • older news

Research Interests

Why do we speak how we speak when we just speak to speak? And how can we model systems to do the same?

A (slightly longer) research statement.

Professional Activities

  • I maintain InproTK, the incremental dialogue processing toolkit, which is being used in multiple research labs around the world for building incremental spoken dialogue systems.
  • Together with David Schlangen, I gave an Interspeech tutorial on incremental processing in 2013, which was a great success. Slides can be found here. Exercises from a more recent introduction to InproTK are on Sourceforge.
  • I organized the Workshop on Architectures for Conversationally Competent Spoken Dialogue Systems 2012 in Hamburg, Germany. The purpose of the workshop was to bring together young researchers in the field to talk about upcoming challenges in developing highly interactive and natural dialogue systems/virtual agents/conversational systems.
  • I co-organized the Young Researchers' Roundtable on Spoken Dialogue Systems 2012 in Seoul, Korea.
  • I've been reviewing for LREC 2010-2016, IJCNLP 2011 and 2015, (E)ACL 2015-2017, EMNLP 2016, Coling 2016, AAAI 2012, SIGdial 2013-2016, SemDial 2014, BEA 2014-2016, ICMI 2013-2016, Interspeech 2015-2016, ICASSP 2017, HRI 2015-2016, AutomotiveUI 2014-2015, ACM TiiS, TALLIP, SoRo, KnoSys, CSL, LREV, and multiple smaller events.




  • Next winter term 2016
    • Logic Programming Exercises
    • Practical Course ("Praktikum") on Using Speech Processing Software (BSc)
  • Summer term 2016

Previous Teaching

Student Supervision

Currently, I am supervising Jula Menck's BSc thesis on speech characteristics in the Spoken Wikipedia Corpus, Oskar Dörffler's BSc thesis on prosody and syntax of read speech in the Spoken Wikipedia Corpus, and Tim Krämer's MSc thesis, which aims at bringing better Spoken Wikipedia browsing to the real Wikipedia.

I have previously co-supervised Alexandra Krah's BSc thesis on temporal elasticity of speech sounds in variable-rate speech, Alexander Grund's BSc thesis on incremental post-processing of Google's ASR results, Natalia Orlova's MSc thesis on the combination of multiple incremental speech recognizers, Marcel Rohde's BSc thesis on a Spoken Wikipedia Browser, Valentin Strauss' BSc thesis on incremental post-processing of Google's ASR results, Florian Stegen's BSc thesis on long audio alignment for the Spoken Wikipedia, Jonathan Werner's BSc thesis on keyword spotting in lecture transcriptions, Sven Zimmer's BSc thesis on build tools for scientific software development, Anne Rubruck's MSc thesis on decomposing semantic annotations into lexical semantics, Kolja Kirsch's BSc thesis on semi-automatic page-turning for piano sheet music, Jiyan Jonsdotter's BSc thesis on applying incremental spoken output to navigation systems, Engelke Eschner's diploma thesis on NLG for transit schedules, Sören Nykamp's BSc thesis on incremental processing in interactive storytelling, Johannes Twiefel's MSc thesis on improving Google's ASR using phonetic post-processing techniques given domain knowledge, Anita Eisenhaber's BSc thesis on sentiment analysis in social media statements (tweets, etc.), Svenja Neef's BSc thesis on analyzing the incremental properties of Android's ASR and integrating it with InproTK, Ole Eichhorn's BSc thesis on incremental speech synthesis integration into the VAVETaM system, and Rabih Hamadeh's MSc thesis on optimizing incremental ASR hypotheses.

I actively encourage students to go abroad! I successfully talked Maike Paetzel into visiting ICT as a summer intern in 2013 to work with David DeVault (which resulted in a publication at LREC 2014), and Arne Köhn is visiting ICT as a summer intern in 2014 to work with Kenji Sagae on incremental parsing. Sven Mutzl approached me regarding an internship opportunity in Shanghai which I helped to set up in collaboration with Kai Yu at Shanghai Technical University. Siva Meenakshi Renganathan from Anna University (Chennai) visited our lab as a summer intern in 2014 to work on exploiting Spoken Wikipedia data for speech research.

Under construction: Timo's assessment criteria for theses (and, to a lesser degree, seminar papers). See also: Timo's advice on how to write a good thesis.