Automatically Subtitling the C3: How speech processing helps the CCC subtitle project, and vice-versa.

Presented at 31C3 (2014), Dec. 29, 2014, 10 p.m. (30 minutes)

Transcribing a talk comes relatively easily to fast typists, whereas turning a transcript into time-aligned subtitles for a video requires a much larger human effort. In contrast, speech recognition performance (especially for open-source-based solutions) is still poor on open-domain topics, but speech technology is able to align a given text to the corresponding speech with high accuracy. Let's join forces to generate superior subtitles with little effort, and to improve future open-source-based speech recognizers at the same time!

We present the ongoing work of a student project in informatics at Universität Hamburg in which we combine the strengths of human transcription and automatic alignment of these transcriptions to produce high-quality video subtitles. We believe that our work can help the C3 community generate video subtitles with less manual effort, and we hope to provide subtitles for all 31C3 talks (as long as you provide the transcriptions).

However, we're not just a service provider to the C3. There is a shortage of training material for free and open-source speech recognizers and the acoustic models they employ. Thus, we plan to prepare an aligned audio corpus of C3 talks which will help to advance open-source speech recognition. Be a part of this by helping us with your transcriptions: we'll repay you with subtitles and better open-source speech recognition in the future!
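
To illustrate the last step of such a pipeline, here is a minimal sketch of how word-level alignments might be grouped into subtitle cues and written out in SubRip (SRT) form. It is not the project's actual code: the input format (a list of word, start, end tuples), the function names, and the `max_chars` and `max_gap` thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed (hypothetical) aligner output: (word, start_seconds, end_seconds).
Word = Tuple[str, float, float]


@dataclass
class Cue:
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def make_cue(words: List[Word]) -> Cue:
    """Build one cue spanning the first word's start to the last word's end."""
    return Cue(words[0][1], words[-1][2], " ".join(w[0] for w in words))


def group_words(words: List[Word], max_chars: int = 42,
                max_gap: float = 1.0) -> List[Cue]:
    """Greedily pack aligned words into cues.

    A new cue starts when adding the next word would exceed max_chars,
    or when the pause before it is longer than max_gap seconds.
    Both thresholds are illustrative defaults, not values from the talk.
    """
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        if current:
            length = len(" ".join(w[0] for w in current)) + 1 + len(word[0])
            gap = word[1] - current[-1][2]
            if length > max_chars or gap > max_gap:
                cues.append(make_cue(current))
                current = []
        current.append(word)
    if current:
        cues.append(make_cue(current))
    return cues


def to_srt(cues: List[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        times = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{times}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    aligned = [("Let's", 0.0, 0.4), ("join", 0.45, 0.7),
               ("forces", 0.75, 1.2), ("now", 3.0, 3.3)]
    print(to_srt(group_words(aligned)))
```

In this sketch the long pause before the last word forces a second cue, so the sample input yields two subtitle blocks. A real system would likely also consider sentence boundaries and reading speed when splitting.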


  • Arne Köhn
    Arne is a research and teaching assistant at Universität Hamburg. He is doing his PhD on incremental syntax parsing.
  • timobaumann
    Timo is a computer scientist and researcher working on spoken dialogue and responsive interaction. Timo Baumann is a research associate and instructor in the natural language systems division at Universität Hamburg, Germany, with Wolfgang Menzel, focusing on incremental spoken language processing and especially the prosodic and multimodal aspects of spoken language. Timo studied computer science and phonetics in Hamburg, Geneva and Granada, and received his master's-level diploma in 2007 for work at IBM Research before heading to Potsdam University to work on incremental and projective processing in dialogue systems with David Schlangen. This work resulted in his PhD from Bielefeld University in 2013. Timo takes his bike to work no matter the weather and produces his own electricity as a shareholder in a photovoltaics cooperative.

