Maastricht University’s Multilingual Speech Translation System for IWSLT 2021

Danni Liu*, Jan Niehues

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference article in proceeding > Academic > peer-review


This paper describes Maastricht University’s participation in the IWSLT 2021 multilingual speech translation track. The task in this track is to build multilingual speech translation systems for both supervised and zero-shot directions. Our primary system is an end-to-end model that performs both speech transcription and translation. We observe that jointly training the two tasks is complementary, especially when speech translation data is scarce. We use data augmentation on the source side and pseudo-labels on the target side to improve the performance of our systems. We also introduce an ensembling technique that consistently improves the quality of transcriptions and translations. The experiments show that the end-to-end system is competitive with its cascaded counterpart, especially in zero-shot conditions.
Original language: English
Title of host publication: Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Editors: Marcello Federico, Alex Waibel, Marta R. Costa-jussa, Jan Niehues, Sebastian Stuker, Elizabeth Salesky
Publisher: Association for Computational Linguistics
Number of pages: 6
Edition: August 2021
ISBN (Print): 9781954085749
Publication status: Published - 2021
Event: 18th International Conference on Spoken Language Translation - Online, Bangkok, Thailand
Duration: 5 Aug 2021 to 6 Aug 2021


Conference: 18th International Conference on Spoken Language Translation
Abbreviated title: IWSLT 2021