Tackling Data Scarcity In Speech Translation Using Zero-Shot Multilingual Machine Translation Techniques

Tu Anh Dinh*, Danni Liu, Jan Niehues

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Recently, end-to-end speech translation (ST) has gained significant attention as it avoids error propagation. However, the approach suffers from data scarcity. It depends heavily on direct ST data and makes less efficient use of speech transcription and text translation data, which are often more readily available. In the related field of multilingual text translation, several techniques have been proposed for zero-shot translation. A central idea is to increase the similarity of semantically similar sentences in different languages. We investigate whether these ideas can be applied to speech translation, by building ST models trained on speech transcription and text translation data. We investigate the effects of data augmentation and an auxiliary loss function. The techniques were successfully applied to few-shot ST using limited ST data, with improvements of up to +12.9 BLEU points compared to direct end-to-end ST and +3.1 BLEU points compared to ST models fine-tuned from an ASR model.
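
The idea of an auxiliary loss that pulls semantically similar inputs closer together can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the loss, so the sketch assumes mean-pooled encoder states and a mean squared distance between the pooled speech and text representations. All names (`mean_pool`, `similarity_loss`, `total_loss`, `aux_weight`) are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(states: Sequence[Sequence[float]]) -> Vector:
    """Average a sequence of encoder state vectors over the time axis."""
    if not states:
        raise ValueError("need at least one encoder state")
    dim = len(states[0])
    return [sum(s[i] for s in states) / len(states) for i in range(dim)]


def similarity_loss(
    speech_states: Sequence[Sequence[float]],
    text_states: Sequence[Sequence[float]],
) -> float:
    """Mean squared distance between pooled speech and text representations.

    Assumption: both encoders produce vectors of the same dimension.
    """
    a = mean_pool(speech_states)
    b = mean_pool(text_states)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(task_loss: float, aux_loss: float, aux_weight: float = 1.0) -> float:
    """Combine the main training loss with the weighted auxiliary loss."""
    return task_loss + aux_weight * aux_loss
```

In this sketch, an utterance and its transcript that map to the same pooled vector contribute zero auxiliary loss, and `aux_weight` controls how strongly the similarity term influences training relative to the main objective.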
Original language: English
Title of host publication: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Pages: 6222-6226
Number of pages: 5
ISBN (Print): 9781665405409
Publication status: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech and Signal Processing - Online, Singapore, Singapore
Duration: 22 May 2022 → 27 May 2022
Conference number: 47
https://2022.ieeeicassp.org/

Publication series

Series: International Conference on Acoustics Speech and Signal Processing Proceedings
ISSN: 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2022
Country/Territory: Singapore
City: Singapore
Period: 22/05/22 → 27/05/22

Keywords

  • speech translation
  • zero-shot
  • few-shot
  • machine translation
  • multi-task
