Machine Translation with Unsupervised Length-Constraints

Jan Niehues

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceeding; Academic; peer-review

Abstract

We have seen significant improvements in machine translation due to the usage of deep learning. While the improvements in translation quality are impressive, the encoder-decoder architecture enables many more possibilities. In this paper, we explore one of these, the generation of constrained translation. We focus on length constraints, which are essential if the translation should be displayed in a given format. In this work, we propose an end-to-end approach for this task. Compared to a traditional method that first translates and then performs sentence compression, the text compression is learned completely unsupervised. We address the challenge of data availability as well as investigate several methods to integrate the constraints into the model. By combining the idea with zero-shot multilingual machine translation, we are also able to perform unsupervised monolingual sentence compression. Using the proposed approach, we are able to improve the translation quality for translation with length constraints as well as for monolingual length compression. In addition, the results are confirmed by a human evaluation.
Original language: English
Title of host publication: AMTA 2020 - 14th Conference of the Association for Machine Translation in the Americas, Proceedings
Subtitle of host publication: (Volume 1: Research Track)
Editors: Michael Denkowski, Christian Federmann
Publisher: Association for Machine Translation in the Americas
Pages: 21-35
Number of pages: 15
Volume: 1
Publication status: Published - 1 Jan 2020
Event: 14th Biennial Conference of the Association for Machine Translation in the Americas - Online, United States
Duration: 6 Oct 2020 - 9 Oct 2020
https://amtaweb.org/amta-2020-postponed-one-month-and-going-virtual/
https://www.aclweb.org/portal/content/amta-2020-virtual-conference

Publication series

Series: AMTA 2020 - 14th Conference of the Association for Machine Translation in the Americas, Proceedings
Volume: 1

Conference

Conference: 14th Biennial Conference of the Association for Machine Translation in the Americas
Abbreviated title: AMTA 2020
Country/Territory: United States
Period: 6/10/20 - 9/10/20