Abstract
Machine translation has improved significantly through the use of deep learning. While the gains in translation quality are impressive, the encoder-decoder architecture also enables many further possibilities. In this paper, we explore one of these: the generation of constrained translations. We focus on length constraints, which are essential when the translation must be displayed in a given format. We propose an end-to-end approach for this task. In contrast to the traditional method, which first translates and then performs sentence compression, our approach learns text compression in a completely unsupervised manner. We address the challenge of data availability and investigate several methods for integrating the constraints into the model. By combining the idea with zero-shot multilingual machine translation, we are also able to perform unsupervised monolingual sentence compression. The proposed approach improves translation quality both for translation with length constraints and for monolingual length compression. These results are confirmed by a human evaluation.
Original language | English |
---|---|
Title of host publication | AMTA 2020 - 14th Conference of the Association for Machine Translation in the Americas, Proceedings |
Subtitle of host publication | (Volume 1: Research Track) |
Editors | Michael Denkowski, Christian Federmann |
Publisher | Association for Machine Translation in the Americas |
Pages | 21-35 |
Number of pages | 15 |
Volume | 1 |
Publication status | Published - 1 Jan 2020 |
Event | 14th Biennial Conference of the Association for Machine Translation in the Americas, Online, United States. Duration: 6 Oct 2020 → 9 Oct 2020. https://amtaweb.org/amta-2020-postponed-one-month-and-going-virtual/ https://www.aclweb.org/portal/content/amta-2020-virtual-conference |
Publication series
Series | AMTA 2020 - 14th Conference of the Association for Machine Translation in the Americas, Proceedings |
---|---|
Volume | 1 |
Conference
Conference | 14th Biennial Conference of the Association for Machine Translation in the Americas |
---|---|
Abbreviated title | AMTA 2020 |
Country/Territory | United States |
Period | 6 Oct 2020 → 9 Oct 2020 |