Abstract
In this paper, we investigate the role of attention heads in context-aware machine translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. We analyze their influence by both observing and modifying the attention scores corresponding to the plausible relations that could impact a pronoun prediction. Our findings reveal that while some heads do attend to the relations of interest, not all of them influence the models' ability to disambiguate pronouns. We show that certain heads are underutilized by the models, suggesting that performance could be improved if these heads attended to one of the relations more strongly. Furthermore, we fine-tune the most promising heads and observe increases in pronoun disambiguation accuracy of up to 5 percentage points, which demonstrates that the improvements in performance can be solidified into the models' parameters.
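The kind of intervention the abstract describes, modifying the attention score that links a pronoun to a candidate relation in a single head, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the additive logit boost, and the toy tensor shapes are all assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def amplify_head_relation(attn_logits, head, query_pos, key_pos, boost=2.0):
    """Strengthen one head's attention from a pronoun (query position) to a
    candidate antecedent (key position), then renormalize via softmax.

    attn_logits: array of shape (num_heads, seq_len, seq_len),
                 raw attention scores before the softmax.
    """
    logits = attn_logits.copy()
    logits[head, query_pos, key_pos] += boost  # additive boost in logit space
    return softmax(logits, axis=-1)

# Toy example: 2 heads, 4 tokens; boost head 0's attention from token 3 to token 1.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 4, 4))
base = softmax(logits, axis=-1)
modified = amplify_head_relation(logits, head=0, query_pos=3, key_pos=1)
```

Because the boost is applied before the softmax, each query row of the modified head still sums to one, and all other heads are left untouched.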
Original language | English |
---|---|
Title of host publication | COLING 2025 - 31st International Conference on Computational Linguistics, Proceedings of the Main Conference |
Editors | Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 6348-6377 |
Number of pages | 30 |
Volume | Part F206484-1 |
ISBN (Electronic) | 9798891761964 |
Publication status | Published - 2025 |
Event | 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates. Duration: 19 Jan 2025 → 24 Jan 2025. https://coling2025.org/ |
Publication series
Series | Proceedings - International Conference on Computational Linguistics, COLING |
---|---|
Volume | Part F206484-1 |
ISSN | 2951-2093 |
Conference
Conference | 31st International Conference on Computational Linguistics, COLING 2025 |
---|---|
Abbreviated title | COLING 2025 |
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 19/01/25 → 24/01/25 |
Internet address | https://coling2025.org/ |