Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Research output: Chapter in Book/Report/Conference proceeding, Conference article in proceeding, Academic, peer-review

Abstract

In this paper, we investigate the role of attention heads in Context-aware Machine Translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. We analyze their influence by both observing and modifying the attention scores corresponding to the plausible relations that could impact a pronoun prediction. Our findings reveal that while some heads do attend to the relations of interest, not all of them influence the models' ability to disambiguate pronouns. We show that certain heads are underutilized by the models, suggesting that model performance could be improved if those heads attended to one of the relations more strongly. Furthermore, we fine-tune the most promising heads and observe an increase in pronoun disambiguation accuracy of up to 5 percentage points, which demonstrates that the improvements in performance can be solidified into the models' parameters.
Original language: English
Title of host publication: COLING 2025 - 31st International Conference on Computational Linguistics, Proceedings of the Main Conference
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics (ACL)
Pages: 6348-6377
Number of pages: 30
Volume: Part F206484-1
ISBN (Electronic): 9798891761964
Publication status: Published - 2025
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 to 24 Jan 2025
https://coling2025.org/

Publication series

Series: Proceedings - International Conference on Computational Linguistics, COLING
Volume: Part F206484-1
ISSN: 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Abbreviated title: COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 to 24/01/25
