Multilingual machine translation using hierarchical transformer
DOI: https://doi.org/10.17308/sait.2022.1/9207

Keywords: neural machine translation, multilingual translation, parameter organization, language trees, hierarchical architecture, low-resource translation, related languages

Abstract
The way parameters are organized in multilingual machine translation models determines how effectively the parameter space is used, and therefore directly influences translation quality. This work explores the idea of using language trees as the basis for the architecture of multilingual machine translation models. Language trees describe how languages are related to one another, and the primary idea is to organize multilingual models according to these expert hierarchies: the more closely related two languages are, the more parameters they share. We test this approach for the Transformer architecture and demonstrate that, despite the successes reported in previous work, there are persistent problems inherent to training hierarchical models. We investigate these problems, propose a solution, and show that with the suggested training fix the hierarchical model can considerably outperform both bilingual models and multilingual models with full parameter sharing.
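The sharing rule described above ("the more closely related two languages are, the more parameters they share") can be illustrated with a minimal sketch. The toy language tree and the one-module-per-common-ancestor rule below are illustrative assumptions for exposition, not the authors' exact configuration:

```python
# Minimal sketch (illustrative, not the paper's implementation):
# organize per-language parameter groups along a language tree so that
# closely related languages share more modules.

# A toy language tree, given as the path from the root to each language.
LANG_PATHS = {
    "es": ["indo-european", "romance", "es"],
    "pt": ["indo-european", "romance", "pt"],
    "de": ["indo-european", "germanic", "de"],
    "ru": ["indo-european", "slavic", "ru"],
}

def shared_modules(a: str, b: str) -> list:
    """Parameter groups shared by languages a and b: one module per
    common ancestor on their paths through the language tree."""
    shared = []
    for x, y in zip(LANG_PATHS[a], LANG_PATHS[b]):
        if x != y:
            break
        shared.append(x)
    return shared

# Closely related languages share more parameter groups than distant ones.
print(shared_modules("es", "pt"))  # ['indo-european', 'romance']
print(shared_modules("es", "ru"))  # ['indo-european']
```

Under this scheme each tree node corresponds to a block of Transformer parameters, and a translation direction uses the blocks along its languages' paths, so sibling languages reuse everything above their leaves.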