Academic News

NLP Group Has One Paper Accepted at NAACL 2025

Time: 2025-01-23


In January 2025, the Natural Language Processing (NLP) research group had one paper accepted at NAACL 2025. The full name of NAACL 2025 is the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics. NAACL is one of the top conferences in the field of natural language processing and is organized by that chapter of the ACL. NAACL 2025 will be held in Albuquerque, New Mexico, USA, from April 29 to May 4, 2025.

A brief introduction to the accepted paper follows:

- MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation (Langlin Huang, Mengyu Bu, Yang Feng)

- NAACL Main Conference, long paper

Abstract: Byte-based machine translation systems have shown significant potential in massively multilingual settings. Unicode encoding, which maps each character to specific byte(s), eliminates the emergence of unknown words, even in new languages, enabling broad language scalability. However, byte-level tokenization results in sequences that are hard to interpret due to limited semantic information per byte. Local contextualization has proven effective in assigning initial semantics to tokens, improving sentence comprehension. Nevertheless, variations in encoding rules across languages necessitate an adaptive approach for effective contextualization. To this end, we propose Adaptive MultiScale-Headed Attention (Ada-MSHA), adaptively selecting and mixing attention heads, which are treated as contextualization experts. This enhances the flexibility of contextualization scales and improves the potential to discover a better strategy than previous methods. Experimental results show that our method outperforms existing methods without extensive manual adjustment of hyper-parameters and surpasses subword-based models with fewer parameters on the Ted-59 dataset. Our code is available at https://github.com/ictnlp/MoCE.
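As a small illustration of the byte-level setting the abstract describes, the sketch below tokenizes text into UTF-8 bytes and counts bytes per character. It is not taken from the paper or its repository; the function names `byte_tokenize` and `bytes_per_char` are hypothetical and chosen only for this example. It shows two points from the abstract: every token falls in a fixed range of 256 values, so no input produces an unknown token, and the number of bytes per character differs across scripts, which is why a single fixed contextualization scale may not suit all languages.

```python
def byte_tokenize(text: str) -> list[int]:
    """Hypothetical helper: return the UTF-8 byte values of `text` as token ids."""
    return list(text.encode("utf-8"))


def bytes_per_char(text: str) -> list[int]:
    """Hypothetical helper: number of UTF-8 bytes used by each character."""
    return [len(ch.encode("utf-8")) for ch in text]


# Sample words in three scripts (Latin, Cyrillic, Chinese).
samples = {"English": "cat", "Russian": "кот", "Chinese": "猫"}

for language, word in samples.items():
    tokens = byte_tokenize(word)
    # Every token id is a byte, so the vocabulary never exceeds 256 entries.
    assert all(0 <= t < 256 for t in tokens)
    # The byte sequence decodes back to the original text without loss.
    assert bytes(tokens).decode("utf-8") == word
    print(language, tokens, bytes_per_char(word))
```

Latin letters take one byte each, Cyrillic letters two, and the Chinese character three, so the span of bytes that forms one meaningful unit varies by language.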


