Title: Tackling Deployment Challenges for Multilingual Neural Machine Translation
Speaker: Dr. Hossam Amer, Microsoft
Abstract: Multilingual Neural Machine Translation has shown great success using transformer models. This talk covers background concepts of neural machine translation, and of transformers in particular. It also highlights some of the challenges of deploying these multilingual transformers. For example, these models usually require a large vocabulary (vocab) to cover many languages. This limits the speed at which output tokens are predicted in the last layer, the vocab projection. To alleviate this challenge, we propose a fast vocabulary projection method via clustering that can be used for multilingual transformers on GPUs. Our results show end-to-end speed gains of up to 25% in float16 GPU inference, while maintaining the BLEU score and slightly increasing memory cost. The proposed method speeds up the vocab projection step itself by up to 2.6x. We also conduct an extensive human evaluation to verify that the proposed method preserves the quality of the translations from the original model.
Biography: Hossam Amer recently joined Microsoft as an Applied Researcher. His research interests are image/video compression, computer vision, and, most recently, natural language processing. Prior to joining Microsoft, Hossam was a Postdoctoral Fellow at the Multimedia Communications Lab at the University of Waterloo (UW) under the supervision of Prof. En-Hui Yang. He obtained his PhD from the same lab, where he published papers in top IEEE venues and received the prestigious annual UW teaching award, which is based on nominations from students and instructors. During his studies, he completed a summer internship with Google’s video team.
Hossam obtained his MSc degree, in collaboration with Intel Egypt Research Labs, from the German University in Cairo, where he also completed his undergraduate degree in the Media Engineering Technology department. He worked on his bachelor's project at the Lab of Multimedia Architectures of École Polytechnique Fédérale de Lausanne (EPFL).
Invited by Professor En-Hui Yang, Department of Electrical and Computer Engineering, University of Waterloo
ALL ARE WELCOME!