Contact Info
Department of Applied Mathematics
University of Waterloo
Waterloo, Ontario
Canada N2L 3G1
Phone: 519-888-4567, ext. 32700
Fax: 519-746-4319
PDF files require Adobe Acrobat Reader
MC 5417 and Zoom (please email amgrad@uwaterloo.ca for the meeting link)
Yanming Kang | Applied Mathematics, University of Waterloo
Multi-level Transformer
Transformer-based models have shown strong performance on natural language tasks. However, the self-attention operation has quadratic complexity in the input length $n$, which limits the maximum input length that can be handled. Our proposed model reduces the computational complexity to $O(n\log{n})$ by grouping tokens according to their distance to the target and summarizing each group using strided convolution. In this presentation I will review prior work on the efficiency of Transformers, describe our method, and present some preliminary results.
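To give a feel for where an $O(n\log{n})$ cost can come from, here is a minimal sketch of the counting argument. It is not the speaker's implementation. It assumes that group widths double with distance from the target (1, 2, 4, ...), and it uses mean pooling as a stand-in for a strided convolution with uniform weights. The function names `distance_spans`, `mean_pool`, and `total_attended` are hypothetical.

```python
import math


def distance_spans(target, n):
    """Tile positions 0..n-1 with half-open (start, end) spans.

    Assumption for illustration: span widths double as we move away
    from the target, so distant tokens fall into coarser groups.
    """
    spans = [(target, target + 1)]  # the target token itself

    # Walk right from the target with doubling widths.
    width, pos = 1, target + 1
    while pos < n:
        end = min(pos + width, n)
        spans.append((pos, end))
        pos, width = end, width * 2

    # Walk left from the target with doubling widths.
    width, pos = 1, target
    while pos > 0:
        start = max(pos - width, 0)
        spans.append((start, pos))
        pos, width = start, width * 2

    return sorted(spans)


def mean_pool(values, spans):
    """Summarize each span by its mean (stand-in for a strided convolution)."""
    return [sum(values[s:e]) / (e - s) for s, e in spans]


def total_attended(n):
    """Total number of summaries attended to, summed over all n targets."""
    return sum(len(distance_spans(t, n)) for t in range(n))


if __name__ == "__main__":
    n = 1024
    print(distance_spans(4, 8))
    print(total_attended(n), "summaries versus", n * n, "token pairs")
```

Under these assumptions each target attends to at most $1 + 2\lceil\log_2 n\rceil$ summaries rather than $n$ tokens, so the total over all targets grows as $O(n\log{n})$ instead of $O(n^2)$.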