BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Drupal iCal API//EN
X-WR-CALNAME:Events items teaser
X-WR-TIMEZONE:America/Toronto
BEGIN:VTIMEZONE
TZID:America/Toronto
X-LIC-LOCATION:America/Toronto
BEGIN:DAYLIGHT
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
DTSTART:20251102T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
UID:69b0f2fc5ef29
DTSTART;TZID=America/Toronto:20260210T103000
SEQUENCE:0
TRANSP:TRANSPARENT
DTEND;TZID=America/Toronto:20260210T120000
URL:https://uwaterloo.ca/math-faculty-computing-facility/events/bandicoot-e
 fficient-gpu-linear-algebra-c-template
SUMMARY:Bandicoot: Efficient GPU linear algebra via C++ template metaprogr
 amming
CLASS:PUBLIC
DESCRIPTION:Math Faculty Computing Facility\n\nPresents:\n\nDR. RYAN CURTIN
 \n\nTuesday\, February 10\, 2026\n\n10:30am - 12:00pm DC 1302\n\nR.S.
 V.P. to mfcfhelp@uwaterloo.ca\n\n\"BANDICOOT: EFFICIENT GPU LINEAR ALG
 EBRA VIA C++ TEMPLATE METAPROGRAMMING\"\n\nABSTRACT:\n\nIt's not too mu
 ch of a stretch to say that linear algebra is the backbone of modern co
 mputational science\; by that token\, efficiency is of paramount impor
 tance. For over 15 years\, the Armadillo C++ linear algebra library ha
 s provided efficient linear algebra implementations via template metap
 rogramming\, using expression templates and delayed evaluation techniq
 ues (which were originally developed at Waterloo in the late 90s!). Re
 cently\, the Bandicoot project introduced the same techniques for GPU li
 near algebra using an Armadillo-compatible API for easy drop-in usage. Ba
 ndicoot is not specific to particular hardware\; it can be used with a
 ny CUDA or OpenCL device\, and additional backends (HIP/ROCm\, Metal\, Vu
 lkan) are actively being developed. Bandicoot is able to both optimize li
 near algebra expressions at compile-time in the same way Armadillo ca
 n\, and also generate efficient GPU kernel code with fused optimizatio
 ns. I will discuss each of these optimizations\, how they all fit toge
 ther into Bandicoot\, and show how existing Armadillo applications can b
 e easily adapted to the GPU with Bandicoot.\n\nBIO:\n\nDr. Ryan Curtin i
 s an independent researcher and open-source software developer\, lead
 ing the development and maintenance of several packages in the C++ sci
 entific software ecosystem. During his Ph.D. at Georgia Tech he focuse
 d on the formalization of dual-tree algorithms\, a class of geometric br
 anch-and-bound algorithms that can be used to solve subproblems releva
 nt to machine learning techniques. These algorithms underlie the effic
 ient mlpack C++ machine learning library\, which he has led since 2010. I
 n his free time\, he races go-karts\, so he never escapes from trying t
 o go fast in one way or another.
DTSTAMP:20260311T044340Z
END:VEVENT
END:VCALENDAR