Jan Gray, GRAY RESEARCH LLC
GRVI Phalanx: A Massively Parallel RISC-V FPGA Accelerator Framework
Complementing datacenter server CPUs, FPGA accelerators promise higher throughput, lower latency, and lower energy. But it is challenging to move working software into a hardware accelerator, and to maintain it as the code evolves. GRVI Phalanx is a work-in-progress parallel processor overlay framework that aspires to simplify FPGA accelerator development via a software-first, software-mostly methodology. A parallel C++ workload is first recompiled and run across dozens or hundreds of soft processors. Then custom hardware datapaths and/or memories are added to speed up workload bottlenecks. Most design iterations are just recompiles, so accelerator development can be more like software performance engineering than a tapeout.
GRVI is an FPGA-efficient 32-bit RISC-V soft CPU. Phalanx is a parallel processor/accelerator array overlay. Clusters of CPUs/accelerators/SRAMs and high-bandwidth I/O/DRAM controllers are interconnected by a Hoplite network-on-chip (NOC) with 300-bit-wide links. An example Phalanx system packs 1,680 GRVI cores in a single Xilinx VU9P FPGA. The talk begins with a brief review of FPGA architecture and a look at how FPGAs are now being used at scale to accelerate datacenter workloads. The bulk of the talk details the FPGA design and implementation of GRVI Phalanx and its NOC, and ongoing work to make it available on Amazon AWS F1 FPGA machine instances.
Jan is a software industry veteran (i.e., old). From ’87 to ’09, he worked at Microsoft on developer tools and platforms including C#’88, Visual C++, Transaction Server, COM+, the Common Language Runtime, and the Parallel Computing Platform. He is also a computer architect and FPGA hacker. He built the first 32-bit RISC soft CPU/SOC, worked on transactional memory acceleration with Intel, designed the parallel machine-learned model evaluator of the Microsoft Catapult Bing ranking accelerator, and now focuses on the GRVI Phalanx parallel RISC-V accelerator framework and its FPGA-efficient Hoplite NOC. Jan has a B.Math (CS/EEE), Waterloo, 1987, and 45 US patents. Long ago, Jan was truly fortunate to spend his teenage years with lovely and amazing MathSoc and CS Club students; this set the course of his life.
Invited by Professor Nachiket Kapre