Research Overview

I am currently an engineer at Microsoft/Azure, where I build hyperscale accelerators for cloud services. I have helped develop and launch several accelerated services, including Microsoft's first production FPGA-accelerated machine learning service for Bing Search and SDN Accelerated Networking for Azure Networking.

My previous research at UT-Austin under Professor Derek Chiou and the UTFAST research group focused on improving the design of highly parallel systems (from algorithms and applications down to microarchitecture) by increasing the speed and flexibility of many-core simulation. I have worked on multiple projects involving novel approaches to HW/SW partitioning, systems analysis, and communication methods to enable efficient parallelization and design of accelerators across multiple problem domains.

My primary research efforts have focused on exploiting the logical decomposition of simulation across a functional/timing boundary. First, combined with novel parallelization and speculation techniques, such a partitioning makes aggressive fine-grain parallelization on hybrid CPU/FPGA platforms practical, improving simulation rates by orders of magnitude. Second, combined with a novel set of user-exposed simulation mechanisms, such a partitioning enables software design-space exploration: large-scale algorithmic changes to parallel software can be evaluated before committing to expensive and potentially unnecessary code changes.



Selected Publications

  • Serving DNNs in Real Time at Datacenter Scale with Project Brainwave
    Chung et al. (Bing and Microsoft MSR Catapult Groups - multi-authored paper)
    IEEE MICRO 2018
  • Azure Accelerated Networking: SmartNICs in the Public Cloud
    Firestone et al. (Azure Networking and Microsoft MSR Catapult Groups - multi-authored paper)
    NSDI 2018
  • HGum: Hardware Messaging Framework
    Sizhou Zhang, Hari Angepat and Derek Chiou
    ReConFig 2017
  • Building in the Cloud: Microsoft's Build System for FPGA Services
    Todd Massengill, Hari Angepat, Adrian Caulfield, Eric Chung, Andrew Putnam
    DAC 2017 Designer Track
  • A Cloud-scale Acceleration Architecture
    Caulfield et al. (Bing and Microsoft MSR Catapult Groups - multi-authored paper)
    MICRO 2016
  • FPGA Accelerated Simulators
    Hari Angepat, Derek Chiou, Eric Chung, James Hoe
    Morgan Kaufmann, 2014
  • An FPGA-based In-line Accelerator for Memcached
    Maysam Lavasani, Hari Angepat and Derek Chiou
    Computer Architecture Letters (CAL 2013) / Hot Chips (HC 2013 presentation)
  • HLS^2: High-Level Synthesis for High-Level Simulation using FPGAs
    Varun Koyyalagunta, Hari Angepat and Derek Chiou
    Workshop on Computer Architecture and Reconfigurable Logic (CARL 2010)
  • NIFD: Non-Intrusive FPGA Debugger
    Hari Angepat, Gage Eads, Christopher Craik and Derek Chiou
    Field Programmable Logic (FPL 2010)
  • Accurate Functional-First Multicore Simulators
    Derek Chiou, Hari Angepat, Nikhil A. Patil and Dam Sunwoo
    Computer Architecture Letters (CAL 2009)
  • Parallelizing Computer System Simulators
    Derek Chiou, Dam Sunwoo, Hari Angepat, Joonsoo Kim, Nikhil Patil, William Reinhart and Darrel E. Johnson
    International Parallel and Distributed Processing Symposium (IPDPS 2008)
  • FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators
    Derek Chiou, Dam Sunwoo, Joonsoo Kim, Nikhil Patil, William Reinhart, Eric Johnson, Jebediah Keefe and Hari Angepat
    International Symposium on Microarchitecture (MICRO 2007)