|DATE||December 07 (Mon), 2020|
|TITLE||Achieving Performance and Programmability in DNN Acceleration Solutions|
|ABSTRACT||As the computational demand of emerging applications (e.g., deep learning) rapidly increases, the benefits of conventional general-purpose solutions are diminishing. Acceleration has emerged as a promising alternative that delivers orders-of-magnitude gains in performance and energy efficiency over its general-purpose counterparts. However, achieving both performance and programmability at the same time remains the greatest challenge in facilitating the use of such acceleration solutions.
I will talk about two works that address this challenge by developing hardware-software co-designed full-stack solutions. I will first talk about Bit Fusion, a novel DNN acceleration solution that leverages the inherent algorithmic properties of DNNs and provides a bit-flexible accelerator that dynamically fuses the on-chip computing units to match the bit width of individual DNN layers. I will then talk about INCEPTIONN, a hardware-algorithm co-designed in-network acceleration solution for distributed DNN training systems, which significantly reduces the inter-node communication overhead and in turn substantially improves end-to-end DNN training performance.
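The fusion idea above can be illustrated with a small sketch. Assuming (as a simplification of the hardware) that each computing unit can only multiply narrow operands, a wider multiplication can be composed from many narrow multiplies whose results are shifted and accumulated; matching the chunk count to a layer's bit width is the intuition behind bit-flexible fusion. The function names and chunk sizes here are illustrative, not the accelerator's actual interface:

```python
def split_chunks(x, chunk_bits, num_chunks):
    """Split an unsigned integer into little-endian chunks of chunk_bits each."""
    mask = (1 << chunk_bits) - 1
    return [(x >> (i * chunk_bits)) & mask for i in range(num_chunks)]

def fused_multiply(a, b, chunk_bits=2, num_chunks=4):
    """Multiply two (chunk_bits * num_chunks)-bit unsigned operands using only
    chunk_bits x chunk_bits multiplies, combining shifted partial products --
    a software sketch of composing narrow compute units into a wider one."""
    acc = 0
    for i, ai in enumerate(split_chunks(a, chunk_bits, num_chunks)):
        for j, bj in enumerate(split_chunks(b, chunk_bits, num_chunks)):
            # Each term uses only a narrow multiplier; shifts align the partial
            # products to their positional weight before accumulation.
            acc += (ai * bj) << ((i + j) * chunk_bits)
    return acc

# 8-bit x 8-bit built from 2-bit x 2-bit units:
assert fused_multiply(173, 94) == 173 * 94
```

A layer that only needs 4-bit operands would use fewer chunks per multiply, freeing the same narrow units to perform more multiplications in parallel.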
(Zoom Link: https://kaist.zoom.us/j/85378884451)