Machine Learning Acceleration

Over the last decade, large-scale deep neural networks (DNNs) have achieved breakthroughs in many fields, such as image recognition, speech recognition, game playing, complex control systems, driverless cars, and unmanned aerial systems (UAS). High computational complexity and large model size are two key challenges of DNNs that have motivated research efforts on model compression techniques and hardware acceleration.

Our research group has contributed architecture and algorithm innovations for accelerating machine learning. PipeLayer [HPCA'17] is a ReRAM-based processing-in-memory (PIM) accelerator for CNNs that supports both training and inference. SC-DCNN [ASPLOS'17] is the first comprehensive design and optimization framework for stochastic computing based DNNs, built with a bottom-up approach. CirCNN [MICRO'17] is a principled approach to representing weights and processing neural networks using block-circulant matrices. CirCNN exploits Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (in both inference and training) from O(n²) to O(n log n) and the storage complexity from O(n²) to O(n), with negligible accuracy loss. VIBNN [ASPLOS'18] is an FPGA-based hardware accelerator for variational inference on Bayesian neural networks (BNNs).
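To illustrate the principle behind stochastic computing (not SC-DCNN's actual hardware design, which composes gate-level building blocks bottom-up), the following minimal NumPy sketch shows unipolar SC multiplication: a value in [0, 1] is encoded as a random bit-stream whose density of ones approximates the value, and a bitwise AND of two independent streams approximates their product. The helper names to_stream and sc_multiply are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, length=4096):
    """Encode p in [0, 1] as a random bit-stream whose fraction of
    ones approximates p (unipolar stochastic-computing encoding)."""
    return rng.random(length) < p

def sc_multiply(a, b, length=4096):
    """AND two independent bit-streams; the ones-density of the
    result approximates the product a * b."""
    stream = to_stream(a, length) & to_stream(b, length)
    return stream.mean()

print(sc_multiply(0.8, 0.5))  # ~0.4, up to stochastic-computing noise
```

Longer streams trade latency for accuracy, which is the core cost/precision knob in SC-based designs.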
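CirCNN's complexity reduction follows from the convolution theorem: multiplying by a circulant matrix is a circular convolution, computable with FFTs in O(n log n) while storing only the matrix's first column, i.e. O(n) storage. Below is a minimal NumPy sketch of a single circulant block; CirCNN itself partitions each weight matrix into many such blocks, and circulant_matvec is an illustrative name, not the paper's API.

```python
import numpy as np

def circulant_matvec(w, x):
    """Multiply the circulant matrix defined by its first column w
    with a vector x in O(n log n) via the convolution theorem:
    C(w) @ x == IFFT(FFT(w) * FFT(x))."""
    return np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)).real

# Check against the explicit O(n^2) dense product.
n = 8
w = np.random.randn(n)  # only this first column is stored: O(n)
x = np.random.randn(n)
C = np.array([np.roll(w, i) for i in range(n)]).T  # dense circulant matrix
assert np.allclose(C @ x, circulant_matvec(w, x))
```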