Abstract: Fixed-point quantization techniques have attracted considerable attention in deep neural network (DNN) inference acceleration. Nevertheless, they often require time-consuming fine-tuning or ...
Abstract: Recent SRAM-based in-memory computing (IMC) hardware demonstrates high energy efficiency and throughput for matrix–vector multiplication (MVM), the dominant kernel for deep neural networks ...
The setup for testing and evaluating our code is based on the framework provided by the pqm4 project.