A recurrent attention model (RAM) algorithm for Keyword Spotting (KWS) is proposed.
A 0.44-μJ/dec, 39.9-μs/dec, Recurrent Attention In-Memory Processor for Keyword Spotting
Hassan Dbouk et al., University of Illinois Urbana-Champaign
Journal of Solid-State Circuits (JSSC) · 2021
Contributions:
A recurrent attention model (RAM) algorithm [11] for Keyword Spotting (KWS).
The KeyRAM algorithm allows accuracy vs. energy scalability via a confidence-based computation scheme.
Multi-bit, multi-bank (2 banks) IMC architecture with 4-bit matrix-vector multiplies, alongside a digital co-processor.
Sparsity-aware summation scheme – what are the challenges for IMC when doing sparse summations?
Digital co-processor employs diagonal-major weight storage to compute without any stalls – what is that?
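Sketch of the confidence-based computation idea from the contributions above: a recurrent attention model looks at the input glimpse by glimpse, so it can stop as soon as the top class probability is high enough, and easy inputs then cost fewer steps (and less energy). This is a minimal illustration, not the KeyRAM implementation; `classify_with_early_exit`, the `threshold` value, and the per-glimpse logits are all assumptions made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_early_exit(logits_per_glimpse, threshold=0.9):
    """Return (predicted_class, glimpses_used).

    logits_per_glimpse: hypothetical class logits, one list per glimpse.
    threshold: hypothetical confidence level at which processing stops.
    """
    if not logits_per_glimpse:
        raise ValueError("need at least one glimpse")
    for step, logits in enumerate(logits_per_glimpse, start=1):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            break  # confident enough: skip the remaining glimpses
    # If the threshold is never reached, the last glimpse decides.
    return probs.index(confidence), step


# Hypothetical logits for 3 classes over 3 glimpses.
example = [[0.1, 0.2, 0.0], [0.5, 3.0, 0.2], [0.0, 9.0, 0.0]]
strict = classify_with_early_exit(example, threshold=0.9)
relaxed = classify_with_early_exit(example, threshold=0.8)
```

Lowering `threshold` trades accuracy for fewer glimpses, which is the kind of accuracy vs. energy knob the notes describe.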
Metrics: energy-delay product?
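Worked example for the energy-delay product (EDP), assuming it is simply energy per decision times latency per decision, and taking both numbers from the paper title (0.44 μJ/dec, 39.9 μs/dec). Whether the paper reports EDP in exactly this form is not confirmed by these notes; the function name is made up for the example.

```python
def energy_delay_product(energy_j, delay_s):
    """EDP in joule-seconds: energy per decision times delay per decision."""
    return energy_j * delay_s


energy_per_decision_j = 0.44e-6  # 0.44 uJ/dec, from the title
delay_per_decision_s = 39.9e-6   # 39.9 us/dec, from the title

edp = energy_delay_product(energy_per_decision_j, delay_per_decision_s)
print(f"EDP = {edp:.3e} J*s")  # 0.44 * 39.9 = 17.556, i.e. about 1.756e-11 J*s
```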
Note: [12] – conference paper version of this paper.
Keyword Spotting flow: feature extraction → classification.
IMC papers: [13]–[22]; IMC was first proposed in [23].
Digital Low-Power Techniques:
a. Voltage over-scaling [4], [5], [7] and [2].
b. [6]: RNN-IMC using a 65 nm SRAM macro; Google speech dataset [8].
Depth-wise separable CNN [9]; RNN for KWS implemented as an IC with the lowest power consumption [7].
c. [10]: signal processing, could be interesting to read. Voice activity detector (VAD), \(P_{\text{VAD}} = 200\,\text{nW}\). Trick: add a VAD to power-gate the KWS engine.
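Rough sketch of why the VAD trick helps: the VAD is always on, and the KWS engine only draws power while speech is detected. Only the 200 nW VAD figure comes from the notes; the KWS active power and the speech duty cycle below are hypothetical values, and the model ignores leakage of the gated engine and any wake-up overhead.

```python
def average_power_w(p_vad_w, p_kws_active_w, speech_duty_cycle):
    """Average power of an always-on VAD plus a power-gated KWS engine."""
    if not 0.0 <= speech_duty_cycle <= 1.0:
        raise ValueError("speech_duty_cycle must be within [0, 1]")
    return p_vad_w + speech_duty_cycle * p_kws_active_w


P_VAD_W = 200e-9        # 200 nW, from the notes
P_KWS_ACTIVE_W = 10e-6  # hypothetical KWS engine power while running
SPEECH_DUTY = 0.05      # hypothetical: speech present 5% of the time

gated_w = average_power_w(P_VAD_W, P_KWS_ACTIVE_W, SPEECH_DUTY)
always_on_w = P_KWS_ACTIVE_W  # baseline: KWS engine never gated, no VAD
print(f"gated: {gated_w * 1e9:.0f} nW, always on: {always_on_w * 1e9:.0f} nW")
```

With these made-up numbers the gated system averages about 700 nW against 10 μW for an always-on engine; the saving shrinks as the duty cycle grows.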
IMC macro specs: 96 × 512 6T SRAM cells, based on [13] and [14]; mixed-signal multi-bit dot-product processor.
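To make "multi-bit dot product" concrete, here is a purely digital sketch that builds a 4-bit dot product out of 1-bit (binary) dot products weighted by powers of two. This bit-plane decomposition is one generic way to think about it and is not claimed to be how the macro in the paper or in [13], [14] works; unsigned operands and the function names are assumptions for the example.

```python
def to_bit_planes(values, bits=4):
    """Split unsigned integers into bit planes, least significant first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def multibit_dot(weights, activations, bits=4):
    """Dot product of unsigned `bits`-bit vectors via binary dot products."""
    if len(weights) != len(activations):
        raise ValueError("vectors must have the same length")
    limit = 1 << bits
    if any(not 0 <= v < limit for v in list(weights) + list(activations)):
        raise ValueError(f"operands must be unsigned {bits}-bit integers")
    total = 0
    for i, w_plane in enumerate(to_bit_planes(weights, bits)):
        for j, x_plane in enumerate(to_bit_planes(activations, bits)):
            # 1-bit dot product: count positions where both bits are set.
            binary_dot = sum(w & x for w, x in zip(w_plane, x_plane))
            total += binary_dot << (i + j)  # weight by 2**(i + j)
    return total


weights = [3, 15, 0, 7]
activations = [2, 1, 9, 4]
result = multibit_dot(weights, activations)
reference = sum(w * x for w, x in zip(weights, activations))
```

Here `result` and `reference` agree, since the decomposition is exact for unsigned integers.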