7 June 2023 In-sensor neural network for real-time KWS by image processing
Paola Vitolo, Pio Esposito, Danilo Pau, Rosalba Liguori, Luigi Di Benedetto, Gian Domenico Licciardo
Abstract
Keyword Spotting (KWS), i.e. the capability to identify vocal commands as they are spoken, is becoming one of the most important features of the Human-Machine Interface (HMI), thanks in part to the pervasive diffusion of high-performance MEMS audio sensors with very small dimensions. In-Sensor Computing (ISC) appears to be the most viable way to take full advantage of KWS, since it keeps MEMS microphones small and minimally invasive. ISC represents the extreme evolution of the edge computing paradigm: the processing circuits are moved close to the audio sensor, integrated into its auxiliary circuitry or into the same package. However, ISC imposes severe area and power constraints, which must be traded off against the processing speed needed for the real-time operation that KWS inherently requires. In this work, we present a neural network-based KWS suitable for ISC contexts, in which audio sensor data are converted into Mel spectrogram images and processed by a purpose-designed Depthwise Separable Convolutional Neural Network (DSCNN) with feature extraction capabilities. To show the advantages of this approach, the DSCNN is compared with an alternative Fully Connected Neural Network (FCNN) that operates on audio signals not converted into images. Both models have been profiled on a microcontroller and implemented on an FPGA, and their performance is compared in terms of classification accuracy and hardware resources. The comparison shows that the FCNN is very far from meeting the ISC real-time processing requirements: when mapped to a Xilinx Zynq UltraScale+ MPSoC, its number of parameters and its frame latency are, respectively, 3 and 1 orders of magnitude higher than those of the DSCNN.
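As a rough illustration of why a depthwise separable design can need far fewer parameters than a fully connected one, the sketch below counts weights and biases for three layer types. All layer sizes are hypothetical and chosen only for the arithmetic; they are not the architectures or figures reported in the paper, and the helper function names are ours.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights plus biases of a standard k x k convolution."""
    return k * k * c_in * c_out + c_out


def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution, biases included."""
    depthwise = k * k * c_in + c_in
    pointwise = c_in * c_out + c_out
    return depthwise + pointwise


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Hypothetical sizes (assumptions, not taken from the paper):
# a 3 x 3 layer with 64 input and 64 output channels, and a dense
# layer fed with 16000 raw audio samples and 128 neurons.
K, C_IN, C_OUT = 3, 64, 64
N_SAMPLES, N_NEURONS = 16000, 128

std = standard_conv_params(K, C_IN, C_OUT)
dsc = depthwise_separable_params(K, C_IN, C_OUT)
fc = dense_params(N_SAMPLES, N_NEURONS)

print(f"standard conv:       {std} parameters")
print(f"depthwise separable: {dsc} parameters")
print(f"dense on raw audio:  {fc} parameters")
```

With these assumed sizes, the first dense layer alone is a few hundred times larger than the depthwise separable layer. The gap measured in the paper depends on the actual network topologies, which the abstract does not detail.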
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Paola Vitolo, Pio Esposito, Danilo Pau, Rosalba Liguori, Luigi Di Benedetto, and Gian Domenico Licciardo "In-sensor neural network for real-time KWS by image processing", Proc. SPIE 12571, Real-time Processing of Image, Depth and Video Information 2023, 125710F (7 June 2023); https://doi.org/10.1117/12.2665545
KEYWORDS
Performance modeling
Image processing
Neural networks
Signal processing
Neurons
Digital signal processing
Feature extraction