SpotOn: Adversarially Robust Keyword Spotting on Resource-Constrained IoT Platforms

Abstract

IoT devices that execute real-time speech commands (e.g., voice assistants) are proliferating rapidly in our daily lives. In such a device, detecting the correct spoken keyword triggers the supported function, so keyword spotting (KWS) using a machine learning (ML) model is pivotal to its operation. However, KWS is vulnerable to adversarial machine learning (AML) attacks, through which an adversary can craft an adversarial audio sample that sounds like a benign keyword to a human but is detected as a different keyword by the KWS pipeline. In this paper, we propose SpotOn, a novel KWS pipeline that both recovers from AML attacks and detects whether an attacker is using the device to generate AML noise. Using the Google Speech Commands dataset, we demonstrate that SpotOn provides reasonable accuracy in correctly detecting keywords both in the absence and in the presence of AML attacks. Through careful optimizations, we enable SpotOn to process streaming speech input on resource-constrained IoT devices. Overall, the design of SpotOn provides critical insights into making voice-controlled IoT devices suitable for safety-critical systems.

Publication
ACM ASIA Conference on Computer and Communications Security (AsiaCCS)
