University of Michigan (U-M) scientists have developed a voice-authentication system for reducing the risk of being spoofed when you use a biometric system to log into secure services or a voice assistant (such as Amazon Echo and Google Home).
A hilarious example of spoofing a voice assistant occurred in a Google commercial aired during the 2017 Super Bowl. When actors voiced “OK Google” commands on TV, viewers’ Google Home devices obediently began to play whale noises, flip lights on, and take other actions.
More seriously, the U-M scientists point out, an adversary could bypass current voice-as-biometric authentication mechanisms, such as Nuance’s “FreeSpeech” customer authentication platform (used in call centers and banks), simply by impersonating the user’s voice (possibly by using Adobe Voco software).*
The VAuth system
The U-M VAuth (continuous voice authentication, pronounced “vee-auth”) system aims to make such spoofing a lot more difficult. It uses a tiny wearable device (which could be built into a necklace, earbud/earphones/headset, or eyeglasses) containing an accelerometer (or a special microphone) that detects and measures vibrations on the skin of a person’s face, throat, or chest.
The team has built a prototype using an off-the-shelf accelerometer and a Bluetooth transmitter, which sends the vibration signal to a real-time matching engine in a device (such as Google Home). It matches these vibrations with the sound of that person’s voice to create a unique, secure signature that is constant during an entire session (not just at the beginning). The team has also developed matching algorithms and software for Google Now.
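To make the matching idea concrete, here is a minimal Python sketch. It is not the U-M team's algorithm, which this article does not describe; it swaps in a plain normalized cross-correlation as a stand-in. The function names, the 0.5 threshold, and the use of white noise in place of real speech are all assumptions chosen for illustration.

```python
import numpy as np

def peak_normalized_xcorr(a, b):
    """Peak absolute normalized cross-correlation of two 1-D signals."""
    # Standardize both signals so the score does not depend on loudness.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    # Search all lags, since the two sensors are not perfectly synchronized.
    corr = np.correlate(a, b, mode="full") / min(len(a), len(b))
    return float(np.max(np.abs(corr)))

def commands_match(vibration, microphone, threshold=0.5):
    """Accept a command only if the two channels are strongly correlated.

    The threshold is a hypothetical value, not one taken from the study.
    """
    return peak_normalized_xcorr(vibration, microphone) >= threshold

# Synthetic demo: white noise stands in for a spoken command (assumption).
rng = np.random.default_rng(0)
speech = rng.standard_normal(2000)

# The wearable senses a delayed, noisy copy of the owner's speech.
vibration = np.roll(speech, 5) + 0.3 * rng.standard_normal(2000)

# Case 1: the assistant's microphone hears the same speech.
mic_owner = speech + 0.3 * rng.standard_normal(2000)

# Case 2: the microphone hears unrelated audio, such as a TV commercial.
mic_other = rng.standard_normal(2000)

print(commands_match(vibration, mic_owner))  # True: channels agree
print(commands_match(vibration, mic_other))  # False: no matching vibration
```

In this toy setup, a command is accepted only when the microphone audio lines up with what the wearable sensed, which is the "second channel" idea in miniature. A real system would also have to cope with the accelerometer's limited frequency range, Bluetooth delays, and deliberate attacks, none of which this sketch models.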
Security holes in voice authentication systems
“Increasingly, voice is being used as a security feature but it actually has huge holes in it,” said Kang Shin, the Kevin and Nancy O’Connor Professor of Computer Science and professor of electrical engineering and computer science at U-M. “If a system is using only your voice signature, it can be very dangerous. We believe you have to have a second channel to authenticate the owner of the voice.”
VAuth doesn’t require training, the team says, and is unaffected by changes in a person’s voice over time or in different situations, such as sickness (a sore throat) or tiredness. Such changes are a major limitation of voice biometrics, which also require training from each individual who will use them.
The team tested VAuth with 18 users and 30 voice commands. It achieved a detection accuracy of 97 percent and a false-positive rate of less than 0.1 percent, regardless of its position on the body and the user’s language, accent or even mobility. The researchers say it also successfully thwarts various practical attacks, such as replay attacks, mangled voice attacks, or impersonation attacks.
A study on VAuth was presented Oct. 19 at the International Conference on Mobile Computing and Networking, MobiCom 2017, in Snowbird, Utah, and is available for open-access download.
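For readers unfamiliar with these two metrics, the short Python sketch below shows how such rates are typically computed. It assumes that "detection accuracy" means the share of the owner's genuine commands that are accepted; the article does not define the term. The counts are hypothetical round numbers, not data from the study.

```python
def true_positive_rate(accepted_genuine, total_genuine):
    """Share of the owner's genuine commands that were accepted."""
    return accepted_genuine / total_genuine

def false_positive_rate(accepted_attacks, total_attacks):
    """Share of commands not spoken by the owner that were wrongly accepted."""
    return accepted_attacks / total_attacks

# Hypothetical counts for illustration only (NOT the study's data).
print(true_positive_rate(970, 1000))  # 0.97
print(false_positive_rate(1, 2000))   # 0.0005
```

Read this way, a false-positive rate below 0.1 percent means fewer than one in every 1,000 commands from someone other than the owner would get through.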
The work was supported by the National Science Foundation. The researchers have applied for a patent and are seeking commercialization partners to help bring the technology to market.
* As explained in this KurzweilAI article, Adobe Voco technology (aka “Photoshop for voice”) makes it easy to add or replace a word in an audio recording of a human voice by simply editing a text transcript of the recording. New words are automatically synthesized in the speaker’s voice — even if they don’t appear anywhere else in the recording.
Abstract of Continuous Authentication for Voice Assistants
Voice has become an increasingly popular User Interaction (UI) channel, mainly contributing to the current trend of wearables, smart vehicles, and home automation systems. Voice assistants such as Alexa, Siri, and Google Now, have become our everyday fixtures, especially when/where touch interfaces are inconvenient or even dangerous to use, such as driving or exercising. The open nature of the voice channel makes voice assistants difficult to secure, and hence exposed to various threats as demonstrated by security researchers. To defend against these threats, we present VAuth, the first system that provides continuous authentication for voice assistants. VAuth is designed to fit in widely-adopted wearable devices, such as eyeglasses, earphones/buds and necklaces, where it collects the body-surface vibrations of the user and matches it with the speech signal received by the voice assistant’s microphone. VAuth guarantees the voice assistant to execute only the commands that originate from the voice of the owner. We have evaluated VAuth with 18 users and 30 voice commands and find it to achieve 97% detection accuracy and less than 0.1% false positive rate, regardless of VAuth’s position on the body and the user’s language, accent or mobility. VAuth successfully thwarts various practical attacks, such as replay attacks, mangled voice attacks, or impersonation attacks. It also incurs low energy and latency overheads and is compatible with most voice assistants.