Securing Voice Computing Interfaces

Securing the least secure computing interface since punchcards.

Alasdair Allan
Oct 26, 2017

The least secure computing interface in my home is my Amazon Alexa. It is deeply tied to my Amazon account, but despite that it sits quietly on my kitchen counter where anyone can talk to it. While the device is uniquely associated with me, and my profile, it is used by everybody—family and visitors alike.

This is a problem with many, perhaps even most, connected devices in the home these days. While they’re physically accessible by anybody, they’re usually uniquely tied to an individual. Voice computing interfaces are particularly vulnerable, and properly securing them is an almost impossible job because your voice is recordable and reproducible.

What is needed is some sort of two-factor authentication for your voice, which is where recent work around wearable security tokens undertaken at the University of Michigan may well come in.

Kang Shin demonstrates VAuth, a wearable voice authentication device. (📷: Joseph Xu)

Two-factor authentication means that you need two separate, independent methods to verify your identity. Instead of just a password, you need a password and another time-dependent token generated when you attempt to log in. That is, something you know (the password) and something you have (a phone or a keyfob that generates the time-dependent token).
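To make the "time-dependent token" idea concrete, here is a minimal sketch of how a keyfob or phone app might generate one, following the widely used TOTP scheme (RFC 6238). This is an illustration of traditional two-factor authentication, not anything taken from VAuth; the function names `totp` and `verify` and the shared secret are assumptions made up for the example.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Return the time-based one-time code for `secret` at Unix time `at`."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of 30-second windows since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(password_ok: bool, secret: bytes, submitted: str, at: Optional[float] = None) -> bool:
    """Both factors must pass: something you know AND something you have."""
    return password_ok and hmac.compare_digest(totp(secret, at), submitted)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # hypothetical secret stored on the keyfob
    print(totp(shared_secret))  # changes every 30 seconds
```

Because the code depends on the current time window, a token overheard a minute ago is useless to an attacker, which is exactly the property a recorded voice lacks.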

The solution that Kang Shin, Professor of Electrical Engineering and Computer Science at the University of Michigan, and his team have come up with is a wearable device, called VAuth, that measures the vibrations created as the wearer speaks. An algorithm then compares those vibrations with the audio signal received by the digital assistant. If they match, the voice command is ‘authorised.’ If not, the assistant is blocked from responding.
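The matching step can be pictured with a toy example. The sketch below is not the VAuth team's algorithm, whose details are in their paper; it simply assumes that "matching" means the two signals are strongly correlated once a small delay between them is allowed for. The function names, the 0.8 threshold, and the lag window are all assumptions chosen for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two sequences, truncated to the shorter length."""
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(vibration, audio, max_lag=20):
    """Highest correlation found when the audio lags the vibration by 0..max_lag samples."""
    return max(pearson(vibration, audio[lag:]) for lag in range(max_lag + 1))


def is_authorised(vibration, audio, threshold=0.8, max_lag=20):
    """Accept the command only if the wearable and the microphone 'heard' the same thing."""
    return best_match(vibration, audio, max_lag) >= threshold


if __name__ == "__main__":
    # Synthetic stand-ins for real sensor data.
    worn = [math.sin(0.3 * i) for i in range(200)]            # wearable's vibration signal
    heard = [0.0] * 5 + [0.5 * v for v in worn]               # same speech, delayed and quieter
    replayed = [math.sin(0.71 * i + 1.0) for i in range(205)]  # unrelated audio from a speaker
    print(is_authorised(worn, heard))     # wearer actually spoke
    print(is_authorised(worn, replayed))  # audio with no matching vibration
```

A real system would have to cope with noise, different sensor characteristics, and a wearer who is silent while an attacker plays audio, so treat this only as a picture of the idea that a command is accepted when two independent recordings of it agree.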

Like the keyfob that generates the time-dependent password in traditional two-factor authentication, VAuth is the something you have, rather than the something you are. Unlike traditional two-factor authentication, however, the lines are somewhat more blurred, and there is still some chance of false positives. It is nonetheless a dramatic improvement over no security.

The team tested VAuth with 18 users and 30 voice commands. It achieved 97 percent detection accuracy and a false positive rate of less than 0.1 percent, regardless of its position on the body and the user’s language, accent or even mobility. The researchers say it also successfully thwarts various practical attacks, such as replay attacks, mangled voice attacks or impersonation attacks.

If you’re interested in the work done by Shin and his team at the University of Michigan, a paper on VAuth entitled “Continuous Authentication for Voice Assistants” was presented last week at the International Conference on Mobile Computing and Networking, in Snowbird, Utah.