Voice Control, But Make It Sonos

Jason Adams

Guest Writer

When it came to creating a voice control experience unique to Sonos, the team behind Sonos Voice Control™ took a listener-first approach that prioritized privacy, sound quality, and inclusivity along with hands-free ease.

“Voice control makes technology very accessible,” says Joseph Dureau, Vice President of Voice Experience at Sonos. Dureau leads the team behind Sonos Voice Control, an innovation that offers listeners hands-free control of their music and systems with any voice-enabled Sonos product. His interest in the technology, however, extends beyond his professional life.

As a father to two young children, Dureau is always trying to limit screen time for himself and his family. “Being able to control music with my voice helps me stay present,” he explains. “I don’t need to pull out my smartphone to change the song or adjust the volume, and my children don’t see me constantly looking at a device.”

Drawn to the convenience of voice control, Dureau and his team wanted to improve upon the technology in a way that only Sonos could, making it even more intuitive, private, and inclusive. Oh, and it had to sound good, too.

Easy does it

“Everything we do aims to empower listeners,” says Dureau. “When we think about a new feature, product, or partnership, it has to be rooted in our ethos. And it has to be easy.” With the addition of Sonos Voice Control, the team has realized this ambition on a new level with a one-of-a-kind experience.

An out-of-the-box approach

Sonos Voice Control stands apart from other voice assistants that have become synonymous with smart speakers.

First, the setup is nearly effortless: There are no extra apps to download or new accounts to create. Sonos Voice Control offers complete command of your Sonos system with any voice-enabled product, regardless of generation.

Sonos Voice Control also has a better grasp of natural language than other voice assistants, so it understands the way people actually talk. For example, it understands “Hey Sonos, up a bit” as well as “Hey Sonos, louder.” Whatever comes more naturally. You can also quickly follow up one request with another without having to repeat “Hey Sonos” (also known as the wake word) every single time. “It seems like a small detail,” says Dureau, “but it creates a lot of friction.”

Also, Sonos Voice Control processes all requests locally on the speaker or soundbar, which significantly reduces the time to music. It’s as instant as tapping a control in the app. Most of the time, Sonos Voice Control just does what’s asked without a repetitive verbal confirmation.

Local processing also allows for limited playback controls on Bluetooth®, including pausing, resuming, and volume adjustments. Plus, because no data or audio transcript is ever sent to the cloud, Sonos Voice Control doesn’t compromise on privacy.

Safe and sound

With the increase of smart devices in our lives, privacy has become a top priority for customers. It’s also the number one reason why listeners never set up or use a voice assistant despite being drawn to the ease. So, naturally, privacy became a top priority for the Sonos Voice Experience team.

“We don’t use any customer data to feed our engines,” explains Dureau. All requests are processed locally on the Sonos product. “At no point is your voice going to be sent to the cloud, transcribed, or listened to by anyone.”

Rather than making shopping frictionless or leveraging data to enhance voice AI and search engines, Sonos Voice Control serves the listening experience. “It’s like a voice version of the app,” says Dureau.

Finding our voice

“At Sonos, we want to make everything sound as good as we possibly can,” says Greg McAllister, Senior Manager of Sound Experience. Throughout his storied career as a sound engineer, McAllister has worked on everything from Hollywood blockbusters to award-winning albums alongside Sonos Sound Experience Leader Giles Martin at London’s famed Abbey Road Studios. When it came to choosing a voice for Sonos Voice Control, he wanted it to sound as “natural as possible.”

Rather than use a sterile-sounding option generated by AI, McAllister and his team sought the perfect human voice, which they found in Giancarlo Esposito, the beloved, Emmy-nominated film and television actor known for Do the Right Thing, Breaking Bad, Better Call Saul, and The Mandalorian.

So, what makes a good-sounding voice? “It’s very dependent on context,” explains McAllister. “For instance, if you’re listening to someone reading a bedtime story, then you want the voice to be nice, soft, and relaxing. If you’re listening to somebody making an announcement, that should be a bit more direct.” Sonos Voice Control had to do it all.

“We want something that’s warm and positive, but also informative. Giancarlo’s voice perfectly fits the bill. He has a really nice deep resonance about his voice that feels very warm.”

“I’ve had a lot of experience playing characters, but I realized with Sonos that I would have to manipulate my voice in a way that was really me. It enticed me to be a part of this because people respond to the quality of my voice—they say it can be calm, it can be reassuring. It’s a voice that people recognize before they even recognize my face.”

— Giancarlo Esposito

All-inclusive experience

Even more important than the human output of Sonos Voice Control is the human input that informed its final articulation.

Dureau and his team paid close attention to creating the best experience for all existing and future Sonos listeners. While surveying a diverse set of testers for an initial launch in US English, the team realized it would not be designing for a monolith and found a few interesting opportunities for innovation.

“In the US, we noticed a large majority of song titles and artist names in Spanish,” explains Dureau. “We wanted to make sure that words would be pronounced correctly.” He laments how other assistants tend to mispronounce or entirely alter non-English words and names, signaling to the people hearing them that they were not considered in this technology. “We want everyone to feel included in the Sonos Voice Control experience,” he says.

A French native, Dureau also brings a special appreciation for regional accents. The US-based testers came from places as varied as Southern California, New Jersey, Georgia, and Minnesota, and the team plans to continue expanding the pool. “Every time we launch Sonos Voice Control in a new region,” Dureau continues, “we will do everything we can to make sure it works with the variety of accents you find there.”

Also included in the Sonos Voice Control experience? Amazon Alexa and Google Assistant.

Sonos is all about choice, letting you stream from your favorite services, mix and match speakers to customize your system, and even use multiple voice assistants. So, listeners who want a little extra hands-free help managing their smart home devices, getting answers to questions, or ordering dinner can also use Amazon Alexa or Google Assistant on their Sonos system. Alexa even works with Sonos Voice Control on the same speaker.

“We just want to be the best conduit of sound,” explains McAllister.

Sonos Voice Control is now available in the US. Learn more here. For a deeper dive into the ins and outs of the technology behind the feature, visit our Tech Blog.
