ClearBuds: the first wireless headphones that clarify calls thanks to deep learning



July 11, 2022

The ClearBuds use a new microphone system and are among the first real-time machine-learning systems to run on a smartphone. Raymond Smith/University of Washington

As meetings moved online during the COVID-19 lockdown, many people found that talkative roommates, garbage trucks and other loud sounds were disrupting important conversations.

That experience inspired three University of Washington researchers, who were roommates during the pandemic, to develop better earbuds. To enhance the speaker’s voice and reduce background noise, their “ClearBuds” use a new microphone system and one of the first machine-learning systems to run in real time on a smartphone.

The researchers presented this project on June 30 at the ACM International Conference on Mobile Systems, Applications and Services.

“The ClearBuds differentiate themselves from other wireless headphones in two key ways,” said co-lead author Maruchi Kim, a doctoral student at the Paul G. Allen School of Computer Science & Engineering. “First, the ClearBuds use a dual microphone array. The microphones in each earbud create two synchronized audio streams that provide spatial information and allow us to separate sounds coming from different directions with higher resolution. Second, a lightweight neural network further enhances the speaker’s voice.”

While most commercial earbuds also carry a microphone on each earbud, only one earbud actively sends audio to the phone at a time. With ClearBuds, each earbud streams audio to the phone, and the researchers designed Bluetooth networking protocols that keep the two streams synchronized to within 70 microseconds of each other.
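Why does sub-100-microsecond synchronization matter? At a 16 kHz sampling rate, one sample lasts 62.5 microseconds, so the two streams must agree to within roughly a single sample before the phone can compare them meaningfully. As an illustrative sketch (not the team’s protocol, which synchronizes at the Bluetooth layer rather than after the fact), a residual offset between two captured streams could be estimated and trimmed by cross-correlation:

```python
import numpy as np

def estimate_lag(left, right):
    """Return the sample offset d such that left[n] ~= right[n - d]."""
    corr = np.correlate(left, right, mode="full")
    return int(np.argmax(corr)) - (len(right) - 1)

def align(left, right):
    """Trim both streams so they line up sample for sample.
    Illustrative only: ClearBuds avoids this by synchronizing
    the streams at the Bluetooth protocol level."""
    d = estimate_lag(left, right)
    if d > 0:
        return left[d:], right[:len(right) - d], d
    if d < 0:
        return left[:len(left) + d], right[-d:], d
    return left, right, 0
```

A one-sample error at 16 kHz is 62.5 microseconds, on the order of the 70-microsecond bound the researchers report, which is why protocol-level synchronization is the harder and more valuable engineering step.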

The team’s neural network algorithm runs on the phone to process the audio streams. First it suppresses all non-voice sounds. Then it isolates and enhances any sound that arrives at both earbuds at the same time: the speaker’s voice.

“Because the speaker’s voice is close by and roughly equidistant from the two earbuds, the neural network can be trained to focus only on their speech and filter out background sounds, including other voices,” said co-lead author Ishan Chatterjee, a doctoral student at the Allen School. “This method is quite similar to how your own ears work. They use the time difference between sounds arriving at your left and right ears to determine where a sound is coming from.”
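The equidistance idea can be made concrete with a classical signal-processing baseline: keep only the time-frequency bins where the two channels are nearly in phase (sound arriving at both earbuds at essentially the same time) and attenuate everything else. This is a hand-built sketch of the spatial cue, not the ClearBuds neural network; the frame size and phase threshold below are arbitrary illustrative choices:

```python
import numpy as np

def equidistant_mask(left, right, frame=512, hop=256, max_phase=0.5):
    """Crude spatial filter: pass time-frequency bins where the two
    channels are nearly in phase (source roughly equidistant from both
    earbuds, i.e. the wearer's mouth) and zero out the rest.
    Illustrative DSP baseline only, not the ClearBuds network."""
    win = np.hanning(frame)
    n_frames = 1 + (len(left) - frame) // hop
    out = np.zeros(len(left))
    norm = np.zeros(len(left))
    for i in range(n_frames):
        s = i * hop
        L = np.fft.rfft(win * left[s:s + frame])
        R = np.fft.rfft(win * right[s:s + frame])
        # Inter-channel phase difference per frequency bin.
        phase_diff = np.angle(L * np.conj(R))
        mask = (np.abs(phase_diff) < max_phase).astype(float)
        # Reconstruct a mono signal from the masked average spectrum.
        mono = np.fft.irfft(mask * 0.5 * (L + R), frame)
        out[s:s + frame] += win * mono
        norm[s:s + frame] += win ** 2
    return out / np.maximum(norm, 1e-8)
```

A learned network like the team’s can outperform this kind of fixed mask because it also exploits the structure of speech itself, not just the geometric cue.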


Shown here: the ClearBuds hardware (round disc) in front of the 3D-printed earphone cases. Raymond Smith/University of Washington

When the researchers compared ClearBuds with Apple’s AirPods Pro, ClearBuds performed better, achieving a higher signal-to-distortion ratio across all tests.

“It’s extraordinary when you consider the fact that our neural network has to run in less than 20 milliseconds on an iPhone, which has a fraction of the computing power of a large commercial graphics card, the hardware typically used to run neural networks,” said co-lead author Vivek Jayaram, a doctoral student at the Allen School. “That’s part of the challenge we faced in this paper: how do you take a traditional neural network and reduce its size while preserving the quality of the output?”
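A hard real-time constraint like this means each audio chunk must be processed faster than it plays back. A minimal, hypothetical harness for sanity-checking whether some processing function fits a 20 ms budget (the function name and default budget here are illustrative, not from the team’s code):

```python
import time

def meets_realtime_budget(process, chunk, budget_s=0.020, runs=50):
    """Time `process` on one audio chunk, averaged over several runs,
    and report whether it fits within the latency budget (20 ms here,
    the constraint the researchers cite). `process` is any callable
    that takes a chunk of samples."""
    start = time.perf_counter()
    for _ in range(runs):
        process(chunk)
    elapsed = (time.perf_counter() - start) / runs
    return elapsed < budget_s, elapsed
```

In practice, shrinking a network to fit such a budget typically involves reducing layer widths, pruning, or quantization, each of which trades some output quality for speed, which is the tension Jayaram describes.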

The team also tested ClearBuds “in the wild,” recording eight people reading texts from Project Gutenberg in noisy environments, such as a cafe or a busy street. The researchers then asked 37 people to rate 10- to 60-second clips from these recordings. Participants rated clips processed through the ClearBuds neural network as having the best noise cancellation and the best overall listening experience.

  • For more information, see the team’s project page.
  • The hardware and software design of ClearBuds is open source and available here.

One of the limitations of the ClearBuds is that people have to wear both earbuds to get the noise-canceling experience, the researchers said.

But the real-time communication system developed here can be useful for a variety of other applications, the team said, including smart speakers, localizing robots, and search-and-rescue missions.

The team is currently working on making the neural network algorithms even more efficient so that they can run on the earbuds themselves.

Additional co-authors are Ira Kemelmacher-Shlizerman, associate professor at the Allen School; Shwetak Patel, professor at both the Allen School and the Department of Electrical and Computer Engineering; and Shyam Gollakota and Steven Seitz, both professors at the Allen School. This research was funded by the National Science Foundation and the Reality Lab at the University of Washington.

For more information, contact the team at [email protected]

Tag(s): College of Engineering • Ira Kemelmacher-Shlizerman • Ishan Chatterjee • Maruchi Kim • Paul G. Allen School of Computer Science & Engineering • Shwetak Patel • Shyam Gollakota • Steven Seitz • Vivek Jayaram
