Google works out a fascinating, slightly scary way for AI to isolate voices in a crowd | Ars Technica


21 bookmarks. First posted by brokenrhino april 2018.


[…] this week, a team within the tech giant attempted to replicate the cocktail party effect, or the human brain's ability to focus on one source of audio while filtering out others—just as you would while talking to a friend at a party.
Google's method uses an audio-visual model, so it is primarily focused on isolating voices in videos. The company posted a number of YouTube videos showing the tech in action.
google  video  audio  spunti  strumenti 
april 2018 by nicoladagostino
Aiding smart speakers like the Google Home in their ability to recognize individual voices seems like another use case, but because this model is focused on video, it would likely work better with a speaker with a display, like Amazon's Echo Show. Earlier this year, Google opened up the Google Assistant to "smart display" devices like the Echo Show, but the company hasn't released one itself.
audio  google  ai 
april 2018 by libbymiller
Google researchers have developed a deep-learning system designed to help computers better identify and isolate individual voices within a noisy environment. via Pocket
IFTTT  Pocket 
april 2018 by domingogallardo
Jeff Dunn:
The company says this tech works on videos with a single audio track and can isolate voices in a video algorithmically, depending on who's talking, or by having a user manually select the face of the person whose voice they want to hear.

Google says the visual component here is key, as the tech watches for when a person's mouth is moving to better identify which voices to focus on at a given point and to create more accurate individual speech tracks for the length of a video.

According to the blog post (https://research.googleblog.com/2018/04/looking-to-listen-audio-visual-speech.html), the researchers developed this model by gathering 100,000 videos of "lectures and talks" on YouTube, extracting nearly 2,000 hours' worth of segments from those videos featuring unobstructed speech, then mixing that audio to create a "synthetic cocktail party" with artificial background noise added.

Google then trained the tech to split that mixed audio by reading the "face thumbnails" of people speaking in each video frame and a spectrogram of that video's soundtrack. The system is able to sort out which audio source belongs to which face at a given time and create separate speech tracks for each speaker. Whew.
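
To make the "synthetic cocktail party" step a bit more concrete, here is a rough Python sketch of mixing two clean tracks with background noise. It is only an illustration, not Google's actual pipeline: the sine tones standing in for speech, the 10 dB signal-to-noise ratio, and the names `rms` and `mix_cocktail_party` are all assumptions made up for this example.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_cocktail_party(speech_a, speech_b, noise, snr_db):
    """Sum two clean tracks, then add noise scaled to a target SNR.

    Hypothetical helper: the source only says clean speech segments were
    mixed and artificial background noise was added; the SNR-based scaling
    is an assumption chosen for illustration.
    """
    n = min(len(speech_a), len(speech_b), len(noise))
    speech = [speech_a[i] + speech_b[i] for i in range(n)]
    # Pick a gain so that 20 * log10(rms(speech) / rms(gain * noise)) == snr_db.
    gain = rms(speech) / (rms(noise[:n]) * 10 ** (snr_db / 20))
    mixture = [speech[i] + gain * noise[i] for i in range(n)]
    return mixture, gain


# Toy stand-ins for two speakers (assumed: pure tones, one second at 8 kHz).
rate = 8000
t = [i / rate for i in range(rate)]
speaker_a = [math.sin(2 * math.pi * 220 * x) for x in t]
speaker_b = [0.5 * math.sin(2 * math.pi * 330 * x) for x in t]

# Artificial background noise (assumed: Gaussian, fixed seed for repeatability).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in t]

mixture, gain = mix_cocktail_party(speaker_a, speaker_b, noise, snr_db=10)
```

Because the clean tracks are known before mixing, they can serve as the training targets: a separation model is given `mixture` (plus, in Google's case, the face thumbnails) and is asked to recover `speaker_a` and `speaker_b`.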


Creepy machine learning! Let's continue that thread...
google  ai  audio 
april 2018 by charlesarthur
Someday, this is going to be transformational for the hearing impaired.
from twitter_favs
april 2018 by brycej
RT lunamoth : Google works out a fascinating, slightly scary way for AI to isolate voices in a crowd | Ars Technica http://bit.ly/2IWFs6k // Google's voice separation technology April 15, 2018 at 08:17PM http://twitter.com/lunamoth/status/985477310756413445
IFTTT  Twitter  ththlink 
april 2018 by seoulrain
Google researchers have developed a deep-learning system designed to help computers better identify and isolate individual voices within a noisy environment.

As noted in a post on the company's Google Research Blog this week, a team within the tech giant attempted to replicate the cocktail party effect, or the human brain's ability to focus on one source of audio while filtering out others—just as you would while talking to a friend at a party.

Google's method uses an audio-visual model, so it is primarily focused on isolating voices in videos.
twig  662 
april 2018 by leolaporte
Google researchers showcase AI tech that can isolate individual voices within a noisy environment in videos with a single audio track by watching mouth movement
april 2018 by joeo10
via Starred items from BazQux Reader https://ift.tt/1cAKc9M and IFTTT
Starred  items  from  BazQux  Reader 
april 2018 by stinkingpig
Google researchers have developed a deep-learning system designed to help computers better identify and isolate individual voices within a noisy environment.
Archive  Pocket  feedly  ifttt 
april 2018 by brokenrhino