You may notice a marked improvement in the audio quality of some YouTube Stories going forward, thanks to a new speech enhancement feature Google rolled out. A couple of years ago, the tech giant debuted the “Looking to Listen” AI technology that can pick out voices in a crowd. Now, it’s making the technology available to creators recording YouTube Stories on iOS devices.
Google taught Looking to Listen the correlations between speech and visual signals, such as the speaker’s mouth movements and facial expressions, by training it on a large collection of online videos. To ensure that it works for everyone and doesn’t show bias, the company conducted a series of tests exploring its performance across various visual and auditory attributes: the speaker’s age, skin tone, spoken language, voice pitch, face visibility, head pose, facial hair and presence of glasses, as well as the level of background noise. Google determined, for instance, that the technology’s ability to enhance speech remains fairly consistent across speakers’ languages. Facial hair doesn’t seem to have a big effect either, though the feature works best on faces that are clean-shaven or closely shaved.