Google's AI Can Lip Read Better Than Humans

Researchers from Google's AI division DeepMind and the University of Oxford have used artificial intelligence to create the most accurate lip-reading software ever. Using thousands of hours of TV footage from the BBC, the scientists trained a neural network to annotate video footage with 46.8 percent accuracy.

DEEPMIND’S AI PROGRAM WAS TRAINED ON 5,000 HOURS OF TV.
More than 5,000 hours of footage from TV shows including Newsnight, Question Time, and the World Today was used to train DeepMind's "Watch, Listen, Attend, and Spell" program. The videos included 118,000 different sentences and some 17,500 unique words, compared with LipNet's test database of video covering just 51 unique words.
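
As a rough, hypothetical illustration of what a "watch, attend, and spell" style model involves, the sketch below maps a sequence of per-frame visual features of a speaker's mouth to a sequence of characters using an encoder, an attention step, and a character decoder. It omits the audio "listen" stream, and every layer choice, size, and name in it is an assumption made for illustration, not a description of DeepMind's actual model.

# Minimal, hypothetical "watch, attend, and spell" sketch (not DeepMind's model):
# an encoder runs over per-frame visual features, and a decoder with attention
# emits one character at a time.
import torch
import torch.nn as nn

class LipReader(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, vocab_size=40):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)  # "watch": encode frame features
        self.embed = nn.Embedding(vocab_size, hidden)               # character embeddings
        self.decoder = nn.GRUCell(hidden * 2, hidden)               # "spell": emit characters
        self.attn = nn.Linear(hidden, hidden)                       # projection for dot-product attention
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, frame_feats, prev_chars):
        # frame_feats: (batch, time, feat_dim) visual features from mouth crops
        # prev_chars:  (batch, steps) previously emitted characters (teacher forcing)
        enc, h = self.encoder(frame_feats)      # enc: (batch, time, hidden)
        state = h.squeeze(0)                    # initial decoder state
        logits = []
        for t in range(prev_chars.size(1)):
            # "attend": weight encoder timesteps against the current decoder state
            scores = torch.bmm(enc, self.attn(state).unsqueeze(2)).squeeze(2)
            context = torch.bmm(scores.softmax(dim=1).unsqueeze(1), enc).squeeze(1)
            inp = torch.cat([self.embed(prev_chars[:, t]), context], dim=1)
            state = self.decoder(inp, state)
            logits.append(self.out(state))
        return torch.stack(logits, dim=1)       # (batch, steps, vocab_size)

# Toy call with random tensors standing in for real mouth-region features.
model = LipReader()
feats = torch.randn(2, 75, 512)                # 2 clips, 75 frames each
chars = torch.randint(0, 40, (2, 20))          # 20 target characters per clip
print(model(feats, chars).shape)               # torch.Size([2, 20, 40])

The toy call at the end only confirms the tensor shapes line up; a real system would feed in mouth-region crops processed by a visual front end and train on transcribed sentences like those in the BBC footage.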

DeepMind’s researchers suggest that the program could have a host of applications, including helping hearing-impaired people understand conversations. It could also be used to annotate silent films, or allow you to control digital assistants like Siri or Alexa by just mouthing words to a camera (handy if you’re using the program in public).

Researchers say there is still a big difference between transcribing brightly lit, high-resolution TV footage and grainy, low-frame-rate CCTV video, but it is hard to ignore the fact that artificial intelligence seems to be closing this gap.
