A breakthrough in video technology has given researchers at Massachusetts Institute of Technology (MIT) the power to animate images of real people saying words they've never actually spoken.
Tomaso Poggio, an investigator with MIT's McGovern Institute for Brain Research, and Tony Ezzat, an MIT graduate student in electrical engineering and computer science, have simulated mouth movements so realistic that they convince most viewers of their authenticity.
According to the researchers, given a few minutes of footage of any individual, the team can pair virtually any audio to any videotaped face, matching mouth movements to the words.
Poggio, who investigates learning theories that can be applied to understanding the brain and building intelligent machines, said: "This human animation technique is inevitable; it's just another step in progress that has happened over the last several years."
The MIT team's software records facial expressions while a person speaks into a camera, and learns to associate the images with sounds.
Using that database, new video of the person can be synthesised to match a soundtrack of words they never spoke.
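The learn-then-synthesise pipeline described above can be caricatured in a few lines. This is purely an illustrative sketch, not the MIT team's actual method: it pairs audio features with recorded mouth frames, then reuses the closest recorded frame for each new sound. All names and data here are hypothetical.

```python
# Toy sketch of the idea: build a database of (audio feature, mouth frame)
# pairs from training footage, then animate new audio by looking up the
# nearest recorded example. The real system learns a far richer model.

def train(pairs):
    """pairs: list of (audio_feature, mouth_frame) captured while speaking."""
    return list(pairs)  # the "database" is simply the stored examples

def synthesise(database, audio_track):
    """For each new audio feature, reuse the closest recorded mouth frame."""
    frames = []
    for feature in audio_track:
        _, frame = min(database, key=lambda pair: abs(pair[0] - feature))
        frames.append(frame)
    return frames

db = train([(0.1, "closed"), (0.5, "half-open"), (0.9, "open")])
print(synthesise(db, [0.15, 0.8]))  # → ['closed', 'open']
```

A nearest-neighbour lookup like this produces jerky output in practice; the appeal of a learned model is that it can interpolate smoothly between recorded mouth shapes.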
According to Poggio, the work could improve the man-machine interface by putting a "real" face on computer avatars.
Instead of the unrealistic, cartoon-like images that now exist, computerised people could become much more lifelike.
He pointed out that the technology has applications in business, entertainment, speech therapy and education.
The method could also be used to redub a film from one language to another, and in tasks such as eye-tracking, facial-expression recognition and visual speech estimation.
Even so, the researchers recognise that the technology could be misused, for example to discredit political activists on television or to make trusted figures appear to endorse products without their consent.
"The work is still in its infancy, but it proves the point that we can take existing video footage and re-animate it in interesting ways," Ezzat said.
The team still needed to work on re-animating emotions, he said. "We cannot handle footage with profile views of a person, but we are making progress toward addressing these issues."
The researchers have already begun testing the technology on videos of Ted Koppel, anchor of ABC's Nightline, with the aim of dubbing a show in Spanish, Ezzat explained.
The work, which will be presented as a research paper in July at the Siggraph conference, is funded by the National Science Foundation and Nippon Telegraph and Telephone (NTT) through the NTT-MIT Research Collaboration.