24 May 2002
A breakthrough in video technology has given researchers at the Massachusetts Institute of Technology (MIT) the power to animate images of real people saying words they have never actually spoken.
Tomaso Poggio, an investigator with MIT's McGovern Institute for Brain Research, and Tony Ezzat, an MIT graduate student in electrical engineering and computer science, have simulated mouth movements realistic enough to convince most viewers of their authenticity.
According to the researchers, given a few minutes of footage of an individual, the team can pair virtually any audio with that person's videotaped face, matching mouth movements to the words.
Poggio, who investigates learning theories that can be applied to understanding the brain and building intelligent machines, said: "This human animation technique is inevitable, it's just another step in progress that has happened over the last several years."
The MIT team's software records facial expressions while a person speaks into a camera, and learns to associate the images with sounds.
Using that database, a false image of the person can be synthesised to match a soundtrack of new words.
According to Poggio, the work could improve the man-machine interface by putting a "real" face on computer avatars.
Instead of the unrealistic, cartoon-like images that now exist, computerised people could become much more lifelike.
He pointed out that the technology has applications in business, entertainment, speech therapy and education.
The method could also be used to redub a film from one language to another, and in tasks such as eye-tracking, facial expression recognition and visual speech estimation.
Even so, the researchers recognise that the technology could be misused - to discredit political activists on television or to illegally use trusted figures to endorse products, for example.
"The work is still in its infancy, but it proves the point that we can take existing video footage and re-animate it in interesting ways," Ezzat said.
The team still needed to work on re-animating emotions, he said. "We cannot handle footage with profile views of a person, but we are making progress toward addressing these issues."
The researchers have already begun testing the technology on videos of Ted Koppel, anchor of ABC's Nightline, with the aim of dubbing a show in Spanish, Ezzat explained.
The work, which will be presented as a research paper in July at the Siggraph conference, is funded by the National Science Foundation and Nippon Telegraph and Telephone (NTT) through the NTT-MIT Research Collaboration.