A group of computer scientists have devised a system capable of analysing a stream of video and producing a text-based description of its content, in a development that could be a boon for firms that employ those with visual impairments.
The system, dubbed Video in Sentences out (Viso), aims to describe who did what to whom, using object-detection and tracking tools, which analyse each video frame to build up a picture of what is happening.
While the system is still in its infancy, it is able to distinguish between objects such as bags and balls, skateboards and trucks.
The team, led by Andrei Barbu, of Purdue University, also taught the system to distinguish between the different body positions people may adopt and a variety of actions that they may undertake – such as bending down, sitting or running.
That was necessary because, for the system to be effective, it had to be able to understand what was happening to an object while simultaneously determining the roles that objects play in events.
“Producing such rich descriptions requires determining event participants, the mapping of such participants to roles in the event, and their motion and properties,” the team wrote in a paper outlining their work.
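As a rough illustration of what mapping participants to roles can look like, the sketch below turns one detected event into a sentence. It is not the team's code: the `Participant` and `Event` classes, the `describe` function and the agent/patient role names are all hypothetical, and the detection and tracking steps that would supply these values are assumed to have already run.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """A hypothetical event participant, as a detector and tracker might report it."""
    noun: str               # object class, e.g. "person" or "ball"
    modifiers: tuple = ()   # properties such as posture or size

    def phrase(self) -> str:
        # Build a simple noun phrase: "the" + modifiers + noun.
        return " ".join(("the", *self.modifiers, self.noun))


@dataclass
class Event:
    """A hypothetical event: a verb plus the participants filling its roles."""
    verb: str               # past-tense verb, e.g. "hit"
    agent: Participant      # who did it
    patient: Participant    # who or what it was done to


def describe(event: Event) -> str:
    # Render "who did what to whom" as agent + verb + patient.
    return f"{event.agent.phrase()} {event.verb} {event.patient.phrase()}"


event = Event(
    verb="hit",
    agent=Participant("person", ("upright",)),
    patient=Participant("ball", ("big",)),
)
print(describe(event))
```

Swapping the agent and the patient produces a different sentence from the same detections, which is why the roles have to be worked out and not just the objects.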
The system was tested against a catalogue of video samples originally designed for testing US military research group DARPA's Mind's Eye automatic surveillance technologies.
The sentences produced by Viso – such as “the upright person hit the big ball” – were then compared against the descriptions generated by people signed up to Amazon's Mechanical Turk system.
While not yet perfect, Viso produced descriptions that were deemed in close agreement with the human-authored descriptions in about half of all cases.
The work was recently published on the arXiv repository of academic papers.