A group of computer scientists have devised a system capable of analysing a stream of video and producing a text-based description of its content, in a development that could be a boon for firms that employ those with visual impairments.
The system, dubbed Video In Sentences Out (Viso), aims to describe who did what to whom, using object-detection and tracking tools that analyse each video frame to build up a picture of what is happening.
While the system is still in its infancy, it is able to distinguish between objects such as bags and balls, skateboards and trucks.
The team, led by Andrei Barbu, of Purdue University, also taught the system to distinguish between the body positions people may adopt and a variety of actions that they are likely to undertake – such as bending down, sitting or running.
That was necessary because, for the system to be effective, it had to be able to understand what was happening to an object while simultaneously determining the roles that objects play in events.
“Producing such rich descriptions requires determining event participants, the mapping of such participants to roles in the event, and their motion and properties,” the team wrote in a paper outlining their work.
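To make the idea concrete, the final step the team describes, turning participants, their roles and their properties into a sentence, can be sketched roughly as follows. This is an illustration only and not the researchers' code: the `Participant` and `Event` types and the `describe` function are hypothetical names, and the sketch assumes detection and tracking have already produced the labels.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    noun: str       # object class from a detector, e.g. "person" (assumed input)
    adjective: str  # property such as posture or size, e.g. "upright"; may be empty


@dataclass
class Event:
    verb: str             # past-tense action label, e.g. "hit" (assumed input)
    agent: Participant    # who did it
    patient: Participant  # whom or what it was done to


def noun_phrase(p: Participant) -> str:
    """Build a phrase like 'the big ball', skipping an empty adjective."""
    return " ".join(word for word in ("the", p.adjective, p.noun) if word)


def describe(event: Event) -> str:
    """Fill a fixed agent-verb-patient template from the role assignment."""
    return f"{noun_phrase(event.agent)} {event.verb} {noun_phrase(event.patient)}"


example = Event(
    verb="hit",
    agent=Participant(noun="person", adjective="upright"),
    patient=Participant(noun="ball", adjective="big"),
)
print(describe(example))
```

The hard part of the real system lies upstream of this step, in working out from raw video which object fills which role.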
The system was tested against a catalogue of video samples originally designed for testing US military research group DARPA's Mind's Eye automatic surveillance technologies.
The sentences produced by Viso – such as “the upright person hit the big ball” – were then compared against the descriptions generated by people signed up to Amazon's Mechanical Turk system.
While not yet perfect, Viso produced descriptions that were deemed in close agreement with the human-authored descriptions in about half of all cases.
The work was recently published on the arXiv repository of academic papers.