The time has come for the semantic web to SPARQL

PodTech recently published
three videos of Sir Tim Berners-Lee in action. The first was a presentation of
the semantic web at HP Labs in Palo Alto, California, the second was the
resulting Q&A session, and the third a brief interview with Robert Scoble.
The ever-watchful Paul Miller of Talis has linked to all three
(tinyurl.com/2c6wm3).
The semantic web has been much heralded. A headline back in 2002 announced:
“The semantic web lifts off”. One of that article’s authors was Berners-Lee. Yet
six years later it is hard to use the term “lift-off” even though a lot of the
architectural underpinnings are already in place. Some tough problems lie ahead.
Trust, in particular.
It reminds me of the many IT projects that progress rapidly to 80% and then
slow to a crawl. However, Berners-Lee expects the new year to start with a very
useful new tool called SPARQL (yep, it’s pronounced “sparkle”). Like GNU before
it, it is a recursive acronym (it stands for SPARQL Protocol And RDF Query
Language) and is designed to pick up truly relevant information from the
internet in RDF (Resource Description Framework) format.
Until now, web searches have parsed the content of web pages and assembled
the results as lists of apparently matching URLs. But a list of URLs gives
software very little to work with when it comes to machine-processing the
results.
Better to use RDF to describe entities within documents and other data stores
so that the resulting hits are credible and can be manipulated to give more
comprehensive and meaningful results.
What’s of interest is not the web pages or documents themselves but the
networks of things (names, places, dates and so on) that transcend the
documents. They will match the original search and may then unearth related
documents, or the query results may lead to further automatic searches. It’s a
bit like the agent technology we’ve been talking about for so long.
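To make the idea of querying a network of things a little more concrete, here is a minimal sketch in plain Python. It is not a SPARQL engine, just a toy that matches patterns against (subject, predicate, object) triples in roughly the way a SPARQL basic graph pattern does. Every identifier and URL in it (the `ex:` names, the example.org addresses, the `match` and `query` helpers) is made up for illustration.

```python
# Toy RDF-style graph: each fact is a (subject, predicate, object) triple.
# All names and URLs are hypothetical, invented for this sketch.
TRIPLES = [
    ("ex:talk1", "ex:speaker", "ex:TimBernersLee"),
    ("ex:talk1", "ex:location", "ex:PaloAlto"),
    ("ex:talk1", "ex:describedIn", "http://example.org/video1"),
    ("ex:interview1", "ex:speaker", "ex:TimBernersLee"),
    ("ex:interview1", "ex:describedIn", "http://example.org/video3"),
    ("ex:panel1", "ex:speaker", "ex:WendyHall"),
]


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            # A constant must match the triple exactly.
            return None
    return new


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


# "Which documents describe events where Tim Berners-Lee spoke?"
results = query(
    [
        ("?event", "ex:speaker", "ex:TimBernersLee"),
        ("?event", "ex:describedIn", "?doc"),
    ],
    TRIPLES,
)
docs = sorted(r["?doc"] for r in results)
print(docs)  # ['http://example.org/video1', 'http://example.org/video3']
```

The point of the sketch is that the answer comes back as named values (`?event`, `?doc`) rather than as a list of pages to read, so a program can feed those values straight into a follow-up query.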
Fortunately, the W3C folk have invented a way to do some of the heavy lifting
needed to get us from where we are to where we need to be. It’s a way to extract
RDF from XML and XHTML sources. It’s called GRDDL (Gleaning Resource
Descriptions from Dialects of Languages).
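As a rough illustration of what "gleaning" means, the sketch below pulls triples out of a small XHTML fragment using only the Python standard library. It is not GRDDL itself: as I understand it, a GRDDL-aware document points to a transformation (commonly XSLT) that produces the RDF, whereas here a hand-written function stands in for that transformation. The markup conventions (`class="event"`, `class="speaker"`), the `ex:` prefix, the base URL and the `glean` function are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical XHTML page that marks up an event with class attributes.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="event" id="talk1">
      <span class="speaker">Tim Berners-Lee</span>
      <span class="location">Palo Alto</span>
    </div>
  </body>
</html>"""


def glean(xhtml, base="http://example.org/page"):
    """Turn marked-up XHTML into (subject, predicate, object) triples."""
    root = ET.fromstring(xhtml)
    triples = []
    for div in root.iter(XHTML_NS + "div"):
        if div.get("class") != "event" or div.get("id") is None:
            continue
        subject = f"{base}#{div.get('id')}"
        for span in div.iter(XHTML_NS + "span"):
            prop, text = span.get("class"), span.text
            # Skip spans that carry no class or no text.
            if prop and text and text.strip():
                triples.append((subject, "ex:" + prop, text.strip()))
    return triples


for triple in glean(SAMPLE):
    print(triple)
```

The attraction of this approach is that the page stays an ordinary web page for human readers while software gets structured statements out of the same source.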
One of Berners-Lee’s co-panellists in Palo Alto was Wendy Hall, from the
University of Southampton, who speculated about keeping her larder contents
online so delivery people could keep tabs on what needs replenishing. She said
she was amazed at how much time people have and how much they want to say about
themselves. Making information about yourself visible could lead to all sorts of
interesting consequences. She even started talking about revealing her vital
statistics, but then bit her tongue.
Which leads to security and trust. These are major issues for the semantic
web, certainly where personal data is concerned. It might be a case of the agent
software trawling through the links until it is sure of the provenance of a
piece of information or the authority of the user to receive it.
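One way to picture that trawling is as a search along "where did this come from?" links until a source the agent already trusts turns up. The sketch below is only that picture in code, under assumptions of my own: the `SOURCES` map, the `TRUSTED` set, the identifiers and the `provenance_chain` function are hypothetical, and a real system would also need signatures, policies and much else.

```python
from collections import deque

# Hypothetical "derived from" links between pieces of information.
SOURCES = {
    "claim:larder-needs-milk": ["feed:home-inventory"],
    "feed:home-inventory": ["org:trusted-grocer"],
    "claim:rumour": ["blog:anonymous"],
    "blog:anonymous": [],
}

# Hypothetical set of sources the agent already trusts.
TRUSTED = {"org:trusted-grocer"}


def provenance_chain(item, sources, trusted, max_hops=5):
    """Breadth-first search for the shortest chain from item to a trusted source.

    Returns the chain as a list, or None if no trusted source is reachable
    within max_hops links.
    """
    queue = deque([[item]])
    seen = {item}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in trusted:
            return path
        if len(path) > max_hops:
            # This path has already used max_hops links; do not extend it.
            continue
        for nxt in sources.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(provenance_chain("claim:larder-needs-milk", SOURCES, TRUSTED))
print(provenance_chain("claim:rumour", SOURCES, TRUSTED))
```

The hop limit is a design choice rather than a requirement: it simply stops the agent wandering indefinitely when no trustworthy origin exists.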
As someone who once wrote software to mimic the way the brain stores and
links information, I am only too aware how easy it is for ambiguities to creep
in. Presumably, by following the network of related links, software will be able
to disambiguate multiple uses of the same name or term.
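A very simple version of that disambiguation is to compare the things linked from each candidate meaning with the things mentioned around the ambiguous name, and pick the candidate with the most overlap. The sketch below assumes exactly that and nothing cleverer; the `ex:` identifiers, the `CANDIDATES` table and the `disambiguate` function are invented for illustration.

```python
# Hypothetical candidates for the ambiguous name "Paris", each with the
# set of things it is linked to in some graph of related entities.
CANDIDATES = {
    "ex:Paris_France": {"ex:France", "ex:Seine", "ex:Eiffel_Tower"},
    "ex:Paris_Texas": {"ex:Texas", "ex:Lamar_County"},
}


def disambiguate(context, candidates):
    """Pick the candidate sharing the most linked neighbours with the context.

    Returns None when there are no candidates or nothing overlaps.
    """
    if not candidates:
        return None
    scored = {name: len(links & context) for name, links in candidates.items()}
    best = max(scored, key=scored.get)
    return best if scored[best] > 0 else None


# The surrounding text mentions Texas, so the Texan Paris is chosen.
print(disambiguate({"ex:Texas", "ex:Dallas"}, CANDIDATES))  # ex:Paris_Texas
```

When nothing in the context overlaps, the function declines to guess, which seems the safer behaviour where ambiguities creep in so easily.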
Talis, which had a big hand in developing SPARQL and GRDDL, has incorporated
semantic web services in its platform. Using the APIs, it has developed Engage,
one of the first commercial semantic web apps.