Liquent, a specialist in document content conversion, has announced a new suite of solutions that unlocks legacy content using XML as the central storage type.
As well as automating the process of integrating legacy document formats while rigorously maintaining format integrity, the software goes on to provide granular indexing and retrieval capabilities.
"Liquent is not a repository and it does not deploy," said Hugh Tamassia, the company's chief technology officer. "But it is a bridge between legacy documents and XML content elements."
Liquent, whose name derives from 'liquid content', has a background in handling complex documents for the pharmaceutical industry, where it counts 33 of the top 50 global companies as customers.
It had previously used Adobe's Portable Document Format (PDF) as the common medium for transforming data, but Tamassia said: "PDF is not suited to all requirements and is too heavily controlled by its inventor Adobe. These problems are overcome by the World Wide Web Consortium XML open standard."
Three products - Xtent, Encore and Clarity - make up the suite. Xtent is an automatic content transformation engine that scans documents in any of 155 different formats, converting them to XML. In doing so, it maintains format accuracy, integrity and context, while also identifying the critical information required for XML structure creation.
It does this by using an XML schema that combines Scalable Vector Graphics (SVG) for visual fidelity, XHTML for textual content such as paragraphs, and Extensible Stylesheet Language Formatting Objects (XSL-FO) for document styling.
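To make the idea of one document drawing on several XML vocabularies concrete, here is a minimal Python sketch. It is not Liquent's schema, which the announcement does not detail: the `doc:page` wrapper element, its `urn:example:compound-document` namespace, the sample text and the `summarise` helper are all invented for illustration. Only the SVG, XHTML and XSL-FO namespace URIs are the real W3C ones.

```python
import xml.etree.ElementTree as ET

# The svg, xhtml and fo URIs are the real W3C namespaces.
# The "doc" wrapper namespace is hypothetical, made up for this sketch.
NS = {
    "doc": "urn:example:compound-document",
    "svg": "http://www.w3.org/2000/svg",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "fo": "http://www.w3.org/1999/XSL/Format",
}

# An invented page mixing the three vocabularies named in the article.
SAMPLE = """\
<doc:page xmlns:doc="urn:example:compound-document"
          xmlns:svg="http://www.w3.org/2000/svg"
          xmlns:xhtml="http://www.w3.org/1999/xhtml"
          xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <svg:svg width="100" height="50">
    <svg:rect x="0" y="0" width="100" height="50"/>
  </svg:svg>
  <xhtml:p>Dosage must not exceed 10 mg per day.</xhtml:p>
  <fo:block font-family="serif" font-size="11pt"/>
</doc:page>
"""


def summarise(xml_text):
    """Count elements per vocabulary and collect the XHTML paragraph text."""
    root = ET.fromstring(xml_text)
    uri_to_prefix = {uri: prefix for prefix, uri in NS.items()}
    counts = {prefix: 0 for prefix in NS}
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace-uri}localname".
        if element.tag.startswith("{"):
            uri = element.tag[1:].split("}")[0]
            if uri in uri_to_prefix:
                counts[uri_to_prefix[uri]] += 1
    paragraphs = [p.text for p in root.iterfind(".//xhtml:p", NS)]
    return counts, paragraphs


if __name__ == "__main__":
    counts, paragraphs = summarise(SAMPLE)
    print(counts)
    print(paragraphs)
```

Because each vocabulary keeps its own namespace, a consumer can pick out the text, the graphics or the styling on its own, which is the kind of separation that makes later conversion to other formats practical.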
The resulting XML can then be transformed again into a large number of formats such as HTML, PDF, SVG, Open eBook, Wireless Markup Language and Microsoft Word.
Leonor Ciarlone, senior consultant with US marketing research firm Cap Ventures, said: "Xtent fills a hole that is coming up rapidly in content management: how to quickly normalise all sorts of proprietary formats with all their glitches into a vendor neutral meta-language."
Encore moves content in both directions between most repositories, including Documentum, Lotus Notes, OpenText, FileNet and Hummingbird, even when these are spread across multiple locations.
The Clarity content classification and categorisation engine 'understands' and extracts meaning from the content which it can then classify against specific industry taxonomies. It can go on to trawl corporate and web sources for related information to enhance results, and integrate it with management systems and portals.
Ciarlone identified three separate market opportunities. The first was to help marry content management and content transformation through original equipment manufacturer relationships such as those already struck with Interwoven, Percussion and Software AG.
The second, aimed at enterprises, was the suite's ability to push content between their repositories, categorising information so as to unlock and add value to their information assets.
This would be enhanced by the third opportunity of creating more vertical industry solutions to maximise customer benefits. "But it will be interesting to see how [Liquent] handles its opportunities," said Ciarlone.