25 Jan 2006
IBM has released the source code of its Unstructured Information Management Architecture (UIMA) technology to the open source community.
The Java-based technology enables the analysis of information in unstructured files such as documents, images, comments, email messages and multimedia files.
Rather than perform search queries based on keywords, UIMA allows users to look for concepts and related topics.
"We are making UIMA available to the open source community to encourage innovation and allow analytics software tools from multiple sources to work together and build on each other," said Nelson Mattos, vice president of information and interaction at IBM Research.
UIMA was first unveiled in 2004 and was developed by the Defense Advanced Research Projects Agency and IBM.
Several organisations, including the International Federation of Pharmaceutical Manufacturers and Associations and the Memorial Sloan-Kettering Cancer Center, have adopted the technology as a foundation for developing knowledge management applications.
UIMA is also supported by knowledge management vendors including ClearForest, Cognos and Factiva.
The UIMA code is governed by the Common Public Licence, a licence approved by the Open Source Initiative. The code has been made available on the SourceForge website.
IBM will act as the project's steward for now, but intends to move UIMA to a full open source community development model later this year.
Nice thoughts, but already implemented in InfoCodex.
Thanks for the interesting article. Once again IBM is giving us a great vision of the future and of how unstructured information can be searched. InfoCodex already does all this today with the help of a linguistic database and synonym and/or similarity search across five languages (German, French, Italian, English and Spanish). With InfoCodex you can search for a block of text in one language and it will find all the similar documents in the other languages as well. All of this is done without a single minute of training, because the linguistic database contains 2.9 million words and terms (e.g. "European Court of Justice" or "The President of the United States" are terms and are recognised as such).
Posted by: Zeno Davatz 09 Aug 2006