Intel has released an open source tool designed to improve firms' handling and analysis of unstructured data.
Intel said that its GraphBuilder tool aims to fill a gap in the handling of big data for machine learning. Currently available as a beta release, the tool allows developers to construct large graphs which can then be used with big data analysis frameworks.
"GraphBuilder not only constructs large-scale graphs fast but also offloads many of the complexities of graph construction, including graph formation, cleaning, compression, partitioning, and serialisation," wrote Intel principal scientist Ted Willke.
"This makes it easy for just about anyone to build graphs for interesting research and commercial applications."
Willke said that the tool was developed in collaboration with researchers at the University of Washington in Seattle. The teams sought to address a perceived hole in the market for tools to build the graph data used for many big data analysis activities.
"Scanning the environment, we identified a more general hole in the open source ecosystem: A number of systems were out there to process, store, visualise, and mine graphs but, surprisingly, not to construct them from unstructured sources," Willke explained.
"So, we set out to develop a demo of a scalable graph construction library for Hadoop."
The researchers estimate that GraphBuilder can help big data platforms analyse data as much as 50 times faster than the conventional MapReduce system.
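To make the construction stages Willke lists more concrete, here is a minimal Python sketch of forming, cleaning, partitioning and serialising a small word-adjacency graph from raw text. It is an illustration only and does not use GraphBuilder's API or Hadoop: every function name, the sample documents and the choice of adjacent-word edges are assumptions made for this example, and the compression stage is left out.

```python
import json
import re
from collections import defaultdict
from zlib import crc32


def form_edges(documents):
    """Formation: emit one (word, next_word) edge per adjacent word pair."""
    edges = []
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        edges.extend(zip(words, words[1:]))
    return edges


def clean_edges(edges):
    """Cleaning: drop self-loops and merge duplicate edges into weights."""
    weights = defaultdict(int)
    for src, dst in edges:
        if src != dst:
            weights[(src, dst)] += 1
    return dict(weights)


def partition_edges(weighted_edges, num_partitions):
    """Partitioning: place each edge by a stable hash of its source vertex."""
    partitions = [[] for _ in range(num_partitions)]
    for (src, dst), weight in sorted(weighted_edges.items()):
        index = crc32(src.encode("utf-8")) % num_partitions
        partitions[index].append((src, dst, weight))
    return partitions


def serialise(partitions):
    """Serialisation: produce one JSON string per partition."""
    return [json.dumps(part) for part in partitions]


# Hypothetical sample input standing in for an unstructured source.
documents = ["big data data graphs", "big data analysis"]
graph = clean_edges(form_edges(documents))
parts = partition_edges(graph, 2)
payloads = serialise(parts)
```

Hashing on the source vertex keeps all outgoing edges of a vertex in the same partition, which is one simple way to split a graph across workers; a production system would likely use a more careful scheme.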
The project is one of many research efforts dedicated to improving the performance of big data analysis platforms. Last month, researchers from the University of California, Berkeley showcased a pair of technologies dubbed 'Spark' and 'Shark' which promise to dramatically improve the performance of the Apache Hive big data system.
The big data market has been suffering from a general lack of qualified analysts and developers, say vendors. Companies have sought to help bridge the gap by extending training efforts and partnerships with universities.