Users of the popular big data analytics platform Hadoop can now query data in real time, thanks to a breakthrough from Cloudera.
Cloudera, a commercial Hadoop vendor, has launched Impala, an open source, Apache-licensed, real-time query engine that works on data stored in the Hadoop Distributed File System (HDFS) and HBase.
Impala will allow organisations to query petabytes of data at once. The product is still in beta and can be downloaded from the Cloudera website.
"The motivation for Hadoop was always to process large amounts of data, but the question was whether Hadoop could go beyond batch-processing," Cloudera chief operating officer Kirk Dunn told V3.
"Two years ago Cloudera launched the project Impala to try and speed up the analytics process so Hadoop could return data fast enough to inform business decisions in real time. You can now use Impala to get results in seconds. It's a large and fundamental advancement in the platform."
Cloudera Enterprise will also soon be available to Hadoop users, as an optional management and support subscription module. Dunn said the offering will be available from the start of next year.
Only last week, Hadoop founder Doug Cutting told V3 the platform was about to get faster and more interactive.
Hadoop is a collection of software that includes a distributed file system, which stores large amounts of data; MapReduce, which processes that data; and Common, the shared infrastructure that supports the project.
Companies can use Hadoop for the types of analyses that business intelligence tools and big data SQL analysis tools are not designed to handle.
MapReduce is a batch processing system, one in which data is collected and processed on a batch-by-batch basis.
This has meant that while Hadoop is highly scalable and allows users to query petabytes of data, the high latency that comes with batch processing has until now slowed down data analysis.
Cloudera is promoting Impala as the first management solution that allows batch and real-time operations to be performed on large amounts of data at the same time.