Hadoop specialist MapR Technologies has unveiled a stream processing tool designed for real-time event handling and high scalability, which it is combining with its existing storage and NoSQL tools to deliver what it claims is the industry's first converged data platform.
MapR Streams is currently available only under an early access programme, but is set for general availability from early 2016. When combined with MapR-FS for storage and the MapR-DB in-Hadoop NoSQL database, it makes up the MapR Converged Data Platform, a single system for streams, file storage, databases and analytics, according to MapR.
"Unlike any other event streaming system that is on the market today, MapR Streams is integrated into a full data platform, and at the top level what that means is converging the data in motion with the data at rest, which really speeds the agility a company can have in understanding the data they've just collected or created," MapR chief marketing officer Jack Norris told V3.
MapR describes the new tool as a global publish-and-subscribe event streaming system that can scale to handle billions of messages per second and tie together clusters operating in geographically separate data centres across the globe.
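For readers unfamiliar with the publish-and-subscribe model, the idea can be sketched in a few lines of Python. This is only a toy, in-memory illustration under stated assumptions: the `TopicLog` class, its `publish` and `poll` methods and the `sensor-events` topic name are all hypothetical and are not MapR's or Kafka's API. It shows producers appending events to a named topic while each subscriber reads at its own pace.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory publish/subscribe log (hypothetical, not MapR's or Kafka's API)."""

    def __init__(self):
        # Each topic is an append-only list of messages.
        self._messages = defaultdict(list)
        # Each (subscriber, topic) pair tracks its own read position.
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to a topic and return its offset."""
        self._messages[topic].append(message)
        return len(self._messages[topic]) - 1

    def poll(self, subscriber, topic):
        """Return messages the subscriber has not yet seen on this topic."""
        start = self._offsets[(subscriber, topic)]
        new = self._messages[topic][start:]
        self._offsets[(subscriber, topic)] = start + len(new)
        return new


log = TopicLog()
log.publish("sensor-events", {"device": "a1", "temp": 21.5})
log.publish("sensor-events", {"device": "b2", "temp": 19.0})

# Two independent subscribers each receive every event.
analytics_batch = log.poll("analytics", "sensor-events")
alerting_batch = log.poll("alerting", "sensor-events")
print(len(analytics_batch), len(alerting_batch))
```

Because each subscriber keeps its own position in the log, an analytics job and an alerting job can consume the same stream without interfering with one another. A production system would add persistence, partitioning and replication, none of which this sketch attempts.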
Such a tool is needed because data tends to flow in from its sources continuously, MapR said, whereas many earlier big data platforms focused only on processing previously collected information in batches.
"The reality is that big data is generated one event at a time, and those events can be sensor feeds, devices and biometrics and so forth, or they can be call logs or system logs that are related to the status of certain machines, but they can also be transaction based, tracking customer interactions across transactional systems as well as social media," Norris said.
"The ability to control and analyse those events as fast as possible, and in an integrated fashion, is what's so important about MapR Streams," he added.
Combining MapR Streams into a Converged Data Platform also eliminates the separate silos of data for streaming, analytics and traditional systems of record, which can cause data duplication and latency issues, the firm said.
"With a converged platform, you've got it all together in a single cluster, and that leads to much faster processing of the data," Norris claimed. This enables organisations to analyse and act on events as they happen, a capability that could become increasingly important with the Internet of Things (IoT).
MapR Streams is built on top of the MapR Data Services layer, so it inherits the scalability, performance, and reliability of the core platform, all with a unified security framework, the firm said. It also supports standard application programming interfaces (APIs) such as those of the Apache Kafka project, enabling integration with other popular stream processing frameworks like Spark Streaming, Storm, Flink and Apex.
When available, the MapR Converged Data Platform will be offered as a free-to-use Community Edition, allowing developers to experiment with the system, and as a paid-for Enterprise Edition consisting of one or more of the components, backed by enterprise-grade service level agreements (SLAs) for high availability, data protection and disaster recovery.