Basho Technologies has launched an integrated platform that combines the firm's Riak distributed NoSQL database with a suite of supporting services, including an object data store, the Apache Spark processing framework and management tools, to deliver a complete system for big data processing.
Set to be available from June, the Basho Data Platform builds on the Riak database, now known as Riak KV (for key-value), and is intended to greatly simplify the deployment and operation of a big data platform supporting organisations' mission-critical active workloads.
"People select Riak when they need a database that stays on no matter what happens. They have mission-critical systems that cannot go down or cannot suffer a mis-write," Basho's EMEA managing director Emmanuel Marchal told V3.
Basho is now extending the operational simplicity and reliability of Riak to turn it into a complete big data solution. The firm has integrated a number of extra tools and technologies with an eye on the scalability and reliability of the complete platform.
Along with the Riak KV database, the new platform integrates Basho's Riak S2 (formerly Riak CS) object storage back end and a number of service instances, including Apache Spark, the Redis in-memory data store for caching, and the Apache Solr search engine.
However, Marchal explained that Basho intends to add further data storage back ends to the Basho Data Platform in the future, as well as expanding on the number of service instances bundled with the software.
Basho also provides additional tools, including message routing, networking, and co-ordination, as well as data replication and cluster management and monitoring services across all the nodes of a Basho Data Platform deployment.
The aim is to offer customers a single solution for big data processing that is relatively easy to set up and operate. Part of the platform's value-add is a set of tools to ensure high availability and scalability of components such as Redis and Spark, according to Marchal.
"To deploy Spark as a highly available fault-tolerant cluster, things get a bit complex. Spark is a master-slave architecture and one node needs to be the leader. As soon as you do that, you have a single point of failure, so you need to put something in place to overcome that," he said.
Organisations often deploy tools such as Apache Zookeeper, but this is "not trivial to use", Marchal said, so Basho instead uses the core technology in Riak and the data platform to provide high availability support for Spark.
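For context, the ZooKeeper-based approach Marchal refers to is Spark's standard standby-master recovery mode, configured on each master candidate roughly as follows (the ZooKeeper hostnames here are placeholders):

```shell
# spark-env.sh on each Spark master candidate (standard standalone HA setup).
# If the active master fails, a standby is elected via ZooKeeper and recovers
# the cluster state stored under the given znode directory.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```

This is the setup overhead Basho says its platform removes by providing leader election through Riak's own core technology instead.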
Meanwhile, the Spark connector included with the Basho Data Platform allows the Spark cluster to retrieve operational data from Riak, perform the analytical calculations and then send the result back to Riak.
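The round trip the connector supports can be sketched as follows. Note this is an illustrative sketch only: the `RiakBucket` class below is an in-memory stand-in, not Basho's actual client or connector API.

```python
# Illustrative read/compute/write-back cycle. RiakBucket is a hypothetical
# in-memory stand-in for a Riak KV bucket, NOT a real Basho API.

class RiakBucket:
    """Minimal key-value bucket stand-in."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def values(self):
        return list(self._data.values())

# 1. Operational data lives in Riak.
events = RiakBucket()
events.put("evt:1", {"user": "a", "ms": 120})
events.put("evt:2", {"user": "b", "ms": 340})
events.put("evt:3", {"user": "a", "ms": 95})

# 2. A Spark job would pull these records and compute an aggregate...
latencies = [e["ms"] for e in events.values()]
avg_latency = sum(latencies) / len(latencies)

# 3. ...then write the result back to Riak for the application to serve.
results = RiakBucket()
results.put("stats:avg_latency", avg_latency)

print(results.get("stats:avg_latency"))  # 185.0
```

In the real platform, steps 2 and 3 would run as a distributed Spark job against Riak KV rather than in-process Python.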
Basho did not disclose pricing for the Basho Data Platform but, as with most software based on open source, it is offered to enterprise customers as part of a service and support agreement.