Hadoop founder says big data platform to get faster and more interactive

by Rosalie Marshall

16 Oct 2012

Despite the success in recent years of the open source Hadoop big data analytics platform, the project's founder Doug Cutting has said he believes there is still room for improvement.

Speaking to V3, Cutting, who now works for Cloudera having previously worked at Yahoo, said that Hadoop should be just as renowned for fast data processing as it is now for big data number crunching.

"Hadoop started out as a batch computing environment built around the MapReduce computing metaphor," said Cutting.

"People are storing more data and doing more batch analysis, but I think there will soon be a move to interactive online computing, where queries take seconds to run."

Hadoop is a collection of software that includes a distributed file system for storing large amounts of data, MapReduce, which processes the data, and Common, the shared infrastructure that supports the project.
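
To give a rough sense of the MapReduce model, the sketch below counts words across a small batch of text lines in plain Python. It is an illustration only and does not use Hadoop or its APIs: the names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical, and everything runs in a single process, whereas Hadoop spreads the same steps across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce over a whole batch of records at once."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    batch = ["big data big queries", "batch data"]
    print(run_job(batch))
```

The whole batch has to pass through every phase before any result is available, which is the source of the latency described in the article.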

Companies can use Hadoop for the types of analyses that business intelligence tools and big data SQL analysis tools are not designed to handle.

MapReduce, running on top of the distributed file system, is a batch processing system, in which data is collected and processed on a batch-by-batch basis.

This means that while Hadoop is highly scalable and allows users to query petabytes of data, the high latency that comes with batch processing slows down data analysis.

Cutting said that to improve big data analytics, a Hadoop format needs to be developed that allows data to be interoperable between different systems. He is currently working on a project called [Apache] Avro to do this.

Additionally, Hadoop processing needs to be pushed online and become less batch oriented, so that queries take seconds rather than minutes or hours, said Cutting.

"I think search technology will play more of a role in big data to make it more interactive, such as that of [Apache] Lucene," he said. "Hadoop certainly has a way to go in terms of improvements."
