/v3-uk/analysis/2123281/analytics-challenges-mainstream-adoption
09 Nov 2011, Rosalie Marshall, V3
Industry analysts have pinpointed big data analytics as the next wave of innovation for the IT industry, while user case studies are starting to emerge that demonstrate the value big data can bring to society.
However, not all organisations are ready for the latest analytics technology, and professionals with the skills to make sense of big data are in short supply. This may mean that only the most forward-thinking organisations invest in big data analytics in the near future, and that it may be years before the average business jumps on the bandwagon.
Big data analytics technology differs from traditional data warehousing and reporting technology because it can cope with extremely large data sets that often include unstructured data coming from a range of internet applications, devices and sensors. Big data environments are able to answer specific questions on historic as well as projected data.
According to the Gartner hype cycle, big data is showing the first signs of early adopter investigation this year, and the analyst firm believes that it will be one of the top enterprise technologies in 2012.
Meanwhile, Ovum has released bullish figures on how it expects big data analytics take-up to increase. A recent survey by the analyst firm showed that 44 per cent of businesses storing more than 1TB of data have plans to invest in the technology in the next two to five years.
Ovum analyst Tony Baer told V3 that the field of data science had seen a significant growth spurt since a number of large IT vendors planted stakes in the industry.
EMC bought Greenplum in 2010, and has since launched the Greenplum Data Computing Appliance (DCA), which combines storage and recovery options with the Greenplum massively parallel processing (MPP) database. The DCA also contains support for the Apache Hadoop framework, which is the leading NoSQL analytics platform for processing unstructured data.
"Greenplum's strength is its MPP architecture, which means that the processing nodes can deal with multiple queries at the same time. For example, financial institutions using the database could query the nodes to find out which customers are over their credit card limits, at the same time as carrying out real-time credit checks," said Mark Sears, Greenplum solutions architect, in an interview with V3.
Because the DCA is also integrated with Hadoop technology, the sentiment of customers can be further processed and analysed at the same time, said Sears.
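To illustrate the idea Sears describes, the following is a minimal sketch of two queries being spread across several nodes at the same time. It is not Greenplum code: the partition layout, the field names and the functions `over_limit`, `credit_check` and `run_both_queries` are all hypothetical, and Python threads stand in for what would be separate machines in a real MPP system.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical customer records, split across three "nodes" (partitions).
PARTITIONS = [
    [{"id": 1, "balance": 1200, "limit": 1000},
     {"id": 2, "balance": 300, "limit": 500}],
    [{"id": 3, "balance": 50, "limit": 2000},
     {"id": 4, "balance": 2500, "limit": 2000}],
    [{"id": 5, "balance": 700, "limit": 700}],
]


def over_limit(partition):
    """Per-node work for query 1: ids of customers over their credit limit."""
    return [c["id"] for c in partition if c["balance"] > c["limit"]]


def credit_check(partition, customer_id):
    """Per-node work for query 2: remaining credit, if the customer is stored here."""
    return [c["limit"] - c["balance"] for c in partition if c["id"] == customer_id]


def run_both_queries(customer_id):
    """Scatter both queries to every partition, then gather the partial results."""
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        # All per-node tasks for both queries are submitted before any result
        # is collected, so the two queries overlap rather than run in sequence.
        q1 = [pool.submit(over_limit, p) for p in PARTITIONS]
        q2 = [pool.submit(credit_check, p, customer_id) for p in PARTITIONS]
        over = sorted(cid for f in q1 for cid in f.result())
        remaining = [r for f in q2 for r in f.result()]
    return over, remaining
```

Calling `run_both_queries(3)` on the sample data returns the over-limit customers alongside the remaining credit for customer 3. The point of the sketch is the scatter and gather pattern, not the performance of this toy version.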
IBM, HP and Teradata are among the other large IT vendors that have acquired pure-play, SQL-based big data analytics firms in the last year: Netezza, Vertica and Aster Data respectively.
Exciting use cases of big data analytics are those that bring major changes to people's lives. One such initiative, the Global Viral Forecasting project, combines anthropological research in diseased areas with social media trends to prevent outbreaks of global pandemics.
A US example called Cabspotting uses a set of data only available in recent years - the GPS co-ordinates of taxis - to reveal where and when residents are moving around San Francisco.
Still, day-to-day uses of big data analytics are harder to find.
"There's lots of talk and lots of interest around the technology, but not many people leveraging it," said MWD analyst Helena Schwenk in an interview with V3. "That's because it's big data, which carries a lot of challenges."
Teradata business development manager Kevin Long said that big data analytics is still being explored, and that businesses are unlikely to commit large amounts of cash to projects that do not have a solid return on investment.
"At the moment the hype is ahead of business drivers," said Long. However, he added that there are some areas of big data analytics in which businesses are showing growing interest.
"Many businesses want to better understand customer transactions so they can better detect fraud. We also have a lot of interest from call centres that want to use the technology to see whether customer calls were negative or positive and connect this with more structured data, like how many calls were made into the centre," Long said.
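The call centre scenario Long describes could be sketched along the following lines. This is an illustration only, not Teradata's approach: the keyword lists, the `sentiment` and `summarise` functions and the sample data are all assumptions, and a real deployment would use a proper text analytics model rather than word matching.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real system would use a trained model.
POSITIVE = {"thanks", "great", "helpful", "resolved"}
NEGATIVE = {"angry", "cancel", "useless", "complaint"}


def sentiment(transcript):
    """Label an unstructured call transcript as positive, negative or neutral."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarise(calls, call_counts):
    """Join per-call sentiment with the structured count of calls per customer."""
    tally = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for customer_id, transcript in calls:
        tally[customer_id][sentiment(transcript)] += 1
    return {
        cid: {"calls_logged": call_counts.get(cid, 0), **counts}
        for cid, counts in tally.items()
    }


# Usage: transcripts are the unstructured input, call_counts the structured one.
calls = [
    ("c1", "Thanks, that was great"),
    ("c1", "I want to cancel"),
    ("c2", "ok"),
]
call_counts = {"c1": 5}
report = summarise(calls, call_counts)
```

Each entry in `report` places the sentiment tallies next to the structured call count for the same customer, which is the kind of link between unstructured and structured data that Long refers to.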
Andrew De Rozairo, business development manager at SAP Sybase, said that big data analytics technology is still too expensive to become mainstream.
"It's challenging finding customers out there doing big data analytics because building projects that can handle big data requires huge amounts of cash," he said.
"Or if organisations decide to go down the Hadoop NoSQL route then the technology is cheap but the skills are expensive, as there are not a lot of professionals out there with Java coding skills."
Rozairo suggested that big data analytics requires a new breed of professional with a wide range of skills, from programming, data mining, statistics and mathematics to more creative skills for drawing visualisations of data sets.
"How many people can have this kind of skillset out there?" asked Rozairo.
This shortage of data scientists has been picked up by the McKinsey Global Institute, which has forecast that by 2018 the US alone could face a shortage of up to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
Rozairo added that big data analytics is further held back by data protection laws, which make some organisations wary of processing certain data.
But even given these multiple challenges, Rozairo urged businesses to start thinking about big data analytics. He pointed out that the financial crash occurred because mortgage brokers and banks had failed to recognise big data.
"We have one customer called CoreLogic who did analyse big data before the crash using Sybase IQ, and studied all detailed mortgage payment data for every residential mortgage applied for in the US. They warned people of what was going to happen but no one wanted to hear," said Rozairo.
Steve Jones, global head of master data management at Capgemini, warned organisations not to embark on any big data analytics projects until they are sure of having good quality data.
"It's easy to find patterns in massive amounts of data but this is no good if you don't have a clean set of data to start off with," he said.
Jones added that many organisations can analyse large amounts of data by integrating the data warehouses they already own. It's only when they need real-time processing of data such as CCTV footage, phone calls or comments left on social networking sites that they should consider updating their analytics suite to something geared towards processing structured and unstructured data.
"Unstructured data analytics is a whole different category people are taking on. It counts as big data even though the data may not be in the exabytes," said Jones.
Jones suggested that the big data analytics market is still relatively immature and that there is lots of room for the vendors in the space to flesh out their offerings.
"There is no key player leading the market, and no big data analytics product boxed up that everyone can use," said Jones.
"Vendors need to focus on integrating data quality and security into big data environments. They also need to produce environments that can be queried by both Hadoop and SQL technologies."
Jones said that EMC Greenplum DCA is one such environment, but pointed out that it still lacks interoperability with other types of processing technology businesses may use, such as those supplied by Oracle or Teradata.
A number of issues have to be addressed before the full potential of big data can be exploited by organisations, such as the price, maturity and interoperability of big data analytics technology. The shortage of professionals with the skills needed to manage big data is another challenge facing the industry.
Promoting use cases of big data analytics as they arise will alert organisations to the potential that can come with big data. However, before an organisation does embark on the next field of analytics, the IT department needs to put strong processes in place to ensure that business data is of a high standard, and can all be analysed in the same place.