A panel of experts at an event from V3's sister title Computing has revealed the mistakes they made when attempting to launch big data initiatives within their organisations.
Speaking at the Big Data and IoT Summit recently, Gael Decoudu, head of data science & digital analytics at Shop Direct, said that his firm began by creating an impressive data lake, but then quickly drowned in it.
"The approach we took was to create a massive data lake, to collect as much data as we could," said Decoudu. "So we invested money in that, then after a couple of years realised that we couldn't do anything with it. So then we started thinking about hiring a team of data scientists and analysts who can find insights in that data," he added.
Dr Kevin Findlay, IT & digital board director at insurance firm Complete Cover Group, described the problems his organisation found with open source software.
"We went down the open source route - and we used [coding language] Python. There are just one or two platforms in the data lake world, based around the Hadoop infrastructure, so we went for one of those. Then one guy last year spent a lot of time writing neural network algorithms instead of just using the standard packages," said Findlay.
So far so good. But Findlay then admitted that, whilst technically impressive, this work didn't actually create any value.
"It was more for his own educational value though. The other side of the coin is that there's a freely available Python library that does the same, so what's the point in making your own?"
Decoudu also commented on open source, suggesting that it can be hard to know which supporting tools and software to use.
"We've slowly moved on to open source, and we're now on AWS [Amazon Web Services], and we're starting to use [programming language] R. One of the problems with open source is it's hard for someone who doesn't have lots of experience to pick the right package. There are probably 20 different ways of doing neural networks in R and Python, but which is the right one? Which should you use in a business setting? Getting that wrong can cost the company millions of pounds," said Decoudu.
Jude McCorry, head of business development at The Data Lab, advised firms to encourage their younger staff to be proactive.
"Some companies get excited about saying they hire graduates in data science programmes, but often those graduates just sit there and wait for work to come to them. They're supposed to be there to answer questions about the data, and be self-starters," she argued.