A panel of experts at an event from V3's sister title Computing has revealed the mistakes they made when attempting to launch big data initiatives within their organisations.
Speaking at the Big Data and IoT Summit recently, Gael Decoudu, head of data science & digital analytics at Shop Direct, said that his firm began by creating an impressive data lake, but then quickly drowned in it.
"The approach we took was to create a massive data lake, to collect as much data as we could," said Decoudu. "So we invested money in that, then after a couple of years realised that we couldn't do anything with it. So then we started thinking about hiring a team of data scientists and analysts who can find insights in that data," he added.
Dr Kevin Findlay, IT & digital board director at insurance firm Complete Cover Group, described the problems his organisation found with open source software.
"We went down the open source route - and we used [coding language] Python. There are just one or two platforms in the data lake world, based around the Hadoop infrastructure, so we went for one of those. Then one guy last year spent a lot of time writing neural network algorithms instead of just using the standard packages," said Findlay.
So far, so good. But Findlay then admitted that, while technically impressive, this work didn't actually create any value.
"It was more for his own educational value though. The other side of the coin is that there's a freely available Python library that does the same, so what's the point in making your own?"
Decoudu also had words to add about open source, suggesting that it can be hard to know which supporting tools and software to use.
"We've slowly moved on to open source, and we're now on AWS [Amazon Web Services], and we're starting to use [programming language] R. One of the problems with open source is it's hard for someone who doesn't have lots of experience to pick the right package. There are probably 20 different ways of doing neural networks in R and Python, but which is the right one? Which should you use in a business setting? Getting that wrong can cost the company millions of pounds," said Decoudu.
Jude McCorry, head of business development at The Data Lab, advised firms to encourage their younger staff to be proactive.
"Some companies get excited about saying they hire graduates in data science programmes, but often those graduates just sit there and wait for work to come to them. They're supposed to be there to answer questions about the data, and be self-starters," she argued.