DALLAS: Online auction giant eBay has revealed how it makes use of its vast collections of data to conduct experiments on its users in order to maximise selling prices and customer satisfaction.
Speaking at the Teradata Partners conference in Dallas, Texas, eBay principal architect Tom Fastner told an audience of database professionals how the firm takes on the challenges and pressures of dealing with such large amounts of information.
"If you're a customer, you're part of our experimentation," he said. "We do this with everybody who's online and we're running 200 of these tests at the same time."
The tests in question range from barely noticeable alterations to the dimensions of product images right up to complete overhauls of the way content is displayed on users' personal eBay home pages.
"We have data about everything," he said. "We know what they look at, we know what they like, we know what they buy."
In monitoring its 100 million customers' interactions, from every button they click to every product they buy, eBay creates 12TB of data per day, which is continually added to a 4 petabyte table containing 4tn rows. As the data is queried both by automated monitoring systems and by employees looking to extract more meaning from it, data throughput reaches 100 petabytes (102,400TB) per day.
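To put those figures in proportion, the short Python sketch below works through the arithmetic. It uses only the numbers quoted above and assumes binary units (1 petabyte = 1,024TB), which is the convention implied by the 102,400TB conversion. The variable names are illustrative, and the derived values are rough back-of-the-envelope estimates rather than figures supplied by eBay.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Assumption: binary units (1 PB = 1,024 TB), matching 100 PB = 102,400 TB.
TB_PER_PB = 1024
BYTES_PER_PB = 1024 ** 5

daily_ingest_tb = 12          # new data created per day
table_size_pb = 4             # size of the main table
table_rows = 4 * 10 ** 12     # 4tn rows
daily_throughput_pb = 100     # data processed by queries per day

# Daily throughput expressed in terabytes.
throughput_tb = daily_throughput_pb * TB_PER_PB

# How many times more data is processed each day than is newly written.
throughput_to_ingest = throughput_tb / daily_ingest_tb

# Average size of a row, if the table's 4 PB is spread evenly over its rows.
bytes_per_row = table_size_pb * BYTES_PER_PB / table_rows

# Days of ingest at the quoted rate that would add up to the table's size
# (ignores compression, retention policies and growth in the daily rate).
days_of_ingest = table_size_pb * TB_PER_PB / daily_ingest_tb

print(f"Throughput: {throughput_tb:,} TB per day")
print(f"Throughput is about {throughput_to_ingest:,.0f} times daily ingest")
print(f"Average row size: about {bytes_per_row:,.0f} bytes")
print(f"Table size equals about {days_of_ingest:.0f} days of ingest")
```

On these assumptions, each day's query traffic processes several thousand times more data than is newly written, which suggests the same records are read many times over.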
Because eBay takes a cut of each sale, it seeks to achieve the highest possible selling price for every item users list, and its data scientists examine every variable in the way items are presented and sold. Fastner explained that as eBay acquired and integrated firms that were using the open-source Hadoop data processing framework, it was able to perform more advanced analytics on items for sale on the site:
"One strength of Hadoop is that you can handle any type of data including pictures," he said. "We were wondering if the quality of the picture in a listing had an impact on selling price. To do that we moved a couple of petabytes of pictures from our picture servers to Hadoop, analysed the pictures and got some more structured information such as how much they were sold for, how many people viewed. We were able to prove that [better image quality] actually does provide a better price."
Fastner was quick to warn, however, that Hadoop is not a magic bullet for all big data analytics, explaining that while the platform is free and open source, its less efficient performance carries costs that may mean it is not the best value for every use case.