Up to 70 percent of online dating giant eHarmony's IT systems run on open source software, according to the firm's CTO Thod Nguyen, whose department has cut the time taken to run its compatibility matching algorithms by 95 percent.
Nguyen told V3 that eHarmony is a big believer in open source software, with its most crucial infrastructure running on software that originated as free and open source. For the compatibility matching system the firm chose MongoDB, an open source NoSQL database that Nguyen said can achieve both high scalability and high availability.
When the firm sought to improve the speed of its matching systems in 2012, it had only legacy systems in place, which took 15 days to complete a run of the matching algorithms. "We need to have very rich profile data, especially if you're looking at your soul mate," Nguyen said. "People fill out 150 to 200 questions in the relationship questionnaire when they sign up to the site so we have tens of terabytes of data we need to process daily, bidirectionally, meaning that if you like a person, that person has to like you back."
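The bidirectional requirement Nguyen describes can be illustrated with a minimal Python sketch. It is not eHarmony's algorithm: the `accepts` rule, the `age_range` field and the sample profiles are all hypothetical, chosen only to show that a pair counts as a match when each side accepts the other.

```python
from itertools import combinations


def accepts(seeker, candidate):
    """Hypothetical one-way rule: the candidate's age must fall
    inside the seeker's preferred age range."""
    low, high = seeker["age_range"]
    return low <= candidate["age"] <= high


def mutual_matches(profiles, accepts_fn):
    """Return the pairs in which each person accepts the other."""
    matches = []
    for a, b in combinations(sorted(profiles), 2):
        # Both directions must hold; a one-way "like" is not a match.
        if accepts_fn(profiles[a], profiles[b]) and accepts_fn(profiles[b], profiles[a]):
            matches.append((a, b))
    return matches


# Illustrative data only, not real profile fields.
profiles = {
    "alice": {"age": 30, "age_range": (28, 35)},
    "bob": {"age": 33, "age_range": (25, 31)},
    "carol": {"age": 40, "age_range": (30, 45)},
}

print(mutual_matches(profiles, accepts))
```

Here carol would accept bob, but bob would not accept carol, so the pair is dropped. Checking every pair in both directions is part of what makes the workload grow so quickly as profiles and rules multiply.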
"The models were becoming more and more complex," he said. "And with that we started looking for a different solution."
His department eventually settled on MongoDB following a long search, and transferred all of its data over to the new system in a matter of months. In addition to professional support from 10gen, the company behind the software, Nguyen cited strong community support as another benefit of using open source software systems.
As a result of the new implementation of the database, eHarmony is now able to process its one billion matches within 12 hours, a 95 percent reduction in time taken to scour the whole database. Nguyen added that users "also get a lot better quality matches because we're able to process a lot more algorithmic matching functionality."
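As a rough check on those figures, the short Python sketch below works through the arithmetic. It assumes the 15-day and 12-hour figures describe the same workload, which the article does not state explicitly, and it treats the 12 hours as a full run of one billion matches.

```python
def percent_reduction(before_hours, after_hours):
    """Percentage by which the run time has fallen."""
    return 100 * (before_hours - after_hours) / before_hours


before_hours = 15 * 24   # the legacy run time of 15 days, in hours
after_hours = 12         # the run time quoted for the MongoDB-based system

reduction = percent_reduction(before_hours, after_hours)
print(round(reduction, 1))  # 96.7

# Implied average throughput if one billion matches take the full 12 hours.
matches_per_second = 1_000_000_000 / (after_hours * 3600)
print(round(matches_per_second))
```

On these assumptions the reduction comes to about 96.7 percent, in line with the rounded 95 percent figure, and the implied average rate is roughly 23,000 matches per second.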
"The adoption of open source allows us to adopt things quicker than waiting to go from sales to contract to operation, plus cost is important to us," said Nguyen.
In addition to using open source software including Hadoop, Hive and Hibernate, eHarmony publishes some of its own work on the code-sharing site GitHub, including its "seeking" library for making queries against different data sources.
The firm claims it is responsible for 438 marriages in the US every day, equating to roughly four percent of weddings in the country, with its services also available in Australia, the UK, Brazil and Canada.