At Medallia, a key component of our system currently runs on an open source relational database. Since this component mainly queries database entries by key, we want to try switching to a key-value storage system to take advantage of the benefits such a system provides, including distributed replication, load balancing, and failover. One of our objectives is to re-architect this component so that it scales horizontally, which among other things will help alleviate our current high disk storage requirements.
Recently we took the time to look into this (and other technological improvements too, exciting times at Medallia right now!), and we reviewed several options. To make a long story short, we ended up with two finalists, Apache Cassandra and Project Voldemort. These two projects seem to be the most mature open source options in their class, and both provide native decentralized clustering support, including partitioning, fault tolerance, and high availability. Both are based on Amazon's Dynamo paper, but the main difference is that Voldemort follows a simple key/value model, while Cassandra uses a persistence model based on BigTable's column-oriented model. Both support read consistency, where read operations always return the latest data, which was a requirement for us.
While not an exhaustive list, these are the most relevant pros and cons we identified when reviewing both stores:
To our surprise, this was the only link we found that compares the performance of both projects, so we decided to write this post to share our research. We used the vpork test framework, which we modified to suit our needs by upgrading the client code to the latest versions, adding a warm-up phase, and adding rewrite capabilities. These are the results of our tests:
We ran tests for 4 different write-rewrite-read configurations. A write is equivalent to a put operation with a new record (non-existing key). A rewrite is a put operation with an existing key. A read is a get operation on an existing key. These are the configurations we tested:
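The three operation types defined above can be sketched as a small workload generator. This is only an illustration, not the vpork code we ran: the function name `run_mix`, the percentage parameters, the example mix, and the use of a plain Python dict in place of a real store client are all assumptions made for the example.

```python
import random


def run_mix(store, n_ops, write_pct, rewrite_pct, value_size, seed=0):
    """Issue n_ops operations against a dict-like store.

    write_pct and rewrite_pct are percentages of the total; the remainder
    are reads. Returns how many operations of each type were issued.
    (Hypothetical helper: names and parameters are illustrative only.)
    """
    if write_pct < 0 or rewrite_pct < 0 or write_pct + rewrite_pct > 100:
        raise ValueError("write_pct + rewrite_pct must be between 0 and 100")

    rng = random.Random(seed)          # seeded so a run is repeatable
    value = b"x" * value_size          # payload of the requested size, in bytes
    keys = list(store.keys())          # keys that already exist in the store
    counts = {"write": 0, "rewrite": 0, "read": 0}
    next_id = 0

    for _ in range(n_ops):
        roll = rng.uniform(0, 100)
        if roll < write_pct or not keys:
            # write: put with a key that does not exist yet
            # (also forced when the store is still empty)
            key = f"key-{next_id}"
            while key in store:
                next_id += 1
                key = f"key-{next_id}"
            store[key] = value
            keys.append(key)
            counts["write"] += 1
        elif roll < write_pct + rewrite_pct:
            # rewrite: put with an existing key
            store[rng.choice(keys)] = value
            counts["rewrite"] += 1
        else:
            # read: get on an existing key
            _ = store[rng.choice(keys)]
            counts["read"] += 1
    return counts


if __name__ == "__main__":
    # Example mix (made up for illustration): 10% writes, 20% rewrites,
    # 70% reads, with 15 KB values (assuming 1 KB = 1024 bytes).
    print(run_mix({}, 1000, 10, 20, 15 * 1024))
```

Swapping the dict for a thin wrapper around a real client would keep the same structure, as long as the wrapper supports item assignment, lookup, and membership tests.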
We ran all the tests for two different value sizes, 15 KB and 1.5 KB. Even though we evaluated different options, for our current needs the last configuration with a 15 KB data entry was the most representative scenario.
The first pair of charts shows the latency, or average time it takes a read or write operation to complete in each case. Lower values are better. As expected, Cassandra's write (and rewrite) times were consistently faster than Voldemort's, while read times varied a bit depending on the scenario but were more or less the same in general.
The second pair of charts shows the 99th percentile latency, that is, the maximum time among the fastest 99% of operations; again, lower values are better:
On the front end, we have a write-back cache, which means that write operations don't affect the user experience. Read operations, on the other hand, are directly related to page loads. That's why we were concerned about the peak in Cassandra read latency in the last scenario for 15 KB. We ran some further tests to measure the 99.9th and 99.99th percentiles, and the difference was even greater: 5050 ms for Cassandra against 748 ms for Voldemort at the 99.9th percentile, and 9176 ms against 1129 ms at the 99.99th. This huge difference was a key decision factor for us.
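For readers who want to compute the same tail figures from their own measurements, here is a minimal sketch of a nearest-rank percentile over a list of latency samples. It is one common definition, not necessarily the one the test framework uses, and the function name `percentile` is ours.

```python
import math
from fractions import Fraction


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples.

    Returns the smallest sample such that at least pct percent of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")

    ordered = sorted(samples)
    # Fraction(str(pct)) keeps values such as 99.9 exact, so the rank is
    # not thrown off by floating-point rounding.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_ms = list(range(1, 10001))   # synthetic data: 1 ms .. 10000 ms
    for p in (99, 99.9, 99.99):
        print(p, percentile(latencies_ms, p))
```

Note that the deeper the percentile, the more samples are needed for it to mean anything: with fewer than 10,000 samples, the 99.99th percentile under this definition is simply the maximum.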
Finally, these two charts show the general throughput in terms of operations (read or write) per second. In this case higher values are better:
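As a rough illustration of how throughput and per-operation latency relate to a warm-up phase, the sketch below times a callable in a single thread. It is an assumption-laden simplification, not the harness we used: `measure`, `op`, and `warmup_ops` are hypothetical names, and a real benchmark would normally drive the store from several concurrent clients.

```python
import time


def measure(op, n_ops, warmup_ops=0):
    """Time n_ops calls of op() after warmup_ops untimed calls.

    Returns (latencies_ms, ops_per_second). Single-threaded sketch only.
    """
    # Warm-up: run the operation without recording anything, so caches
    # and connections are in a steady state before measurement starts.
    for _ in range(warmup_ops):
        op()

    latencies_ms = []
    start = time.perf_counter()
    for _ in range(n_ops):
        t0 = time.perf_counter()
        op()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start

    # Throughput is operations completed per second of wall-clock time.
    throughput = n_ops / elapsed if elapsed > 0 else float("inf")
    return latencies_ms, throughput


if __name__ == "__main__":
    store = {"k": b"x" * 1024}             # dict standing in for a real store
    lat, tput = measure(lambda: store["k"], n_ops=10000, warmup_ops=1000)
    print(f"avg latency: {sum(lat) / len(lat):.6f} ms, {tput:.0f} ops/s")
```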
Issues found while testing:
I think there is no clear winner, in general terms. The best option depends on many factors that each company has to evaluate. My preference changed a few times during the review and tests.
Having said that, we had to choose one, and we decided to go with Project Voldemort. The main reasons were its simplicity, better versioning control, persistence layer maturity, and latency predictability.
We are currently developing the new solution, and it will take some time before we can put it in production, but we wanted to share our preliminary results with everyone who is considering one of these two options, so they'll have one more data point when making their decision.
We’ll keep you posted on how it goes.
Lead Software Engineer
Other useful articles comparing different key-value stores: