On moving from CouchDB to Riak

This week at work, we’ve started to move our data store from CouchDB to Riak. This represents 35 million documents and a little more than 2 TB of data.

What we store

We store a lot of data. It all comes from the web, via our various source collectors (feed aggregator, crawler, API clients, ...). To summarize, we store:

  • full HTML content from the web pages
  • some metadata extracted from the feeds and the pages
  • categorized links extracted from the HTML (is the link inside the article, the blogroll, a comment, ...)
  • tweets (and their metadata)
  • relations between tweets

And to store this, we use a lot of tools:

  • the main index is based on Solr
  • some secondary indexes on ElasticSearch
  • PostgreSQL to maintain some indexes on document IDs
  • MongoDB to store tweet relationships and some indexes
  • web pages (content and metadata) inside CouchDB
  • Redis for caching (ranks, ...)

I’m pretty sure we’ve tested most of the NoSQL tools at this point. We’re pretty happy with most of the tools we use, except CouchDB.

Why we’re moving away from CouchDB

We were already aware of Riak before we started using CouchDB, but at the time we weren’t ready to trust such a new product, so after some benchmarks we decided to go with CouchDB.

After the first couple of months, it was obvious that this was a bad choice.

Our main problems with CouchDB are scalability, versioning, and stability.

Once we store a document in CouchDB, we modify it at least twice after the original write. Each modification generates a new version of the document. This feature is nice for some use cases, but we don’t need it, and there’s no way to disable it, so our databases quickly grew very large. You’ll probably tell me “hey, you know you can compact your database?”, and I’ll answer “sure”. The trouble is that we never managed to compact an entire database without a crash (well, to be honest, with the latest version of CouchDB we finally managed to compact one database).
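
For reference, compaction is triggered per database through CouchDB’s HTTP API. A minimal sketch in Perl (the host and the “pages” database name are made up):

    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;

    # POST /{db}/_compact starts compaction for that database;
    # CouchDB answers "202 Accepted" and compacts in the background
    my $res = $ua->post(
        'http://127.0.0.1:5984/pages/_compact',
        'Content-Type' => 'application/json',
        Content        => '',
    );
    print $res->status_line, "\n";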

The second issue is that one database == one file. When you have multiple small databases, this is fine. When you have only a few databases and some grow to more than 1 TB, the problems keep growing too (and it’s a real pain to back up).

We also had a lot of random crashes with CouchDB, even though the latest version was quite stable.

Why we chose Riak

To store documents we need persistence, which eliminates Redis (and Redis still doesn’t yet handle sharding in a cluster configuration). We did some benchmarks with MongoDB, but it was not the right choice for this kind of data. And obviously we could not store the documents in Solr, because that’s not what an index is for.

Having eliminated those products, we had the following requirements in mind:

  • easy to replicate
  • no master/slave
  • a REST interface
  • sharding

At that point we decided to take another look at Riak, and we started to run some benchmarks. Our first (huge) problem was that there was no Perl client, and since we’re mostly a Perl shop … we wrote one.

Next we started to read the documentation, the wiki and the mailing list. Germain (our sysadmin) started to throw questions around on the mailing list, and we were happy to get a quick answer each time.

Around this time they released their benchmark tool, so we used it in combination with our own tests. Our first results were positive.
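
The tool in question (basho_bench) is driven by a small file of Erlang terms describing the workload. A rough sketch of the kind of scenario we ran, with purely illustrative values (driver names, key generators and operations vary between versions):

    {mode, max}.
    {duration, 10}.
    {concurrent, 5}.
    {driver, basho_bench_driver_http_raw}.
    {key_generator, {uniform_int, 100000}}.
    {value_generator, {fixed_bin, 10000}}.
    {operations, [{get, 1}, {update, 1}]}.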

To conclude: after all these tests, we decided to go with Riak (and no, we didn’t try any other products, because we couldn’t find one that suited our needs which we hadn’t already tested).

Our usage

Each web page and its metadata are stored in Riak. As buckets are “free” with Riak, each website has its own bucket. We can access a page with a simple GET on website_id/page_id.
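
A minimal sketch of that access pattern over Riak’s REST interface, using plain LWP (the node address, bucket and key are made up, and the /riak URL prefix is the one Riak used at the time):

    use LWP::UserAgent;
    use JSON;

    my $ua   = LWP::UserAgent->new;
    my $base = 'http://riak1:8098/riak';    # any node of the cluster will do

    # store a page and its metadata in the website's bucket
    my $res = $ua->put(
        "$base/website_42/page_1337",
        'Content-Type' => 'application/json',
        Content        => encode_json({ url => 'http://example.com/', html => '...' }),
    );
    die $res->status_line unless $res->is_success;

    # and read it back: a simple GET on website_id/page_id
    my $page = decode_json(
        $ua->get("$base/website_42/page_1337")->decoded_content
    );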

We don’t plan to use Map/Reduce, as we already index the data in Solr/ElasticSearch. We also took a look at Riak Search, but we don’t feel the need to move our search infrastructure to it.

So our only use of Riak is to store data.

The setup

We rolled out a cluster of 5 nodes and 1024 partitions. Each node has 8 GB of memory and 3×1 TB disks in RAID 5.
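
The partition count has to be set in each node’s app.config before the cluster is first started. A sketch of the relevant entries (the section and key names match the Riak releases of the time, and may differ in later versions):

    %% etc/app.config (excerpt)
    {riak_core, [
        {ring_creation_size, 1024}    %% number of partitions in the ring
    ]},
    {riak_kv, [
        {storage_backend, riak_kv_bitcask_backend}
    ]},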

We chose Bitcask as the backend. We did some tests with Innostore, but Bitcask was faster and consumed less memory. At first we were a little worried about Bitcask: since this backend keeps an in-memory index of every key you store, and we store a lot of keys, we feared that at some point we would fill up the servers’ memory. It turns out that the overhead is really small, and it consumes less memory than Innostore. Innostore was also really slow when starting up or shutting down the nodes.
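
A back-of-the-envelope calculation shows why the fear was unfounded. The per-entry overhead below is an approximation (Basho’s capacity notes put it at a few tens of bytes per key), and the average key length is a guess:

    # rough Bitcask keydir sizing: every replica of every key costs
    # (fixed overhead + key length) bytes of RAM somewhere in the cluster
    my $keys     = 35_000_000;   # documents
    my $n_val    = 3;            # Riak's default replication factor
    my $overhead = 40;           # assumed per-entry overhead, in bytes
    my $key_len  = 30;           # assumed average key length, in bytes

    my $total_gb = $keys * $n_val * ($overhead + $key_len) / 2**30;
    printf "~%.1f GB cluster-wide, ~%.1f GB per node (5 nodes)\n",
        $total_gb, $total_gb / 5;
    # => ~6.8 GB cluster-wide, ~1.4 GB per node: it fits in 8 GB of RAM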

Basho provides some monitoring tools when you use the enterprise product. There is also a small JavaScript interface that gives you some information about your cluster (reads/writes, ...).
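
Even without the enterprise tools, each node exposes its counters over HTTP, which is enough for basic monitoring. A minimal sketch (the node name is made up, and the exact stat names depend on the Riak version):

    use LWP::UserAgent;
    use JSON;

    # GET /stats on any node returns a JSON document of counters
    my $stats = decode_json(
        LWP::UserAgent->new->get('http://riak1:8098/stats')->decoded_content
    );
    printf "gets: %d, puts: %d\n",
        $stats->{vnode_gets_total}, $stats->{vnode_puts_total};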

Basho

We decided to go for the commercial support: Riak is still a young product, and we were not confident enough to dig into the code if we ran into a problem (Riak is written in Erlang, like CouchDB).

We had a few phone calls with the team, to help us decide whether we wanted the support and to explain what we wanted to achieve.

We also appreciated that there was no bullshit like “our product is the best, you can use it for anything”. For example, at some point we were investigating the possibility of using Riak or Hadoop + Cascalog to run some heavy operations on our logs and extract information. We discussed the idea with them, and they advised us to go with Hadoop, and told us that if needed they could write a Riak → Hadoop connector for us.

The team is really great: Matt (our account manager) is quick to answer our questions, and we got to talk to Sean on each call, where he explained various technical details about Riak, Bitcask, etc. And Mark sent us some goodies for our work on the Perl client and our activity on the mailing list.

One of their great contributions to the mailing list is a recap, every two days, of everything that happened around Riak: a discussion on IRC (with a link to the logs), an exchange on the mailing list, an interesting blog post from the community, ... It’s a really good way to stay up to date with what is happening.

Conclusion

Our migration is not over yet, but we will publish some numbers in the coming weeks to share our experience.

At this point, we (Germain and I) are quite pleased with our choice. We know we can get a quick answer when we run into an issue, that there is someone to help us make the right choice when we’re in doubt, and that the community (on IRC and the mailing list) is really awesome.

