Cold Hard Code

A review of Varnish.

After several respected peers mentioned Varnish, I decided to give it a try.  The reasons are fairly complex, so here's the story from the top.

The way our MogileFS cluster works here (and I'm just about finished with the new front-end media management code, so stay tuned next week) is quite simple. 

To start with, when designing it, I threw out the idea that we should pregenerate any of the image sizes or formats.  Instead, we simply filter and scale each upload to a maximum size (configurable, but currently set to 1280x1024) and then throw out all the metadata (EXIF profiles, etc.) that we can.  This drastically reduces the size on disk, and spitting out resized images from the reduced copy doesn't take long (I haven't finished stress testing this yet, but I'll post when I do).
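To make the "scale to a max size" step concrete, here is a minimal Python sketch of just the arithmetic.  It assumes the aspect ratio is preserved and that smaller images are never scaled up; neither is spelled out above, and the function name is made up for illustration.

```python
def fit_within(width, height, max_width=1280, max_height=1024):
    """Return (width, height) scaled down to fit inside the bounding box.

    Assumptions (not stated in the post): the aspect ratio is kept and
    images that already fit are left at their current size.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    # The smallest ratio wins; capping at 1.0 prevents upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The actual resampling and metadata stripping would be handed to an image library; only the target dimensions are computed here.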

So, now I have files stored in the MogileFS cluster that are stripped-down versions of the originals.  For the sake of simplicity, just think of those as the originals (if I were running something like SmugMug, I would store each image at its original size, but extract the metadata into a separate data node).

Now, when a request comes in for a media asset, we just enforce that it follows a specific convention.  In our case, the URL looks like:
http://static.cartionary.com/images/{uuid}/{mutations}.{ext}
For now, we just serve images, so that part is easy to deal with.  The UUID is the unique identifier that we use to store the image and {mutations} is any series of programmatic mutations.
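Splitting a request URL along that convention could look roughly like the Python sketch below.  The function name is hypothetical, and it assumes {uuid} is a standard UUID string that should be normalized to its canonical lowercase form.

```python
import re
import uuid
from urllib.parse import urlparse

# Matches /images/{uuid}/{mutations}.{ext}; the last dot separates the extension.
_PATH_RE = re.compile(
    r"^/images/(?P<uuid>[^/]+)/(?P<mutations>[^/]+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_media_url(url):
    """Return (uuid, mutations, ext), or None if the URL breaks the convention."""
    match = _PATH_RE.match(urlparse(url).path)
    if match is None:
        return None
    try:
        # Assumption: keys are standard UUIDs, so anything else is rejected.
        key = str(uuid.UUID(match.group("uuid")))
    except ValueError:
        return None
    return key, match.group("mutations"), match.group("ext").lower()
```

Anything that returns None here can be answered with a 404 before MogileFS is ever consulted.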

If the {uuid} isn't known to the system (the test is if MogileFS knows the key), it returns 404.

If it is, the system then parses {mutations}.  This can also simply be 'original', in which case it returns the original copy (or rather, the raw data stored in MogileFS).  Mutations are a format that I've simply concocted to describe rotation, scaling and anything else we can come up with.  For example: "s=s" means "size=small" (other sizes are "m", "l", "xl").
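As an illustration only, a parser for such a mutation string might look like this in Python.  Only "original" and the "s=" size codes come from the description above; the comma separator, the long name for "xl", and the pass-through of unknown keys are assumptions made for the sketch.

```python
# Size codes from the post; the long name for "xl" is a guess.
SIZE_NAMES = {"s": "small", "m": "medium", "l": "large", "xl": "extra-large"}


def parse_mutations(spec, separator=","):
    """Turn a {mutations} string into a dict of requested changes.

    'original' yields an empty dict, meaning "serve the stored data as-is".
    The separator between multiple mutations is hypothetical.
    """
    if spec == "original":
        return {}
    mutations = {}
    for part in spec.split(separator):
        key, sep, value = part.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed mutation: {part!r}")
        if key == "s":
            if value not in SIZE_NAMES:
                raise ValueError(f"unknown size: {value!r}")
            mutations["size"] = SIZE_NAMES[value]
        else:
            # Assumption: other keys (rotation, etc.) are passed through raw.
            mutations[key] = value
    return mutations
```

Rejecting malformed strings early matters here, since every distinct URL that gets through becomes its own cache entry.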

Combining that upstream with as permanent a cache as we can manage gives us better results, because we don't have to generate thumbnails that may never be used.  They're generated on demand, and then cached on our proxy for as long as we can reasonably keep them.

The reason I went with Varnish, over lighttpd+mod_cache or Squid, is simply that Varnish (via the varnishadm command) lets you purge cached items by regular expression.  That means that if we want to delete an image from the entire cluster, we just have to delete it from MogileFS and then issue:
varnishadm -T :6082 url.purge {uuid}
Done!  All requests for that UUID will end up as a 404.
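If the purge needs to be triggered from application code, a thin wrapper around that same command is enough.  The Python sketch below uses hypothetical function names and takes the command verbatim from the line above; the purge command name may differ depending on which Varnish version is installed.

```python
import subprocess
import uuid


def purge_command(image_uuid, admin_address=":6082"):
    """Build the varnishadm invocation that purges every URL containing the UUID."""
    # Validating first keeps stray regex characters out of the purge pattern.
    # Assumption: cached URLs use the canonical lowercase UUID form.
    key = str(uuid.UUID(image_uuid))
    return ["varnishadm", "-T", admin_address, "url.purge", key]


def purge_image(image_uuid, admin_address=":6082"):
    """Run the purge; raises CalledProcessError if varnishadm exits non-zero."""
    subprocess.run(purge_command(image_uuid, admin_address), check=True)
```

Because the argument is treated as a regular expression, one call covers every size and mutation ever cached for that image.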

To be more specific, the point was to pick something that doesn't rely mostly on HTTP PURGE.  With PURGE, we'd have to know the URL of every image ever generated (that's a lot).  This is also why I decided to do away with the idea of generating permanent thumbnails and other modified images.

Every time you need to do some operation, or make some change, you're stuck with however many user-contributed images * number of generated images.  It's costly, silly, and most of those images will sit dormant except for the occasional request.  CPU time is really, really cheap.  Disk space, too.  So make your cache huge and your media cluster fast.

The decision really comes down to the fact that I'd rather have the CPU deal with on-demand requests, and have the user wait an extra second or two, than have a developer get "creative".  The reality of it is that a user, in most cases, won't have to wait.

When an image is uploaded, have a job (or a preload in the resulting HTML displayed to the user) ask the caching cluster for the expected image sizes (small, medium, large).  Those stay cached until the cluster runs out of disk space, at which point the least recently used items fall off and won't be missed.
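A warm-up job along those lines can be very small.  This Python sketch assumes the sizes map to "s=s", "s=m" and "s=l" and that the images are served as .jpg; the function names are made up for illustration.

```python
from urllib.request import urlopen

BASE_URL = "http://static.cartionary.com/images"
WARM_SIZES = ("s", "m", "l")  # small, medium, large


def warm_urls(image_uuid, ext="jpg", sizes=WARM_SIZES):
    """List the URLs a fresh upload is expected to be requested at."""
    return [f"{BASE_URL}/{image_uuid}/s={size}.{ext}" for size in sizes]


def warm_cache(image_uuid, ext="jpg", timeout=10):
    """Request each expected size once so the cache holds a copy.

    urlopen raises HTTPError on a 404, which here would mean the upload
    never reached the media cluster.
    """
    for url in warm_urls(image_uuid, ext):
        with urlopen(url, timeout=timeout) as response:
            response.read()  # drain the body so the full response is fetched
```

Running this right after the upload finishes means the first real visitor should already get a cache hit.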

I'm happy to be building this system in a day and age where I can make that decision and expect it to work out well.  I've dealt with two other high traffic photo-centric organizations, and I experienced the pain of not doing it this way.

I'm eager to experience the pain of doing it my way now.

Written by Jay Shirley

Jay Shirley combines technical fundamentals with modern, practical savvy. An open source veteran with plenty of notches in his personal and professional belt, the combination of his work and his field vision (soccer metaphor!) has few rivals.
