Danko

Stuff.

Rails on SmartOS Part 1


As a Rails developer, I’m not often asked about deployment platforms. Why would anyone ask?

The Rails community has really easy options for deployment, so it’s not always something we think a whole lot about. We have Engine Yard, Heroku, Capistrano setups for deploying to VPSs that are pretty much magic, and things like the Rubber gem that make it pretty much effortless to deploy a machine in the Amazon cloud.

All of the above options will have your deployed application running on a Linux distribution by default. Linux is great. It’s free, it’s tested, it runs on everything.

In a former career I had a Ruby app that was very IO intensive. How intensive? 250,000,000 SNMP GETs in a 24 hour period, and each and every one of them had to be persisted. A different app I worked on around the same time managed massive amounts of video content with individual file sizes measured in terabytes.

Those aren’t the problems of scale that Rails folks usually think about. Linux is good enough for Google, so why couldn’t I make it work on that platform?

It’s not that I couldn’t, but there were other solutions that made life incredibly simple, eliminated the need for hardware expansion, added security features as an unexpected bonus, and meant I didn’t have to create PAs for more stuff.

That was about six years ago. We ended up deploying those two apps on OpenSolaris for a few reasons I’ll cover later. Since then the kernel for that OS has been forked, new projects and products have arisen and one really stands out as a great application platform: SmartOS from Joyent.

In this four-part series we’ll cover four features of SmartOS that are compelling reasons for developers to choose it as a platform. Part one is a history of the project’s roots and some great developer-centric features around ZFS.

In posts 2-4 we’ll go over resource and machine virtualization, DTrace features and how to deploy a Rails app on SmartOS.

So first a little history on how they got to where they are before we get into why it’s a killer Rails deployment platform.

Sun Microsystems

In the mid to late 2000s, Sun Microsystems was open sourcing everything they had as well as purchasing companies that had great products that were already open sourced, like MySQL.

Licensing was complicated though. Sun didn’t have the authority to release some of their existing products under licenses like the GPL, so they based a new license of their own making off the Mozilla Public License and called it the CDDL.

They’d released Java and many other products under a public license, but there was still a behemoth in the room that was still proprietary: Solaris.

So they set about the path to create the next version of Solaris under a public license. The kernel was named Nevada and the effort to build a fully fledged OS around it was Indiana.

The community behind these projects was a who’s who of the *NIX world. I could name names, but we only have so many pixels.

In 2008, the first Live CD was released with a full GNOME-based desktop. The project was on a six-month release schedule and things were going swimmingly.

After Oracle’s purchase of Sun, the release date for the next OpenSolaris slipped, and it slipped without a word. Two months later, Oracle announced they were discontinuing the product in favor of other options and OpenSolaris was no more.

Since this was an open source project, it lived on in forks. So why bother forking something like this when upstream is no more? It’s the features.

Feature 1: ZFS

One of the most compelling features that came from later editions of Solaris 10 was the new file system: ZFS. It was designed to be future proof and had some compelling features, such as:

  • A 128-bit storage system. By one well-known calculation, fully populating a 128-bit pool would take more energy than it takes to boil the earth’s oceans. Check it for yourself.
  • Guaranteed data integrity with a focus on preventing silent data corruption.
  • Storage pools. Storage is virtualized into pools meaning adding storage to a volume is as easy as plugging in drives.
  • Snapshots. ZFS has a copy-on-write transactional model that makes it possible to capture a snapshot of an entire working filesystem at once, then store only the differences between that snapshot and the filesystem as it continues to change.
  • Attack of the clones! The above file system snapshots can be cloned, and the clones only consume disk space that is the diff between the working copy and the clone. This operation is nearly instant.
  • It’s network centric. Shipping a filesystem across the network to another machine is as easy as piping “zfs send” into “zfs receive” on the other side.
  • The ARC (FS Cache), adaptable volume block sizes, deduplication, encryption, and many, many more features.
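To make the snapshot bullet a little more concrete, here is a toy model of copy-on-write in Python. It is only a sketch of the idea and not how ZFS is actually implemented; the class and method names are invented for illustration. The point it tries to show is that a snapshot copies a small map of names to block ids, never the data itself, so only blocks written afterwards cost extra space.

```python
class ToyCowFilesystem:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes
        self.live = {}        # file name -> block id (the working filesystem)
        self.snapshots = {}   # snapshot name -> frozen copy of the name map
        self._next_id = 0

    def write(self, name, data):
        # Copy-on-write: every write allocates a fresh block.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.live[name] = block_id

    def read(self, name):
        return self.blocks[self.live[name]]

    def snapshot(self, snap_name):
        # Only the name -> block id map is copied, not the data blocks.
        self.snapshots[snap_name] = dict(self.live)

    def rollback(self, snap_name):
        # Point the working filesystem back at the snapshot's blocks.
        self.live = dict(self.snapshots[snap_name])

    def blocks_changed_since(self, snap_name):
        # Blocks the working filesystem uses that the snapshot does not.
        return set(self.live.values()) - set(self.snapshots[snap_name].values())


fs = ToyCowFilesystem()
fs.write("a.txt", b"one")
fs.write("b.txt", b"two")
fs.snapshot("before")
fs.write("a.txt", b"changed")
print(len(fs.blocks_changed_since("before")))  # 1
fs.rollback("before")
print(fs.read("a.txt"))  # b'one'
```

Clones follow the same idea: a clone would start as another copy of the map and only diverge as it is written to.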

As a developer, you may think that these are mostly ops related problems. The two do eventually meet, so let’s go over some scenarios that can make your life easier with ZFS.

Brogrammer Bob and his Magical Migrations

Brogrammer Bob is getting ready to deploy to staging using Capistrano. Prior to starting, he creates a ZFS snapshot of the filesystem holding his Postgres installation.

Something with the data set didn’t go right and it wasn’t something that could be easily tested. The migration wasn’t wrapped in a transaction and it half completed. Bob restores his snapshot and goes back to the drawing board.
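Bob’s safety net might look something like the following Python sketch. The “zfs snapshot” and “zfs rollback” subcommands are real, but everything else here is an assumption for illustration: the dataset name zones/pgdata, the snapshot label, the helper names, and the idea that the Postgres data directory lives on its own dataset. In practice you would also want Postgres stopped before rolling back.

```python
import subprocess

# Hypothetical dataset holding the Postgres data directory.
DATASET = "zones/pgdata"


def snapshot_cmd(dataset, label):
    """Build the command that snapshots a dataset under the given label."""
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def rollback_cmd(dataset, label):
    """Build the command that returns a dataset to an earlier snapshot."""
    return ["zfs", "rollback", f"{dataset}@{label}"]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Before the deploy:
run(snapshot_cmd(DATASET, "pre-migration"))
# If the migration goes sideways:
run(rollback_cmd(DATASET, "pre-migration"))
```

The default is a dry run that only prints the commands, so the sketch is safe to try on a machine without ZFS.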

Deployer Dan and his Super Security Patches

A critical security patch for the database is announced. Upgrading involves upgrading many OS packages as the problem is in a C library. One of the system daemons fails, causing chaos and mass hysteria. No problem though, as Dan made a system wide snapshot before starting. He reboots the system, chooses the old boot environment from GRUB, and he’s back to where he started within 30 seconds of what was a catastrophic event.

Patricia and her Postgres

Our app has a lot of IO-bound queries. The database is quite large at this point and performance is quite poor. The filesystem that the Postgres data is on is migrated to one with ZFS’s compression turned on. Patty sees a 3x-4x speed-up in these queries.

Even with the overhead of compression, the queries are limited by disk throughput and not the CPU. We’re reading less data from the disk at this point, so boom, performance increase. In addition, seek times are much lower because compression reduces the physical distance between logical blocks.
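A back-of-the-envelope calculation shows why this can work out. The Python sketch below uses made-up numbers (a 100 GB scan, a 100 MB/s disk, a 4x compression ratio, 2000 MB/s decompression) and a hypothetical helper name; none of these are measurements from Patty’s system. It also pessimistically assumes that reading and decompressing never overlap.

```python
def scan_seconds(logical_gb, disk_mb_per_s, compression_ratio=1.0,
                 decompress_mb_per_s=None):
    """Estimate how long it takes to read logical_gb of data from disk."""
    logical_mb = logical_gb * 1024
    # Compression shrinks what actually has to come off the disk.
    physical_mb = logical_mb / compression_ratio
    read_time = physical_mb / disk_mb_per_s
    # CPU cost of turning the compressed blocks back into logical data.
    cpu_time = 0.0
    if decompress_mb_per_s is not None:
        cpu_time = logical_mb / decompress_mb_per_s
    return read_time + cpu_time


plain = scan_seconds(100, 100)
compressed = scan_seconds(100, 100, compression_ratio=4.0,
                          decompress_mb_per_s=2000)
print(f"uncompressed: {plain:.1f}s, compressed: {compressed:.1f}s")
print(f"speed-up: {plain / compressed:.2f}x")
```

With these assumed numbers the disk-bound scan comes out a little over three times faster. If the CPU were the bottleneck instead, the same arithmetic would show compression making things slower.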

Donny and Deduplication

ZFS includes an option to deduplicate data on a per-block basis. Say our app holds buckets of files for groups of users. Outside of our application, our users start emailing around a crazy cat GIF and drop it in their box-in-which-they-drop-things that we host. The 100 copies of the 5 MB GIF are now stored only once.

A cat GIF is one thing, but how about a few thousand developers storing their projects on our service? If our block size is set to 4K and we have block-level deduplication on, all the files that are inevitably similar will cut down on our storage needs considerably.

Deduplication has an overhead, however, and it’s not something you want to turn on willy-nilly. When your dataset benefits from it, it can really benefit. When it doesn’t, the overhead can be more trouble than it’s worth. Fortunately, it’s easy to turn on and off.
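Here is a toy illustration of per-block deduplication in Python. It is a sketch under simplifying assumptions, not ZFS’s actual on-disk design: blocks are keyed by their SHA-256 digest in an in-memory dictionary, and the class and method names are invented. The lookup table it keeps also hints at where the overhead comes from, since every write has to consult it.

```python
import hashlib


class ToyDedupStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}   # digest -> block bytes (stored once)
        self.files = {}    # file name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for offset in range(0, len(data), self.block_size):
            chunk = data[offset:offset + self.block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the block only if this content has not been seen before.
            self.blocks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def logical_bytes(self):
        # What users think they are storing.
        return sum(len(self.blocks[d])
                   for digests in self.files.values() for d in digests)

    def physical_bytes(self):
        # What actually sits in the store.
        return sum(len(block) for block in self.blocks.values())


# Stand-in for the cat GIF: 16 KB made of four distinct 4K blocks.
gif = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(512))

store = ToyDedupStore()
for user in range(100):
    store.put(f"user{user}/cat.gif", gif)

print(store.logical_bytes())   # 1638400
print(store.physical_bytes())  # 16384
```

A dataset with little repeated content would pay for all the hashing and table lookups while saving almost nothing, which is the trade-off described above.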

For more information on dedup’s advantages and drawbacks, the following two links are for the technically minded:

https://blogs.oracle.com/bonwick/entry/zfs_dedup
https://blogs.oracle.com/roch/entry/dedup_performance_considerations1

Summary

The features of ZFS are pretty compelling, but it’s just the start of the whole package that is an amazing platform for Rubyists.

Next time we’ll explore some of the virtualization features of SmartOS. They don’t stop at virtual machines: you get fine-grained control over every aspect of the machine, including network traffic. We’ll look at how you can use these things as a developer to deliver a great product to your users.

Stop #3 on the SmartOS train will take us to DTrace and solving production issues.

The last stop will bring us to creating and deploying an app on SmartOS that leverages all the features we’ll eventually cover.
