I have a long backlog of things to write about. One of those things is Varnish (more on that in a future post). So, over these Christmas holidays, while intentionally taking a break from real work, I decided to finally do some of the research required before I can really write about how Varnish is going to make your web apps much faster.
To get some actual numbers, I broke out the Apache Benchmarking utility (ab), and
decided to let it loose on my site (100 requests, 10 requests
concurrently):
$ ab -n 100 -c 10 http://seancoates.com/codes
To my surprise, this didn't finish almost immediately. The command ran for what seemed like forever. Finally, I was presented with its output (excerpted for your reading pleasure):
Concurrency Level: 10
Time taken for tests: 152.476 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 592500 bytes
HTML transferred: 566900 bytes
Requests per second: 0.66 [#/sec] (mean)
Time per request: 15247.644 [ms] (mean)
Time per request: 1524.764 [ms] (mean, across all concurrent requests)
Transfer rate: 3.79 [Kbytes/sec] received
Less than one request per second? That surely doesn't seem right.
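For reference, the three headline figures in that report are tied together by simple arithmetic. Here's a small sketch in Python (the variable names are mine, not ab's) that recomputes them from the elapsed time, the request count, and the concurrency level:

```python
# Figures copied from the ab report above.
total_seconds = 152.476   # "Time taken for tests"
requests = 100            # "Complete requests"
concurrency = 10          # "Concurrency Level"

# Throughput: completed requests divided by wall-clock time.
requests_per_second = requests / total_seconds

# Mean time per request across all concurrent requests, in milliseconds.
ms_per_request_overall = total_seconds / requests * 1000

# Mean time per request as seen by one of the 10 concurrent slots:
# each slot handles requests / concurrency requests, one after another.
ms_per_request_per_slot = ms_per_request_overall * concurrency

print(f"{requests_per_second:.2f} requests/sec")
print(f"{ms_per_request_overall:.2f} ms (across all concurrent requests)")
print(f"{ms_per_request_per_slot:.2f} ms (mean)")
```

The results agree with ab's own numbers, up to the rounding of the elapsed time: each concurrent slot is spending roughly 15 seconds on every single request.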
I chose /codes because the content does not depend
on any sort of external service or expensive server-side
processing (as described in an earlier post). Manually browsing to this same URL
also feels much faster than one request per second.
There's something fishy going on here.
I thought that there might be something off with my server configuration, so in order to rule out a concurrency issue, I decided to benchmark a single request:
$ ab -n 1 -c 1 http://seancoates.com/codes
I expected this page to load in less than 200ms. That seems reasonable for a light page that has no external dependencies, and doesn't even hit a database. Instead, I got this:
Concurrency Level: 1
Time taken for tests: 15.090 seconds
Complete requests: 1
Failed requests: 0
Write errors: 0
Total transferred: 5925 bytes
HTML transferred: 5669 bytes
Requests per second: 0.07 [#/sec] (mean)
Time per request: 15089.559 [ms] (mean)
Time per request: 15089.559 [ms] (mean, across all concurrent requests)
Transfer rate: 0.38 [Kbytes/sec] received
Over 15 seconds to render a single page‽ Clearly, this isn't
what's actually happening on my site. I can confirm this
with a browser, or very objectively with time and
curl:
$ time curl -s http://seancoates.com/codes > /dev/null
real 0m0.122s
user 0m0.000s
sys 0m0.010s
The next step is to figure out what ab is actually
doing that's taking an extra ~15 seconds. Let's crank up
the verbosity (might as well go all the way to 11).
$ ab -v 11 -n 1 -c 1 http://seancoates.com/codes
(snip)
Benchmarking seancoates.com (be patient)...INFO: POST header ==
---
GET /codes HTTP/1.0
Host: seancoates.com
User-Agent: ApacheBench/2.3
Accept: */*
---
LOG: header received:
HTTP/1.1 200 OK
Date: Mon, 26 Dec 2011 16:27:32 GMT
Server: Apache/2.2.17 (Ubuntu) DAV/2 SVN/1.6.12 mod_fcgid/2.3.6 mod_ssl/2.2.17 OpenSSL/0.9.8o PHP/5.3.2
X-Powered-By: PHP/5.3.2
Vary: Accept-Encoding
Content-Length: 5669
Content-Type: text/html
(HTML snipped from here)
LOG: Response code = 200
..done
(snip)
This all looked just fine. The really strange thing is
that the output stalled right after LOG: Response code =
200 and right before ..done. So, something
was causing ab to stall after the request
was answered (we got a 200, and it's a small number of bytes).
This is the part where I remembered that I'd seen similar behaviour before. I've lost countless hours of my life (and now one more) to this problem: some clients (such as PHP's streams) don't handle Keep-Alives in the way that one might expect.
HTTP is hard. Really hard. Way harder than you think. Actually, it's not that hard if you remember that what you think is probably wrong if you're not absolutely sure that you're right.
Either ab or httpd does the wrong thing. I'm
not sure which one, and I'm not even 100% sure it's wrong
(because the behaviour is not defined in the spec as far as I can
tell), but since it's Apache Bench and Apache
httpd that we're talking about here, we'd think they could work
together. We'd be wrong, though.
Here's what's happening: ab is sending an HTTP/1.0
request with no Connection header, and
httpd is assuming that it wants to keep the
connection open, despite this. So, httpd hangs on to
the socket for an additional—you guessed it—15 seconds, after the
request is answered.
There are two easy ways to solve this. First, we can tell
ab to actually use keep-alives properly
with the -k argument. With -k, ab expects the server to keep the
socket open, awaiting further requests on the same socket, so it
doesn't wait for the server to close the connection; it drops the
connection on the client side as soon as the request
is complete. In the previous scenario, the server behaved the same
way, but the client waited for the server to close the connection.
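To make the difference concrete, here's a toy reproduction in Python. It is neither ab nor httpd: the server, its one-second timeout (standing in for the 15 seconds), and every name in it are assumptions made up for illustration. The server answers a request and then holds the socket open; one client waits for the server to close the connection, the other stops reading once it has Content-Length bytes of body and closes the socket itself.

```python
import socket
import threading
import time

KEEP_ALIVE_SECONDS = 1.0  # stand-in for the 15-second timeout in the post
BODY = b"hello"
RESPONSE = (
    b"HTTP/1.1 200 OK\r\nContent-Length: "
    + str(len(BODY)).encode()
    + b"\r\n\r\n"
    + BODY
)


def handle(conn):
    """Answer one request, then hold the socket open like a keep-alive server."""
    with conn:
        conn.settimeout(KEEP_ALIVE_SECONDS)
        conn.recv(4096)          # read the request (small enough for one recv)
        conn.sendall(RESPONSE)
        try:
            conn.recv(4096)      # wait for another request or the client's close
        except socket.timeout:
            pass                 # timeout expired; leaving the block closes the socket


def serve(listener, connections):
    for _ in range(connections):
        conn, _addr = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def response_complete(data):
    """True once the headers and Content-Length bytes of body have arrived."""
    head, separator, body = data.partition(b"\r\n\r\n")
    if not separator:
        return False
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return len(body) >= int(value.decode())
    return False


def fetch(port, wait_for_eof):
    """Send an HTTP/1.0 request with no Connection header; return elapsed seconds."""
    start = time.monotonic()
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
        data = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break            # the server closed the connection
            data += chunk
            if not wait_for_eof and response_complete(data):
                break            # we have the whole response; stop waiting
    return time.monotonic() - start


listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
threading.Thread(target=serve, args=(listener, 2), daemon=True).start()

slow = fetch(port, wait_for_eof=True)
fast = fetch(port, wait_for_eof=False)
print(f"wait for server close: {slow:.2f}s, stop at Content-Length: {fast:.2f}s")
```

The server behaves identically in both runs. Only the client's idea of when the response is finished changes, and that alone accounts for the whole timeout.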
A more reliable way to ensure that the server closes the
connection (and to avoid strange keep-alive related benchmarking
artifacts) is to explicitly tell the server to close the
connection instead of assuming that it should be kept open. This
can be easily accomplished by sending a Connection:
close header along with the request:
$ ab -H "Connection: close" -n1 -c1 http://seancoates.com/codes
(snip)
Concurrency Level: 1
Time taken for tests: 0.118 seconds
Complete requests: 1
Failed requests: 0
Write errors: 0
Total transferred: 5944 bytes
HTML transferred: 5669 bytes
Requests per second: 8.48 [#/sec] (mean)
Time per request: 117.955 [ms] (mean)
Time per request: 117.955 [ms] (mean, across all concurrent requests)
Transfer rate: 49.21 [Kbytes/sec] received
(snip)
118ms? That's more like it! A longer, more aggressive (and
concurrent) benchmark gives me a result of 88.25
requests per second. That's in the ballpark of what I was
expecting for this hardware and URL.
The moral of the story: state the persistent connection behaviour explicitly whenever making HTTP requests.
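The same advice applies to any HTTP client, not just ab. As a minimal sketch using Python's standard library (the throwaway local server and the helper name are only there so the example is self-contained), this shows that the client sends no Connection header unless you ask for one, and how to ask for one:

```python
import http.client
import http.server
import threading


class EchoConnectionHeader(http.server.BaseHTTPRequestHandler):
    """Throwaway server: replies with the Connection header it received."""

    protocol_version = "HTTP/1.1"  # keep-alive is then the server's default

    def do_GET(self):
        body = (self.headers.get("Connection") or "(none)").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = http.server.HTTPServer(("127.0.0.1", 0), EchoConnectionHeader)
threading.Thread(target=server.serve_forever, daemon=True).start()


def connection_header_seen_by_server(headers):
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request("GET", "/", headers=headers)
    seen = conn.getresponse().read().decode()
    conn.close()
    return seen


# Left to its defaults, the client says nothing about persistence.
implicit = connection_header_seen_by_server({})
# Stating the behaviour explicitly removes the guesswork on both sides.
explicit = connection_header_seen_by_server({"Connection": "close"})
server.shutdown()

print("default request: ", implicit)
print("explicit request:", explicit)
```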
Webshell is a console-based, JavaScripty web client utility that is great for consuming, debugging and interacting with APIs.
I use Firefox as my primary browser. The main reason I've been faithful to Mozilla is my set of add-ons. I use Firebug regularly, and I'm not sure what I'd do without JSONovich.
Last year, as I built Gimme Bar's internal API, I found myself using Curl extensively, and occasionally Poster, to test and debug my code.
These two tools have allowed me to interact with HTTP, but not in an optimal way. Poster's UI is clunky and isn't scriptable (without diving into Firefox extension internals), and Curl requires a lot of Unixy glue to process the results into something more usable than visual inspection.
I wanted something that would not only make requests, but would let me interact with the result of these requests.
When working with Evan to debug a problem one day, I mentioned my problem, and said "I really should build something that fixes this." Evan suggested that such a thing would be really useful to him, too, and that he'd be interested in working on it.
I'd planned on building my version of the tool in PHP. Evan is… not a PHP guy. He's a [whisper]Ruby[/whisper] guy.
If you've seen me speak at a conference, lately, you've probably seen this graphic:
It shows that we have diverse roles in Gimme Bar, but everyone who touches our code can speak JavaScript. (This is another, much longer post that I maybe should write, but in the meantime, see this past PHP Advent entry.)
Thus, Evan suggested that we write Webshell in JavaScript, with node.js as our "framework." Despite the aforementioned affinity for Ruby (cheap shots are fun! (-: ), Evan is a pretty smart guy. It turns out that this was not only convenient, but working with HTTP traffic (especially JSON results (of course)) is way better with JavaScript than it would have been with PHP.
So, Webshell was born. If you want to see exactly what it does, you should take a look at the readme, which outlines almost all of its functionality.
If you use curl, or any sort of other ad-hoc queries to inspect, consume, debug or otherwise touch HTTP, I hope you'll take a look at Webshell. It saves me several hours every week, and most of our Gimme Bar administration is done with it. Also, it's on GitHub so please fork and patch. I'd love to see pull requests.
I'm happy to report that Gimme Bar has been running very well on MongoDB since early February of this year. I previously posted on some of the reasons we decided to move off of CouchDB. If you haven't read that, please consider it a prerequisite for the consumption of this post.
Late last year, I knew that we had no choice but to get off of CouchDB. I was dreading the port. The dread was two-fold. I dreaded learning new database software, its client interface, administration techniques, and general domain-knowledge, but I also dreaded taking time away from progress on Gimme Bar to do something that I knew would help us in the long term, but was hard to justify from a "product" standpoint.
I did a lot of reading on MongoDB, and I consulted with Andrei, who'd been using MongoDB with Mapalong since they launched. In the quiet void left by the holiday, on New Year's day this year, I seized the opportunity of absent co-workers, branched our git repository, put fingers-to-keyboard—which I suppose is the coding version of pen-to-paper—and started porting Gimme Bar to Mongo.
I expected the road to MongoDB to be long, twisty, and paved with uncertainty. Instead, what I found was remarkable—incredible, even.
Kristina Chodorow has done a near-perfect job of creating the wonderful tandem that makes up PHP's MongoDB extension and its most-excellent documentation. If it weren't for Kristina (and her employer, 10gen, for dedicating her time to this), the porting might have been as-expected: difficult and lengthy. Instead, the experience was pleasant and straightforward. We're not really used to this type of luxury in the PHP world. (-:
From the start, I knew that our choice of technologies carried a certain amount of risk. I'm kind of a risk-averse person, so I like to weigh the benefits (some of which I outlined in the aforementioned post), and mitigate this risk whenever possible. My mitigation technique involved making my models as dumb as possible about what happens in the code between the models and the database. I wasn't 100% successful in keeping things entirely separate, but the abstraction really paid off. I had to write a lot of code, still, but I didn't have to worry too much about how deep this code had to reach. Other than a few cases, I swapped my CouchDB client code out for an extremely thin wrapper/helper class and re-wrote my queries. The whole process took only around two weeks (of most of my time). Testing, syncing everyone, rebuilding production images and development virtual machine images, and deployment took at least as long.
That was the story part. Here comes the opinion part (and remember, this is just my opinion; I could very well be wrong).
After using both extensively (for a very specific application, admittedly), I firmly believe that MongoDB is a superior NoSQL datastore solution for PHP-based, non-distributed (think Dropbox), non-mobile web applications.
This opinion stems almost fully from Mongo's rich query API. In the current version of Gimme Bar, we
have a single map/reduce job (for tags). Everything else has been
replaced by a straightforward and familiar query. The map/reduce
is actually practical, and things like sorting and counting are a
breeze with Mongo's cursors. I did have to cheat in a
few places that I don't expect to scale very well (I used
$in when I should denormalize), but the beauty of
this is that I can do these things now, where with
Couch, my only option was to denormalize and map. Yes, I know
this carries a scaling/sharding and performance penalty, but you
know what? I don't care yet. ("Yet" is very important.)
MongoDB also provides a few other things to developers that were absent in CouchDB. For example, PHP speaks to Mongo through the extension and a native driver. CouchDB uses HTTP for transport. HTTP carries a lot of overhead when you need to do a lot of single-document requests (for example, when topping up a pagination set that's had records de-duplicated). My favourite difference, though, is in the atomic operations, such as findAndModify, which make a huge difference both logic- and performance-wise, at least for Gimme Bar.
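To illustrate why an atomic find-and-modify matters, here's a toy in Python. It uses an in-memory stand-in, not MongoDB or its driver; the class, the "claim a job" scenario, and all field names are assumptions for the sake of the example. Because finding a document and changing it happen as one step, several workers can pull from the same set of documents without ever claiming the same one twice.

```python
import threading


class TinyCollection:
    """In-memory stand-in for a collection; this is not MongoDB's API."""

    def __init__(self, docs):
        self._docs = docs
        self._lock = threading.Lock()

    def find_and_modify(self, match, update):
        """Find the first matching document and update it as one atomic step.

        Returns a copy of the document as it was before the update, or None.
        """
        with self._lock:
            for doc in self._docs:
                if all(doc.get(key) == value for key, value in match.items()):
                    before = dict(doc)
                    doc.update(update)
                    return before
        return None


jobs = TinyCollection([{"_id": i, "state": "new"} for i in range(50)])
claimed = []


def worker(name):
    while True:
        job = jobs.find_and_modify({"state": "new"}, {"state": "taken", "by": name})
        if job is None:
            return                    # nothing left to claim
        claimed.append(job["_id"])    # list.append is thread-safe in CPython


threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(claimed), "jobs claimed,", len(set(claimed)), "distinct")
```

With a separate "find" followed by an "update", two workers could both read the same document as "new" before either one marked it as taken; doing it in one operation also saves a round trip.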
Of course, there are two sides to every coin. There are CouchDB features that I miss. Namely: replication, change notification, CouchDB-Lucene (we're using ElasticSearch and manual indexing now), and Futon.
Do I think MongoDB is superior to CouchDB? It depends what you're using it for. If you need truly excellent eventual-consistency replication, CouchDB might be a better choice. If you want to have your JavaScript applications talk directly to the datastore, CouchDB is definitely the way to go. Do I have a problem with CouchDB, its developers or its community? Not at all. It's just not a good fit for the kind of app we're building.
The bottom line is that I'm extremely happy with our port to MongoDB, and I don't have any regrets about switching other than not doing it sooner.
As mentioned in a previous post, we started building Gimme Bar a little over a year ago. We did a lot of things right, but we also did some things wrong.
Since, in those early days, I was the only developer, and since most of my professional development experience is in PHP, the choice of language was obvious. I also started building the API before the front-end. I chose a really simple routing model for the back-end, and got to work, sans framework. Our back-end code is still really lean (for the most part), and I'm still (mostly (-: ) proud of it.
When it came time to select a datastore, I chose something a bit more risky, with Cameron's blessing.
Having just spent the best part of a year and a half working with PostgreSQL at OmniTI, I felt it was time to try something new. We knew this carried risks, but the timing was good, and—quite frankly—I was simply bored of hacking on stored procedures in PL/pgSQL. We wanted something that could be expected to scale (eventually, when we need it), without deep in-house expertise, but also something that I'd find fun to work on. I love learning new things, so we thought we'd give one of the NoSQL solutions a whirl.
In those days (January 2010), the main NoSQL contenders for building a web application were—at least in our minds—CouchDB and MongoDB. Also in those days, MongoDB didn't quite feel like it was ready for us. I could, of course, be wrong, but I figured that since I didn't know either of these systems very well, the best way to find out was to just pick one and go with it. The thing that ultimately pushed us to the CouchDB camp was a mild professional relationship with some of the CouchDB guys. So, we built the first versions of Gimme Bar on top of Linux, Apache, PHP 5.3, Lithium (on the front-end), jQuery and CouchDB.
By summer 2010, we began work on adding social features (which have since been hidden) to Gimme Bar, and CouchDB started giving us trouble. This wasn't CouchDB's fault, really. It was more of an architectural problem. We were trying to solve a relational problem with a database that by-design knew nothing about how to handle relationships.
Now might be a good time to explain document-independence and map/reduce, but I fear that would take up more attention than you've kindly offered to this article, and it's going to be long even without a detailed tutorial. Here's the short version: CouchDB stores structured objects as (JSON) documents. These documents don't know anything about their peers. To "query" (for lack of a better term) Couch, you need to write a map function (in JavaScript or Erlang, by default) that is passed every document in the database and, for each document that matches your map's criteria, emits keys and values to an index. These keys can be (roughly) sorted, and to "query" your documents, you jump to a specific part of this sorted index and grab one or more documents in the sequence. From what I understand of map/reduce (and my only practical experience so far is with CouchDB), this is how other systems such as Hadoop work, too.
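The short version above can be sketched in a few lines of Python. This is a toy model rather than CouchDB itself: the documents and field names are invented, and Python's ordinary list ordering stands in for CouchDB's key collation.

```python
import bisect

# Hypothetical documents; the field names are illustrative only.
documents = [
    {"_id": "a1", "type": "asset", "owner": "bob", "captured": 1300000300},
    {"_id": "a2", "type": "asset", "owner": "chris", "captured": 1300000100},
    {"_id": "u1", "type": "user", "name": "aaron"},
    {"_id": "a3", "type": "asset", "owner": "bob", "captured": 1300000200},
]


def map_assets_by_owner(doc):
    """Map function: emit (key, value) pairs for documents that match."""
    if doc.get("type") == "asset":
        yield [doc["owner"], doc["captured"]], doc["_id"]


def build_index(docs, map_fn):
    """Pass every document to the map and sort the emitted rows by key."""
    rows = [(key, value) for doc in docs for key, value in map_fn(doc)]
    rows.sort(key=lambda row: row[0])
    return rows


def query_range(index, start_key, end_key):
    """Jump to start_key in the sorted index and read rows up to end_key."""
    keys = [key for key, _ in index]
    low = bisect.bisect_left(keys, start_key)
    high = bisect.bisect_right(keys, end_key)
    return index[low:high]


index = build_index(documents, map_assets_by_owner)
bobs_assets = query_range(index, ["bob", 0], ["bob", float("inf")])
for key, doc_id in bobs_assets:
    print(key, doc_id)
```

Note that the "query" never looks at the documents themselves. It only reads a contiguous slice of the sorted index, which is why the shape of the emitted key decides which questions you can ask.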
There is tremendous value to a system like this. Once the index is generated, it can be incrementally updated, and querying a huge dataset is fast and efficient. The reduce side of map/reduce (we had barely a handful of reduce functions) is also incredibly powerful for calculating aggregates of the map data, but it's also intentionally limited to small subsets of the mapped data. These intentional limits allow map/reduce functions to be highly parallelizable. To run a map on 100 servers, the dataset can be split into 100 pieces, and each server can process its individual chunk safely and in parallel.
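Here's a rough Python sketch of that splitting idea, under the assumption of a simple tag count (the documents are made up, and two worker threads stand in for the 100 servers). Each chunk is mapped and reduced on its own, and the partial results are then combined:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical documents; only the "tags" field matters here.
documents = [
    {"_id": 1, "tags": ["php", "mongodb"]},
    {"_id": 2, "tags": ["couchdb"]},
    {"_id": 3, "tags": ["php"]},
    {"_id": 4, "tags": ["mongodb", "php"]},
]


def map_doc(doc):
    """Map: emit one (tag, 1) pair per tag on the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def map_and_reduce_chunk(chunk):
    """What one server would do: map its own documents, then reduce them."""
    counts = Counter()
    for doc in chunk:
        for key, value in map_doc(doc):
            counts[key] += value
    return counts


def split(docs, pieces):
    """Deal the documents into `pieces` roughly equal chunks."""
    return [docs[i::pieces] for i in range(pieces)]


with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(map_and_reduce_chunk, split(documents, 2)))

# Combining partial results is safe because addition does not care how the
# data was grouped; that restriction is what makes the split possible.
totals = sum(partials, Counter())
print(dict(totals))
```

No chunk needs to know anything about the others, which is the same document-independence that makes relationships between documents awkward to express.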
This power and flexibility has an architectural cost. Over a decade of professional development with various relational databases taught me that in order to keep one's schema descriptive and robust, one must always (for small values of "always") keep data normalized until a performance problem forces denormalization. With a document-oriented datastore like CouchDB or MongoDB, denormalization is part of the design.
A while ago, I made an extremely stripped-down example of how something like user relationships are handled in Gimme Bar with CouchDB. This document is for the user named "aaron" (_id: c988a29740241c7d20fc7974be05ec54). Aaron is following bob (_id: c988a29740241c7d20fc7974be05f67d), chris (_id: c988a29740241c7d20fc7974be05ff71), and dale (_id: c988a29740241c7d20fc7974be060bb4). You can see the references to the "following" users in aaron's document. I also published example maps of how someone might go about querying this (small) set.
The specific problem that we ran into with CouchDB is that our "timeline" page showed the collected assets of users that the currently-logged-in user is following. So, aaron would see assets that belong to bob, chris and dale. This, in itself, isn't terribly difficult; we just needed to query once for each of aaron's follows. The problem was further complicated when a requirement was raised to not only see the above, but also to collapse duplicates into one displayed asset (if bob and chris collected the same asset, aaron would only see it once). Oh, and also, these assets needed to be sorted by their capture time. These requirements made the chain of documents extremely complicated to query. In a relational system, a few (admittedly expensive) joins would have taken care of it in short order.
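Stated as plain code, the requirement looks small. The following Python sketch uses invented in-memory data (the field names, the timestamps, and the choice to keep the most recent capture of a duplicate and to sort newest-first are all my assumptions), and it is the logic we wanted, not how it could be expressed as CouchDB views:

```python
# Hypothetical data: who aaron follows, and each user's captured assets.
following = {"aaron": ["bob", "chris", "dale"]}
assets_by_owner = {
    "bob": [
        {"media_hash": "h1", "captured": 300},
        {"media_hash": "h2", "captured": 100},
    ],
    "chris": [{"media_hash": "h1", "captured": 250}],
    "dale": [{"media_hash": "h3", "captured": 200}],
}


def timeline(user):
    """Collect followed users' assets, collapse duplicates, sort by capture time."""
    newest_by_hash = {}
    for followed in following[user]:                 # one "query" per follow
        for asset in assets_by_owner.get(followed, []):
            current = newest_by_hash.get(asset["media_hash"])
            # Collapse duplicates: keep only the most recent capture.
            if current is None or asset["captured"] > current["captured"]:
                newest_by_hash[asset["media_hash"]] = {**asset, "owner": followed}
    return sorted(
        newest_by_hash.values(), key=lambda asset: asset["captured"], reverse=True
    )


for asset in timeline("aaron"):
    print(asset["captured"], asset["media_hash"], asset["owner"])
```

The trouble is that the de-duplication and the sort both need to see every followed user's assets at once, and that is exactly the kind of cross-document knowledge a map function doesn't have.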
I spent a lot of time fighting with CouchDB to solve this problem. I asked in the #couchdb channel on Freenode, posted to the mailing list and even resorted to StackOverflow (a couple times) before coming up with a "solution." I put the word "solution" in quotes there because what I was told to do only partially solved our problem.
The general consensus was that we should denormalize our follow/following + asset records in an extreme way (as you can see in the StackOverflow posts, above). I ended up creating an interim index of all of a user's followers/following links, plus an index of all of the media hashes (what we use to uniquely identify assets, even when captured by different users). Those documents got pretty big pretty quickly (even though we had fewer than 100 users at the time). Here's an example: Cameron's FollowersIndex document.
As you might guess, even a system designed to handle large documents like this (such as CouchDB) would have a hard time with the sheer size. Every time an asset was captured, it would get injected into the FollowersIndex documents, which caused a reindex… which used up a lot of RAM, and caused bottlenecks. Severe bottlenecks. Our 8GB of RAM was easily exhausted by our JavaScript map function. Think about that. 8GB… for <100 users. This was not going to survive. Turns out we were exhausting Erlang's memory allocator and actually crashing CouchDB. From userspace. I asked around, and the proposed solution to this problem-within-a-problem was to re-write the JavaScript map as Erlang to avoid the JSON conversion overhead. At this point, I was desperate. I had Evan (who is a valuable member of the team, and is a far superior computer scientist to me) translate the JS to Erlang. What he came up with made my head hurt, but it worked. And by "worked," I mean that it didn't crash CouchDB and send it into a recovery spiral (crash, respawn, reindex, crash, repeat)… but it did work. Enough to get us by for a few weeks, and that's what we did: get by. The index regeneration for the friends feed was so slow that I had to use delayed indexes and reindex in cron every minute. CouchDB was using most of our computing resources, and we knew we couldn't sustain any growth with this system.
At this point, we decided to cut our losses, and I went to investigate other options, including MySQL and MongoDB. My next blog post will be on why I think MongoDB is a superior solution for building web applications, despite CouchDB being better in certain areas.