I wish someone wrote django-static-upstream… maybe even… me!

2011/12/28 § 7 Comments

I used to think serving static files (aka static assets) is really easy: configure nginx to serve a directory and you’re done. But things quickly became more complicated as issues like asset compilation, CDNs/scalability, file-specific custom headers, deployment complexity and development/production parity rear their ugly heads. Judging by the huge number of different asset management packages on djangopackages, it seems like I’m not the only one who ran into this problem and felt not-entirely-satisfied with all the other solutions out there. Things actually took a turn for the worse ever since I started drinking the Heroku cool-aid, and for the live of me I just can’t make sense of their best-practice with regard to serving static files. The heroku-django quickstart conveniently sidesteps the issue of statics, and while there are a few support articles that lightly touch on neighboring subjects, nothing I found was spot-on and hands-on (this is an exception to the rule; the Heroku cool-aid is otherwise very tasty and easy to drink, to my taste so far). Ugh, why can’t there be a silver bullet to solve all this? Let me tell you about my “wishlist” for the best static serving method evar.

First, I want to be able to take any checkout from my VCS, maybe run an easy bootstrap function, and get a working development environment with statics served. In production, I want to serve my statics from a CDN with aggressive caching, so I need some versioning system, but I’d like to minimize deployment complexity and I want fine-grained cache invalidation of my statics. I want my statics to be served the same way (same headers, same versioning mechanism) in development and production without having to update two different locations (i.e., my S3 syncing script and my development nginx configuration). I also don’t want to have to “garbage-collect” my old statics from S3 every so and so days. Like I said, I’d like some of my statics to be served with some bells and whistles, like various custom headers (Access-Control-Allow-Origin, anyone?) or gzip compression. Speaking of bells and whistles, how about a whole marching band, since I want to serve statics that require compilation (minify/concatenate/compile scss+coffe/spritalize/etc), but I don’t want to have to rerun a ‘build process’ every time I touch a coffee script file in development. Finally, and this isn’t something I’m not very adamant about, but I prefer my statics to be served from a different subdomain (static, not www), I think it’s cleaner, I don’t need my clients’ cookies with every static request and it allows for some tricks (like using a CDN with support for a custom origin).

And nothing I found does all that, definitely not easily. In my dream, there’s a package called django-static-upstream, which is designed to provide a holistic approach to all these issues. I’m thinking:

  • a pure Python/django static HTTP server (probably just django.views.static with support for the bells and whistles as mentioned above), and yeah, I think I should bloody use this server as a backend to serve my in production
  • a “vhost” middleware that will replace request.urlconf based on the Host: header; if the host starts with some prefix (say, static), the request will be served by the webserver above
  • a couple of template tags like {% static "images/logo.png" %} that will create content-hashed links to the static webserver (i.e., //static.example.com/829dd67168a3/images/logo.png); the static server will know to ignore the content-hash bit
  • this isn’t really up to the package, but it should be built to support easily setting a custom-origin supporting CDN (like CloudFront) as the origin URL; this is both to serve the statics from nearby edges CDN but also (maybe more importantly) to serve as a caching reverse proxy so the dynamic server will be fairly idle
  • support for compiling some static types on the fly (coffeescript, scss, etc) and returning the rendered result; the result may have to be cached using django’s cache (keyd by a content hash), but this is more to speed up multi-browser development where there is no CDN to serve as a reverse caching proxy than because I worry about production

So now I’m thinking maybe I should write something like this. There are two reasons I’m blogging about a package before I even wrote it; first, since I wanted to flesh out in my mind what is it that I want from it. But second, and more importantly, because I’d like to tread carefully (1) before I have the hubris to start yet another assets related django package, (2) before I start serving static content with a dynamic language (what am I, mad?) and (3) because compiling static assets on runtime in violation of the fifth factor in Adam Wiggins’ twelve-factor app manifesto (what, you didn’t read it yet? what’s wrong with you?). These are quite a few warning signs to cross, and I’d like to get some feedback before I go there. But I honestly think I’ll be happier if I had a package to do all this, I don’t think writing it should be so hard and I hope you’d be happy using it, too.

The ball’s in your court, comment away.

nginx+gzip module might silently corrupt data upon backend failure

2011/11/04 § 3 Comments

There are several elements that make absolutely certain the page you’re reading in your browser is an accurate representation of the resource the HTTP server meant to send you1. Disregarding caching for a minute, we have two elements making sure the representation you get is protected from errors. The first protecting element is, of course, TCP, making sure that if the server wrote two-hundred bytes in a particular order, either they’ll all arrive to your end (in order and without errors) or your TCP stack will realize something bad happened and give your user-agent (your browser) a chance to cope with the error. The need for the second protecting element is a bit more sneaky: TCP will guarantee everything the server wrote will arrive, i.e., bytes for which the server called write(2) or equivalent will arrive (or you’ll know something went wrong). But what about bytes the server should have written but didn’t write all – for example, because some component on the server’s side failed?

The original HTTP (HTTP 0.9, 1996 time) didn’t cope with this situation at all. The signal to the client that the server finished talking was to disconnect the TCP session, which, from the client’s side, is a vague signal. Did the TCP server disconnect because it finished or because it ran into trouble (software fault, sysadmin action, kernel behaviour due to memory pressure or even a bug, etc)? Thankfully, current HTTP kicks in to complement TCP, allowing the server to do one of several things in order to make sure you’ll at least know you didn’t receive the whole picture. By far the two most common thing the server will do are to specify a Content-Length in the response’s header or to use a Transfer-Encoding, most probably chunked transfer encoding.

Content length is simple to grasp. The server wishes to say 200 bytes. It explicitly says: “I will say 200 bytes” in the response header. If the user-agent didn’t receive 200 bytes of response, it knows something went wrong. Chunked transfer encoding is only slightly more complex – the server will send the response in chunks, each chunks prefixed by the length of the chunk. The end of the document is marked by a zero-length chunk. So if the user-agent saw a chunk cut in the middle, or didn’t receive a zero-length chunk, it also knows something went wrong and has a chance to decide what to do about it. For example, when faced with incorrect content length, Chrome displays an ERR_CONNECTION_CLOSED error, whereas Firefox would display the portion of the page it did receive. Different behaviour, yes, but at least both user-agents in this example had a chance realize the response they received is partial. Which is really, really important, you know why? I’ll tell you why.

Enter caching. HTTP caching is a non-trivial matter with many unexpected gotchas and pitfalls, and I can’t cover it all here (why the complexity? I think it’s because all caching is an intentional form of data/state repetition, and repetition is something that in my experience humans often have difficulty reasoning about). By far the best document I know about HTTP caching is this splendid guide, but if you’re in a hurry or impatient, let me summarize the points interesting for this particular post. First, caches might exist in many places, some of them might be surprising, some of them might be slightly broken or at least very aggressive (ISP transparent caches, mutter mutter cough cough). Second, among many other things, HTTP caching lets a server give a client a token together resource, telling the client “next time you request this resource, tell me you have this token; maybe I’ll just tell you that the representation you got with this token is still fresh, without transferring it all over again”. This is called an ETag, and the response that says “just use what you have in your cache” is called HTTP 304 NOT MODIFIED.

How is this relevant to HTTP responses cut in the middle? Well, if servers didn’t have a way of telling the user-agent how long is the document, and if the response was cut in the middle due to a server fault as described above, the user-agent/sneaky-caching-proxy might cache incorrect responses. If the server also sends an ETag along with the response, the caching entity will store this ETag along with the invalid cached representation, and even when it’s time to check the representation’s freshness with the origin server, the server will just take a look at the ETag, say “yep, this is fine”, tell the cache to keep using the bad representation and <sinister>never ever let it recover</sinister>. If this happens on a large ISP’s transparent cache, easily tens of thousand of your users could be affected. If the affected resource is a common element in many of your pages with strict syntax checking, like a javascript resource, you’re kinda screwed. The only hope in such a condition is that the client, for some reason, will specify Cache-Control: no-cache in the request, and that caching entities along the path to the server will honour this request. Browsers like caching, so they won’t usually request no-cache, although AFAIK, recently Chrome started sending no-cache when the user explicitly requests a force-reload (Cmd-R on a Mac). Other browsers don’t fare as well, and I think that hoping one of your Chrome users will force reload the bad resource in time to save the day is hardly a sturdy solution.

Bottom line is, it’s really important to know when a representation of a resource is broken. Which is why I was quite amazed to learn that my HTTP server of choice, nginx, doesn’t validate the Content-Length it receives from its upstreams and is simply unaware when the response it received from an upstream server is chopped off. If your response specifies a content length but closes the connection without delivering enough bytes, nginx will simply stall the request for a long time without closing the connection downstream, even though it has no hope of receiving additional data to push downstream. I tried this both with proxy_pass and uwsgi_pass, but I’m quite confident it’s true for other backends (fastcgi_pass, scgi_pass, etc). This is bad, but not as bad as the case where you want an nginx module to manipulate your content, removing existing content length/transfer encoding and applying its own (the gzip module indeed does that). If a backend error occurs while content-length-oblivious-nginx is altering the data, the content altering module will apply what it applies to the bytes it received, add new content-length/transfer-encoding, assuring everyone the response is OK, and entice user-agents or even proxies to enter the almost-never-recover bad cache scenario I described in the previews paragraph. Ouch!

The proper way to fix this, IMHO, is that nginx simply must start looking at the upstream’s content length (or transfer encoding, once nginx starts using chunked responses with its upstreams). Part of the reason I’m writing this post is that Maxim Dounin, venerable nginx comitter and an OK chap overall, told me he doesn’t consider this a top priority at the moment, but I humbly disagree with his assessment of how serious the issue is. Until such a time as nginx is fixed about this, I think you must disable all content-manipulating nginx modules and instead handle all message length affecting work in your upstream (compression, addition, etc). This is what I opted to do with my django based web app, I replaced nginx’s gzip module with Django’s GZipMiddleware. It’s a terrible shame though. It’s doing the job of nginx for it, probably in a lesser fashion than how nginx could, it violates a must not clause in Python’s WSGI PEP333, and I have empiric proof that Tim Berners-Lee chokes a kitten every time you do it.

But what’s the alternative? Risk invisibly cached corrupt data for an undetermined length of time? Ditch nginx, which I think is the best HTTP server on this planet despite this debacle? Nah. Both are unacceptable.


1 This post assumes convenient values of “absolutely certain”; also, everything related to security/content tampering is out of scope in this post. I’m talking about possibly misbehaving but certainly well-meaning components.

Introducing django-ajaxerrors

2011/01/31 § Leave a comment

Today I finished a small django middleware that facilitates development of AJAX applications. I reckon most if not all Django developers know about Django’s useful debug-mode unhandled error page (you know, the one that looks like this). However, when an AJAX request reaches a faulty view, the error page will not be rendered in your browser but instead be received by your AJAX error handler (assuming you even had one), which is almost always not what you want. This forces you you to find some other way to reach your traceback information. For example, before I wrote this package, I used to regularly open Chrome’s developer tools, find the failed resource in the Resources tab, and then either read through the raw HTML (yuck) or copy and paste it to a file and double click it (tedious).

As you can see, this bothered other people, too, but I couldn’t find a decent solution on the web. Thankfully, since the problem is really about ease of development and not very relevant in environments where DEBUG is false (you’d get the traceback via email anyway), and since I do most of my development work locally (and I suspect so do many other Django developers), I figured the solution can take advantage of the server being a full fledged desktop with a modern browser and a GUI. Enter ajaxerrors.middleware.ShowAJAXErrors.

This little middleware intercepts all unhandled view exceptions, pickles the technical error page and uses Python’s webbrowser module to direct a new browser tab at a special URL that will serve (and delete) the previously stored page. All this is only triggered if DEBUG and request.is_ajax() is true, so pretty much everything you’re used to in your development flow should stay the same. Sweet. ShowAJAXError can also be configured to run arbitrary user-defined handlers with the error information, and even comes with a handler for growlnotify and a handler that replies to all failed AJAX calls with HTTP OK result containing an empty JSON object (I use it for development, YMMV).

django-ajaxerrors is on PyPI, so you can install it with pip or easy_install with ease. You can clone/fork the code (and read the more detailed readme) here. Also, if you’d like to see what django-ajaxerrors is like without any special effort, you can download a simple AJAX app I’ve written for the purpose of developing django-ajaxerrors itself. The app has django-ajaxerrors contained within it, so this is pretty much all you need to see django-ajaxerrors in action about 1 minute from now:

$ virtualenv -q demo
$ pip install -q -E demo django
$ cd demo
$ source bin/activate
(demo)$ curl -s -o - http://cloud.github.com/downloads/yaniv-aknin/django-ajaxerrors/ajaxerrors-demo.tar.bz2 | tar jx
(demo)$ cd ajaxerrors-demo/
(demo)$ python manage.py runserver
# demoserver running...

Now point your browser at the development server, and play with the calculator a bit. Hint: the calculator is not protected against division by zero. :)

I hope you like django-ajaxerrors, it’s MIT licensed, so do what you will with it. Let me know if you need assistance or ran into a snag, or, even better, if you want to contribute something to it.

Where Am I?

You are currently browsing entries tagged with django at NIL: .to write(1) ~ help:about.