I wish someone wrote django-static-upstream… maybe even… me!

2011/12/28 § 7 Comments

I used to think serving static files (aka static assets) is really easy: configure nginx to serve a directory and you’re done. But things quickly became more complicated as issues like asset compilation, CDNs/scalability, file-specific custom headers, deployment complexity and development/production parity rear their ugly heads. Judging by the huge number of different asset management packages on djangopackages, it seems like I’m not the only one who ran into this problem and felt not-entirely-satisfied with all the other solutions out there. Things actually took a turn for the worse ever since I started drinking the Heroku cool-aid, and for the live of me I just can’t make sense of their best-practice with regard to serving static files. The heroku-django quickstart conveniently sidesteps the issue of statics, and while there are a few support articles that lightly touch on neighboring subjects, nothing I found was spot-on and hands-on (this is an exception to the rule; the Heroku cool-aid is otherwise very tasty and easy to drink, to my taste so far). Ugh, why can’t there be a silver bullet to solve all this? Let me tell you about my “wishlist” for the best static serving method evar.

First, I want to be able to take any checkout from my VCS, maybe run an easy bootstrap function, and get a working development environment with statics served. In production, I want to serve my statics from a CDN with aggressive caching, so I need some versioning system, but I’d like to minimize deployment complexity and I want fine-grained cache invalidation of my statics. I want my statics to be served the same way (same headers, same versioning mechanism) in development and production without having to update two different locations (i.e., my S3 syncing script and my development nginx configuration). I also don’t want to have to “garbage-collect” my old statics from S3 every so and so days. Like I said, I’d like some of my statics to be served with some bells and whistles, like various custom headers (Access-Control-Allow-Origin, anyone?) or gzip compression. Speaking of bells and whistles, how about a whole marching band, since I want to serve statics that require compilation (minify/concatenate/compile scss+coffe/spritalize/etc), but I don’t want to have to rerun a ‘build process’ every time I touch a coffee script file in development. Finally, and this isn’t something I’m not very adamant about, but I prefer my statics to be served from a different subdomain (static, not www), I think it’s cleaner, I don’t need my clients’ cookies with every static request and it allows for some tricks (like using a CDN with support for a custom origin).

And nothing I found does all that, definitely not easily. In my dream, there’s a package called django-static-upstream, which is designed to provide a holistic approach to all these issues. I’m thinking:

  • a pure Python/django static HTTP server (probably just django.views.static with support for the bells and whistles as mentioned above), and yeah, I think I should bloody use this server as a backend to serve my in production
  • a “vhost” middleware that will replace request.urlconf based on the Host: header; if the host starts with some prefix (say, static), the request will be served by the webserver above
  • a couple of template tags like {% static "images/logo.png" %} that will create content-hashed links to the static webserver (i.e., //static.example.com/829dd67168a3/images/logo.png); the static server will know to ignore the content-hash bit
  • this isn’t really up to the package, but it should be built to support easily setting a custom-origin supporting CDN (like CloudFront) as the origin URL; this is both to serve the statics from nearby edges CDN but also (maybe more importantly) to serve as a caching reverse proxy so the dynamic server will be fairly idle
  • support for compiling some static types on the fly (coffeescript, scss, etc) and returning the rendered result; the result may have to be cached using django’s cache (keyd by a content hash), but this is more to speed up multi-browser development where there is no CDN to serve as a reverse caching proxy than because I worry about production

So now I’m thinking maybe I should write something like this. There are two reasons I’m blogging about a package before I even wrote it; first, since I wanted to flesh out in my mind what is it that I want from it. But second, and more importantly, because I’d like to tread carefully (1) before I have the hubris to start yet another assets related django package, (2) before I start serving static content with a dynamic language (what am I, mad?) and (3) because compiling static assets on runtime in violation of the fifth factor in Adam Wiggins’ twelve-factor app manifesto (what, you didn’t read it yet? what’s wrong with you?). These are quite a few warning signs to cross, and I’d like to get some feedback before I go there. But I honestly think I’ll be happier if I had a package to do all this, I don’t think writing it should be so hard and I hope you’d be happy using it, too.

The ball’s in your court, comment away.

Walking Python objects recursively

2011/12/11 § 6 Comments

Here’s a small function that walks over any* Python object and yields the objects contained within (if any) along with the path to reach them. I wrote it and am using it to validate a deserialized datastructure, but you can probably use it for many things. In fact, I’m rather surprised I didn’t find something like this on the web already, and perhaps it should go in itertools.

Edit: Since the original post I added infinite recursion protection following Eli and Greg’s good advice, added Python 3 compatibility and did some refactoring (which means I had to add proper unit test). You will always be able to get the latest version here, on ActiveState’s Python Cookbook (at least until it makes its way into stdlib, fingers crossed…).

from collections import Mapping, Set, Sequence

# dual python 2/3 compatability, inspired by the "six" library
string_types = (str, unicode) if str is bytes else (str, bytes)
iteritems = lambda mapping: getattr(mapping, 'iteritems', mapping.items)()

def objwalk(obj, path=(), memo=None):
    if memo is None:
        memo = set()
    iterator = None
    if isinstance(obj, Mapping):
        iterator = iteritems
    elif isinstance(obj, (Sequence, Set)) and not isinstance(obj, string_types):
        iterator = enumerate
    if iterator: 
        if id(obj) not in memo:
            memo.add(id(obj))
            for path_component, value in iterator(obj):
                for result in objwalk(value, path + (path_component,), memo):
                    yield result
            memo.remove(id(obj))
    else:       
        yield path, obj

And here’s a little bit of sample usage:

>>> tuple(objwalk(True))
(((), True),)
>>> tuple(objwalk({}))
()
>>> tuple(objwalk([1,2,3]))
(((0,), 1), ((1,), 2), ((2,), 3))
>>> tuple(objwalk({"http": {"port": 80, "interface": "0.0.0.0"}}))
((('http', 'interface'), '0.0.0.0'), (('http', 'port'), 80))
>>> 

"any" is a strong word and Python is flexible language; I wrote this function to work with container objects that respect the ABCs in the collections module, which mostly cover the usual builtin types and their subclasses. If there’s something significant I missed, I’d be happy to hear about it.

Where Am I?

You are currently viewing the archives for December, 2011 at NIL: .to write(1) ~ help:about.

Follow

Get every new post delivered to your Inbox.

Join 34 other followers