Tuesday, February 14, 2012

Python Concurrency Spreadsheet

I'm a fan of Nicholas Piël's Asynchronous Servers in Python blog post. In a similar vein, my buddy Shailen Tuli and I put together the following spreadsheet. Feel free to view it on Google Docs and make corrections.



A few notes:

It is not my goal to use this spreadsheet to pick favorites. Rather, I'm trying to use this spreadsheet to point out differences among the different approaches.

I know very little about DieselWeb, MultiTask, FriendlyFlow, Weightless, Fibra, and Cogen. All I know is that Nicholas Piël's pointed them out as generator-based libraries.

Some of these libraries support multiple approaches at the same time. For instance, Twisted lets you use a mix of callbacks, generators, and threads. Similarly, Tornado lets you use a mix of callbacks and threads. However, it's important to point out the compromises involved. For instance, if you're using Tornado, you can't have 5000 concurrent clients to a WSGI server; instead you have to switch over to the asynchronous callback API.

When I say things like "Can handle 5000 clients?", I don't mean that each of those requests is actively being served. There's only so much CPU to go around! Rather, I mean you can have 5000 clients that are each in various stages of completion, and the server as a whole is mostly waiting on IO.

You can also read my other blog posts Python: Concurrency and Python: Asynchronous Networking APIs and MySQL.

No comments: