Python: Does it scale?
And no, I'm not talking about threads, concurrency or inter-process messaging. I'm talking about teams, products and organizations.
For example, a friend and coworker of mine looked at me a few days ago (weeks? I have no sense of time anymore) and asked me the simple question: "Do you think you could build the product we have today, in Python?"
This stems from a conversation (as usual) about threads/types/static typing and social enforcement. It's interesting for a few reasons - for one, we're not talking about a web app. We're also not talking about a GUI or direct-to-end-user application per se. We're talking about a highly scalable, distributed archiving system.
A large-scale, highly distributed storage system with one goal: it can never, ever lose data. So the better question is: Is Python, as a language, appropriate for distributed, fault-tolerant, mission-critical, zero-data-loss systems?
I like to think that as a language (and platform) it can be used for bigger and better things, especially with things like Stackless and Jython becoming more and more mature (never just focus on CPython).
I know of what (I think) are the "top three" projects [1]:
- EVE Online (video game, uses stackless)
- YouTube (web application, potentially large and complex backend)
- ITA (flights/ticketing system)
- Tabblo
What other "large scale" systems do you know of? If you were facing the possibility of someone handing you a few million dollars for your next big idea - the next big disruptive technology - would you "bet the farm" on Python?
In the end, it's a question of technical merit. Does Python, as a language, lend itself to the problem domain? What, if any, drawbacks would it have in that domain? Are there third-party add-ons and projects which could help you?
I think - if I get the choice - the next big project I want to work on will be done in Python (should it fit the problem domain). I'd like to see my (language of choice) scale with my own eyes.
Maybe one day we'll see a Python implementation of the mighty Google Filesystem, or BigTable - although in the latter case, Python bindings to a C library might be appropriate.
[1] More here: Python::Success


August 30th, 2007 at 10:01 am
The YouTube way of dealing with Python involves throwing more hardware at it each time Python chokes on something.
August 30th, 2007 at 10:20 am
According to what I know, that is not necessarily the case. See youtube-scalability and youtube-architecture for examples. Adding servers is a “big duh” when it comes to web farm scaling (any web developer working on a big app knows about load balancers and scaling the farm). I was talking about the back end, the BigTable/etc. stuff: how do they access the disks, are the farms that hold the video clustered, how do they manage and access the database? In many cases the back end has to scale just as well as (or better than) the front end.
August 30th, 2007 at 11:04 am
Jesse, I guess I have a different view on ‘the big thing’. I worked on one of two major projects at a previous job. I have to tell you that in my experience conceptualization was a bigger problem than scalability. You only have a finite amount of time to get the idea from paper napkin to prototype. My tools then were C++ and Scheme; Python had not even broken onto the scene.
Now the landscape is so different. I would have leaped at Python back then if I had had access to it. One of Python’s strengths I have used time and again is its ability to glue pieces together. If profiling a Python app indicates that there are delays, then most likely I can get someone on staff to rewrite the slow parts in native C or ASM so they run more efficiently. At the scale of something like YouTube I don’t think languages per se are the issue.
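To make the "profile first, then drop to C" workflow concrete, here is a minimal sketch using the standard library's cProfile and pstats modules. The checksum and archive functions are hypothetical stand-ins for real application code, invented purely for illustration; the point is only that the profiler report names the function worth rewriting.

```python
import cProfile
import io
import pstats


def checksum(chunks):
    """Hypothetical hot loop: a naive byte-by-byte checksum."""
    total = 0
    for chunk in chunks:
        for byte in chunk:
            total = (total + byte) % 65521
    return total


def archive(chunks):
    """Hypothetical top-level operation that calls the hot loop."""
    return checksum(chunks)


def profile_report(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Made-up workload: 20 chunks of 16 KiB each.
chunks = [bytes(range(256)) * 64 for _ in range(20)]
result, report = profile_report(archive, chunks)
print(report)
```

If the report shows a function like checksum dominating the run time, that single function is the candidate for a C extension, and the rest of the program can stay in Python.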
August 30th, 2007 at 11:12 am
I completely agree: that’s why Python is so attractive in many cases, the speed at which you can go from “zero to hero” in it (especially in light of the limited burn startups have). The concept of “build it fast in Python, optimize when you have to” is key in this kind of discussion.
Maybe deep down inside that’s the question/point that counts the most: what will let me get this done the fastest to “prove” out an idea/concept.
But when you’re aiming at “something big” from position 0 (let’s say, oh, a distributed filesystem), the time you spend prototyping can often chew into your cash and limited time, and then you run the risk of finding yourself having to re-implement in the “proper” domain language later (let’s say, a 60% or more “optimization” re-write in Java or C++).
The second half of that is: Does Python scale in teams (i.e., can duck typing scale)?
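For readers unsure what is at stake in the duck-typing question, here is a minimal sketch. It uses typing.Protocol, which only exists in modern Python (3.8 and later) and so is not something available when this was written. BlobStore, MemoryStore and replicate are hypothetical names invented for the example. The idea is that a team can write down the "shape" it expects without forcing every implementation to inherit from a common base class.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class BlobStore(Protocol):
    """Hypothetical interface: anything with put() and get() qualifies."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class MemoryStore:
    """Never inherits from BlobStore; it matches by shape alone."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def replicate(key: str, data: bytes, stores: List[BlobStore]) -> int:
    """Write data to every store and return the number of copies made."""
    for store in stores:
        # The runtime check only confirms the methods exist, not their signatures.
        if not isinstance(store, BlobStore):
            raise TypeError(f"{type(store).__name__} lacks put()/get()")
        store.put(key, data)
    return len(stores)


stores = [MemoryStore(), MemoryStore()]
copies = replicate("clip-001", b"payload", stores)
print(copies, stores[1].get("clip-001"))
```

Whether this kind of lightweight, optional contract is enough for a large team, compared with compiler-enforced static typing, is exactly the open question.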
August 30th, 2007 at 11:52 am
You might consider the case of MojoNation, which begat BitTorrent, HiveCache, Mnet, and Allmydata-tahoe. It created an architecture for a large-scale, fault-tolerant persistent distributed storage system similar to GoogleFileSystem (albeit a couple of years before GFS existed.) The actual implementation did not meet all of its goals, but because it was created in Python it was easier for the follow-on projects to pick up the pieces and re-work them in more application-specific ways to meet various facets of the original goal (e.g. file sharing for BT, enterprise backups for HiveCache, etc.) Python lets you easily take your first effort, break it down, and re-purpose existing code for a new set of constraints.
Python has several advantages that you touch upon briefly but that need to be repeated. It makes prototyping easy, which is a big win. You may think that prototyping burns cash/time before you re-write it in a “proper” language, but there is nothing about a project like this that necessitates it being written in C/C++ or Java. Once you are dealing with distributed storage across a WAN boundary you will discover that managing network latency is the big bottleneck (unlike a LAN filesystem, where disk latency and component optimization can become an issue). Therefore you will be writing your prototypes in your “shipping” language, but will have more flexibility while you are building your product. The advantage of this cannot be overstated for a complex problem like a large-scale distributed system. You will re-write most of the code eventually, probably several times.
If you really think hard about your problem domain, the optimization that you will need to do will be more about process and algorithm optimization than about whether or not a particular loop is running as fast as possible. For this sort of problem you will discover that no particular language is going to offer you anything more than a 5-10% speedup in actual execution time of any particular component, so what you need to do is optimize programmer time. This is a _huge_ task.
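A back-of-the-envelope sketch of that argument follows. Every number in it (80 ms round trip, 1 ms of CPU per chunk, 10,000 chunks, batches of 100) is a made-up assumption for illustration, not a measurement from any real system, and transfer_time is a hypothetical helper. It compares a 10% faster inner loop against an algorithmic change that batches chunks to cut the number of WAN round trips.

```python
def transfer_time(num_chunks, rtt_s, cpu_s_per_chunk, batch_size=1):
    """Estimated seconds to store num_chunks, paying one round trip per batch."""
    round_trips = -(-num_chunks // batch_size)  # ceiling division
    return round_trips * rtt_s + num_chunks * cpu_s_per_chunk


# Assumed figures: 10,000 chunks, 80 ms WAN round trip, 1 ms CPU per chunk.
baseline = transfer_time(10_000, rtt_s=0.080, cpu_s_per_chunk=0.001)

# A "faster language" that shaves 10% off the per-chunk CPU cost.
faster_loop = transfer_time(10_000, rtt_s=0.080, cpu_s_per_chunk=0.0009)

# Same CPU cost as the baseline, but 100 chunks sent per round trip.
batched = transfer_time(10_000, rtt_s=0.080, cpu_s_per_chunk=0.001, batch_size=100)

for label, seconds in [("baseline", baseline),
                       ("10% faster loop", faster_loop),
                       ("batched round trips", batched)]:
    print(f"{label:>20}: {seconds:8.1f} s")
```

Under these assumptions the faster loop saves roughly one second out of about 810, while batching brings the total down to roughly 18 seconds. The toy model ignores bandwidth and failures, but it shows why the choice of protocol tends to dwarf the choice of language here.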
One other point I really hate to make here is that you are going to want to look at Twisted instead of Stackless. I really love Stackless and prefer it over Twisted whenever given the choice, but there is a sizeable amount of existing code in this particular area that is already written in Twisted and you will save yourself some time by choosing that framework over Stackless.
August 30th, 2007 at 12:20 pm
It’s not quite live yet, but Chandler is written in Python and must qualify as a large, complex application with both desktop and back-end components.
August 30th, 2007 at 12:33 pm
Wow, first off - thanks for taking the time to post that, here are some thoughts:
Overall, your points are very well made. Also note that I am not talking about a planned or current project. I already work (day to day) on a distributed filesystem/archiving system, and have for some time. The question I was (rather rhetorically) trying to answer is “could Python ‘scale’ up to these requirements?”
Also, one of these days I am going to be able to actually sit down and write something in twisted, damnit.
August 30th, 2007 at 12:36 pm
I want to see the end result of Chandler. I’ve dealt with some of the small components, but I want to see the much larger finished product first.
August 30th, 2007 at 2:51 pm
In terms of scaling up a project for developers (vs. scaling it up for speed/capacity) you can also take a look at TinyERP.com. I’m amazed at the functionality / code size in that project.
Not having to write code is an advantage when scaling up teams.
August 30th, 2007 at 4:51 pm
Mercurial is a distributed version control system that is going great guns. Sun has chosen it for OpenSolaris and other code bases.
- Paddy.
September 8th, 2007 at 4:56 pm
Aching for Shane Hathaway to unveil Bit Mountain:
“The Bit Mountain Research Project”
September 9th, 2007 at 8:59 am
Interesting. I had not heard of this - but it hasn’t been updated since 2006 :(