Python: Does it scale?

August 30th, 2007 § 12 comments

And no, I’m not talk­ing about threads, con­cur­rency or inter-process mes­sag­ing. I’m talk­ing about teams, prod­ucts and orga­ni­za­tions.

For example, a friend and coworker of mine looked at me a few days ago (weeks? I have no sense of time anymore) and asked me the simple question: “Do you think you could build the product we have today in Python?”

This stems from a conversation (as usual) about threads, types, static typing and social enforcement. It’s interesting for a few reasons. For one, we’re not talking about a web app. We’re also not talking about a GUI or direct-to-end-user application per se. We’re talking about a highly scalable, distributed archiving system.

A large-scale, highly distributed storage system with one goal: it can never, ever lose data. So the better question is: Is Python, as a language, appropriate for distributed, fault-tolerant, mission-critical, zero-data-loss systems?

I like to think that as a language (and platform) it can be used for bigger and better things, especially with things like Stackless and Jython becoming more and more mature (never just focus on CPython).

I know of what I think are the “top three” projects:

  • EVE Online (video game, uses stackless)
  • YouTube (web appli­ca­tion, poten­tially large and com­plex backend)
  • ITA (flights/ticketing system)
  • Tab­blo

What other “large scale” systems do you know of? If facing down the possibility of someone handing you a few million dollars for your next big idea (the next big disruptive technology), would you “bet the farm” on Python?

In the end, it’s a question of technical merit. Does Python, as a language, lend itself to the problem domain? What, if any, drawbacks would it have in that domain? Are there third-party add-ons and projects which could help you?

I think — if I get the choice — the next big project I want to work on will be done in Python (should it fit the prob­lem domain). I’d like to see my (lan­guage of choice) scale with my own eyes.

Maybe one day we’ll see a Python implementation of the mighty Google File System, or BigTable, although in the latter case, Python bindings to a C library might be appropriate.
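To make the “Python bindings to a C library” idea concrete, here is a minimal sketch using the standard library’s ctypes module. It assumes a POSIX system (Linux or macOS) and binds the C runtime’s strlen purely for illustration; the wrapper name c_strlen is made up for this example and has nothing to do with any real storage library.

```python
import ctypes
import ctypes.util

# Locate the platform's C runtime. If it cannot be found by name, fall back
# to the symbols already loaded into this process (an assumption that holds
# on Linux and macOS, not on Windows).
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: length of a NUL-terminated byte string, computed in C."""
    return libc.strlen(data)


print(c_strlen(b"archive"))  # prints 7
```

The same pattern (load the shared library, declare argument and return types, wrap the call in a small Python function) is how a thin binding to a larger C library would typically start, with the heavy lifting staying in C.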

  1. More here: Python::Success


  • -

    The YouTube way of dealing with Python involves throwing more hardware at it each time Python chokes on something.

  • http://www.jessenoller.com jesse

    According to what I know, that is not necessarily the case. See: youtube-scalability and youtube-architecture for examples. Adding servers is a “big duh” when it comes to web farm scaling (any web developer working on a big app knows about load balancers and scaling the farm). I was talking about the back end, the BigTable/etc. stuff: how do they access the disks, are the farms that hold the video clustered, how do they manage and access the database? The back end has to scale just as well as (or better than) the front end in many cases.

  • JohnMc

    Jesse, guess I have a different view on “the big thing”. I worked on one of two major projects back in prior employment. I have to tell you that in my experience conceptualization was a bigger problem than scalability. You only have a finite amount of time to get the idea from paper napkin to prototype. My tools then were C++ and Scheme, Python having not even broken onto the scene.

    Now the land­scape is so dif­fer­ent. I would have leaped at Python back then if I had access to it. One of Python’s strengths I have used time and again is its abil­ity to glue pieces together. If pro­fil­ing a python app indi­cates that there are delays, then most likely I can get some­one on staff to write the code to use the native C or ASM code to drive it more effi­ciently. At the scale of some­thing like a YouTube I don’t think lan­guages per se are the issue.

  • http://www.jessenoller.com jesse

    I completely agree: that’s why Python is so attractive in many cases, the speed at which you can go from “zero to hero” in it (especially in light of the limited burn startups have). The concept of “build it fast in Python: optimize when you have to” is key in this kind of discussion.

    Maybe deep down inside that’s the question/point that counts the most: what will let me get this done the fastest to “prove” out an idea/concept.

    But when you’re aiming at “something big” from position 0 (let’s say, oh, a distributed filesystem), often the time you spend prototyping can chew into your cash and limited time, and then you run the risk of finding yourself having to re-implement in the “proper” domain language later (let’s say, a 60% or more “optimization” re-write in Java or C++).

    The second half of that is: Does Python scale in teams (i.e., can duck typing scale)?

  • evgen

    You might con­sider the case of MojoNa­tion, which begat Bit­Tor­rent, Hive­Cache, Mnet, and Allmydata-tahoe. It cre­ated an archi­tec­ture for a large-scale, fault-tolerant per­sis­tent dis­trib­uted stor­age sys­tem sim­i­lar to Google­FileSys­tem (albeit a cou­ple of years before GFS existed.) The actual imple­men­ta­tion did not meet all of its goals, but because it was cre­ated in Python it was eas­ier for the follow-on projects to pick up the pieces and re-work them in more application-specific ways to meet var­i­ous facets of the orig­i­nal goal (e.g. file shar­ing for BT, enter­prise back­ups for Hive­Cache, etc.) Python lets you eas­ily take your first effort, break it down, and re-purpose exist­ing code for a new set of constraints.

    Python has several advantages that you touch upon briefly but that bear repeating. It makes prototyping easy, which is a big win. You may think that it burns cash/time before you re-write it in a “proper” language, but there is nothing about a project like this that necessitates it being written in C/C++ or Java. Once you are dealing with distributed storage across a WAN boundary you will discover that managing network latency is the big bottleneck (unlike a LAN filesystem where disk latency and component optimization can become an issue). Therefore you will be writing your prototypes in your “shipping” language, but will have more flexibility while you are building your product. The advantage of this cannot be overstated for a complex problem like a large-scale distributed system. You will re-write most of the code eventually, probably several times.

    If you really think hard about your problem domain, the optimization that you will need to do will be more about process and algorithm optimization than about whether or not a particular loop is running as fast as possible. For this sort of a problem you will discover that no particular language is going to offer you anything more than a 5–10% speedup in actual execution time of any particular component, so what you need to do is optimize programmer time. This is a _huge_ task.

    One other point I really hate to make here is that you are going to want to look at Twisted instead of Stack­less. I really love Stack­less and pre­fer it over Twisted when­ever given the choice, but there is a size­able amount of exist­ing code in this par­tic­u­lar area that is already writ­ten in Twisted and you will save your­self some time by choos­ing that frame­work over Stackless.

  • Carl

    It’s not quite live yet, but Chandler is written in Python and must qualify as a large, complex application with both desktop and back-end components.

  • http://www.jessenoller.com jesse

    Wow, first off — thanks for tak­ing the time to post that, here are some thoughts:

    • I was going to men­tion Allmydata-tahoe, and the other like-kin, but I haven’t had a chance to dig into tahoe yet, it’s prob­a­bly the clos­est kin to the sys­tem I’ve dealt with.
    • The quote “Python lets you eas­ily take your first effort, break it down, and re-purpose exist­ing code for a new set of con­straints.” is an excel­lent point.
    • As for re-writing, I did not mean to insinuate “you would always have to rewrite” (in fact, I would always like the shortest path from prototype to production). I do know about the LAN vs. WAN latency issues in distributed systems. Depending on the system you’re dealing with, though, a disk/network bottleneck is admittedly difficult to reach. Frequently you spend too much time elsewhere in the system layer.
    • I have had mul­ti­ple expe­ri­ences with “You will re-write most of the code even­tu­ally, prob­a­bly sev­eral times.” and you’re dead-right.
    • Is there a 6 degrees of Twisted game out there? :)

    Over­all, your points are very well made — also note, I am not talk­ing about a planned or cur­rent project. I already work (day to day) on a dis­trib­uted filesystem/archiving sys­tem, and have for some time. The ques­tion (rather rhetor­i­cally) I was try­ing to answer is the one of “could Python “scale” up to these requirements”.

    Also, one of these days I am going to be able to actually sit down and write something in Twisted, damnit.

  • http://www.jessenoller.com jesse

    I want to see the end result of Chandler. I’ve dealt with some of the small components, but I want to see the much larger finished end product first.

  • http://www.AppropriateSolutions.com Ray

    In terms of scal­ing up a project for devel­op­ers (vs. scal­ing it up for speed/capacity) you can also take a look at TinyERP.com. I’m amazed at the func­tion­al­ity / code size in that project.

    Not hav­ing to write code is an advan­tage of scal­ing up teams.

  • http://paddy3118.blogspot.com Paddy3118

    Mercurial is a distributed version control system that is going great guns. Sun has chosen it for OpenSolaris and other code bases.

    - Paddy.

  • J Esteves

    “A large-scale, highly distributed storage system with one goal: it can never, ever lose data. So the better question is: Is Python, as a language, appropriate for distributed, fault-tolerant, mission-critical, zero-data-loss systems?”

    Aching for Shane Hath­away to unveil Bit Moun­tain:

    “The Bit Moun­tain Research Project”

  • http://www.jessenoller.com jesse

    Inter­est­ing. I had not heard of this — but it hasn’t been updated since 2006 :(


You are currently reading Python: Does it scale? at jessenoller.com.
