What do you want to see in a concurrency talk…

September 22nd, 2008 § 13 comments

So, for starters, I’ll be doing a 1 hour talk at the PyWorks con­fer­ence in Atlanta on Novem­ber 13.

Fol­low­ing that, I am work­ing on a PyCon tuto­r­ial, and one other non-tutorial talk for PyCon. My cur­rent theme for this round of talks is “Python 2.6, thread­ing and mul­ti­pro­cess­ing concurrency”.

That’s a mouthful.

The PyWorks talk is entitled “Getting Started with Concurrency with MultiProcessing and Threads”. Given it’s a one-hour talk, I’ll probably only cover threads at a cursory level (to give everyone a common understanding) and then delve into multiprocessing features, touching on a basic application as an anchor.

I plan on cov­er­ing the dif­fer­ences, pluses, minuses and “get­ting started” and walk­ing through some amount of the API. Given it’s only an hour, I won’t be able to deep-dive into build­ing a fully fledged appli­ca­tion. I can only hope to give peo­ple enough infor­ma­tion to get started.

Moving on to PyCon, I wanted to expand on this space for a full-blown tutorial: I wanted to cover threads, multiprocessing, pros, cons, organization, APIs, best practices, the GIL, and walk through building actual application(s), although I need to pick a good “showcase” application that people can grok.

Also, in the tuto­r­ial, I was think­ing about going from local con­cur­rency to network-based con­cur­rency (start­ing with the built in mul­ti­pro­cess­ing API) so that peo­ple can under­stand the dif­fer­ences and take those into account.
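
To make the local-versus-network step concrete, here is a minimal sketch using `multiprocessing.connection`, the standard-library layer that sends pickled objects between two endpoints. It is written for current Python 3 rather than 2.6, and the `AUTHKEY` value, the `serve_once` and `remote_sum` helpers, and the loopback address are all illustrative assumptions, not material from the talks. The server runs in a thread of the same process only to keep the example self-contained; in a real networked setup it would live on another machine.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"talk-demo"  # hypothetical shared secret, for illustration only


def serve_once(listener):
    """Accept one connection, receive a list of numbers, send back their sum."""
    with listener.accept() as conn:
        numbers = conn.recv()
        conn.send(sum(numbers))


def remote_sum(address, numbers):
    """Ask the listening side to add up the numbers."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(numbers)
        return conn.recv()


# Port 0 lets the OS pick a free port; listener.address reports the real one.
with Listener(("127.0.0.1", 0), authkey=AUTHKEY) as listener:
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    total = remote_sum(listener.address, [1, 2, 3])
    server.join()
```

Compared with purely local code, the new concerns are the address, the authentication key, and the requirement that everything sent over the connection be picklable.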

The final one, the short talk for PyCon, is going to be “Python 2.6 threading changes and an intro to multiprocessing”.

Given that the final talk is short, it’s going to be an overview of the changes and the new module, plus a walk-through (I hope) of a basic application.

My ques­tion to you, oh inter­webs, is what might you like to see/be able to get from talks in these veins? I can’t go into all of the whys and hows and whens, but I can arm peo­ple with enough infor­ma­tion to get up and running.

I desperately want to get the “most bang for the buck” in these talks. Obviously, due to the nature of slides, I won’t be able to show hundreds of lines of code to illustrate everything, except in the tutorial.

For an example application, I need to make something that will “scale up” from thread-based approaches all the way to distributed-over-the-network approaches.
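
One shape such a scale-up example could take, sketched under assumptions: the workload is a bank of independent checks (the `check_*` functions are hypothetical stand-ins), and the code targets current Python 3 rather than 2.6. The pool class is a parameter, so the same function can be handed a thread pool or a process pool.

```python
from multiprocessing.pool import ThreadPool  # thread-backed pool with the Pool interface


def check_addition():
    """Hypothetical passing check."""
    assert 1 + 1 == 2


def check_broken():
    """Hypothetical failing check, included to show failure reporting."""
    raise AssertionError("deliberate failure")


def run_one(check):
    """Run a single check and report (name, outcome)."""
    try:
        check()
    except AssertionError as exc:
        return check.__name__, "FAIL: %s" % exc
    return check.__name__, "ok"


def run_bank(checks, pool_cls=ThreadPool, workers=4):
    """Spread the checks over `workers` workers from the given pool class."""
    with pool_cls(workers) as pool:
        return dict(pool.map(run_one, checks))


results = run_bank([check_addition, check_broken])
```

Passing `multiprocessing.Pool` as `pool_cls` would run the same bank in separate processes; that call belongs under an `if __name__ == "__main__":` guard, and the checks must be module-level functions so they can be pickled. Going across machines needs a network layer on top, which this sketch does not attempt.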

  • Doug Napoleone

    One idea for the exam­ple program:

    Con­cur­rent test execution.

    You have a bank of tests you need to run before committing, and a 4-core machine.
    Scale that up to something which can be run across multiple machines for a QA cycle.

    Just because there are packages out there which already do this (nose+buildbot, etc.) does not mean that it is not a good example problem.

    It sounds like it meets your needs and pulls from your strengths.

  • http://www.goldb.org Corey

    I won’t be there but can hopefully catch a video or see the slide deck. Sounds very interesting, and this is an area I am very into. I’d especially be interested in hearing about changes to the threading API. I didn’t realize anything changed in 2.6/3.0.

    I am very intrigued with the new mul­ti­pro­cess­ing and will be using it imme­di­ately on a project as soon as the final release of 2.6 goes out :)

  • jnoller

    Oddly enough, I was designing something like this today. Also, Jason P, of nose fame, has been working on a multiprocessing-based nose extension to do this:

    http://code.google.com/p/python-nose/source/bro…
    http://code.google.com/p/python-nose/issues/det…

    I could further build on that; it’s not a bad idea. I’ve also considered doing a web-load-test application, or a data-crunching app à la the Wide Finder project (http://www.tbray.org/ongoing/When/200x/2007/09/…)

  • jnoller

    No prob­lem, I’ll prob­a­bly post every­thing here when it all comes out.

    As for the threading changes in 2.6/3.0, it’s really just PEP 8-ifying some of the names and moving some of the other internals to properties. See: http://bugs.python.org/issue3352

    I’d be interested to hear how you’re using the package; some of what you and I do at our day jobs overlaps, so to speak :)

  • Ben

    I’d be inter­ested in some­thing like Kamaelia… Maybe a map reduce kind of appli­ca­tion built with kamaelia that can be dis­trib­uted over the network?

  • http://blog.tplus1.com Matt Wil­son

    If you’re gonna talk about the GIL, can you explain what Stackless Python is and what it means for concurrent programming?

  • jnoller

    I can mention it briefly, but won’t go into details, given that Stackless still has a GIL.

  • http://edit.kamaelia.org/Developers/ Michael Sparks

    Jesse — please take a look at my Kamaelia talks on slideshare …

    http://www.slideshare.net/kamaelian/slideshows

    … they’re (literally) littered with numerous example systems which are highly concurrent, written in Python and tested. Kamaelia is designed specifically with novices in mind, and the following 2 talks were given at PyCon UK last weekend:

    http://www.slideshare.net/kamaelian/practical-c…

    http://www.slideshare.net/kamaelian/sharing-dat…

    The latter one needs a bit of work based on feedback in the session, but the talks seemed highly accessible. (Indeed, Kamaelia has been designed with accessibility in mind as a key design feature; it’s also the driving long-term reason behind Kamaelia’s existence :-)

    Kamaelia is primarily a message-passing system (generators, threads, processes as units of concurrency), but also includes a software transactional memory implementation for the times when you really do need to share data. (Kamaelia’s real underlying design is “no unconstrained data” and “if you have a piece of data and haven’t given it back/to anyone else, your changes should be safe”.)

    We’re actu­ally in the process of a revamp of the website.

    If you want a selection of applications to talk about, there are examples using network servers, interactive pygame-based applications, digital TV/PVR systems, offline batch transcoders, to name a few… I’ve noticed a tendency for people to relate to very concrete ideas, which is why this time round I chose a simple and direct game as the initial example.

    The key thing which is perhaps different is that the focus in Kamaelia is on making communicating systems useful through a component metaphor, rather than on “let’s use concurrency for concurrency’s sake”, i.e. use a pattern for software construction that happens to (deliberately :-) drop out as naturally highly concurrent.

    Per­haps my most fun hack (10 hours or so) was the Speak And Write app — which uses hand­writ­ing recog­ni­tion and speech syn­the­sis to teach a child to read and write.
    http://edit.kamaelia.org/SpeakAndWrite

    Regard­ing the map-reduce idea — I’d per­son­ally love to see such a beast writ­ten using Kamaelia and would also be more than happy to see it merged onto /trunk …

    Perhaps the only downside of Kamaelia vs your talk is that the multiprocessing module chosen for Python 2.6 doesn’t match the one we’ve been using in Kamaelia. (At some point we’ll flip when we’ve got a chance; it’ll make little difference to user code beyond being one less thing to install :)

    Any­way, you did ask “what do you want to see …” :-)

    What I would hope, though, is that you teach people about the one thing that causes all problems in concurrent systems: unconstrained shared data. If you make sure you don’t have that, life becomes easy, fun and concurrent. If you do have any, though, life often becomes very hard, and painful to debug.

    Anyway, best of luck for your talk. I hope it goes well :-)

  • http://edit.kamaelia.org/Developers/ Michael Sparks

    Hmm. Sorry for all the typos; that’ll teach me for posting responses to things when it’s late! Anyhow, good luck :)

  • jnoller

    I’ve looked at Kamaelia before; it definitely looks like another “swiss army knife” framework, à la Twisted, etc. I’m going to mention all of the frameworks (Kamaelia included) but I want to focus on the more primitive aspects of concurrency (multicore, messages) and how to build something using standard Python.

    Of course, I need a good network layer, so maybe I’ll poke at Kamaelia’s :)

  • http://edit.kamaelia.org/Developers/ Michael Sparks

    OK, cool. Let’s see.

    Tell you what: how about I just package up the network components (no protocols, no pygame, no OpenGL, etc.), along with a minimal set of other useful components, by itself for you? I.e. the absolute minimum useful (but no smaller) for building TCP & UDP (inc. multicast) based servers & clients?

    I.e. something that will just install quickly and cleanly by itself, with no external dependencies?

    That should be pretty tiny and still useful.

    Most of Kamaelia’s components have no dependency on each other, so this would be fairly easy and quick to do. (It’s perhaps better to think of Kamaelia as a toy box or tool chest or Radio Shack from which you can just take out the things you want, rather than as a swiss army knife, IMO.) All the components depend on Axon, but then that’s not too surprising: Axon is the component framework. But even that is small enough to be comprehensible.

    Heck, I’ll spec­u­la­tively do it — it’ll be quick to do and if you want to use it, feel free. :-)

  • http://medicinebrain.com Tim Nash

    My biggest hope for the multiprocessing library is that it will allow me to continue working in Python rather than switch to Erlang. Erlang has a great showcase in CouchDB, the FOSS project most like BigTable. Why don’t you start a CouchDB clone built in Python using the new libraries? Or if a partially completed project won’t do, please show a map-reduce example (e.g. distributed grep).
    –Thanks!
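
For the threading renames mentioned in the comments above (PEP 8-style names, with some internals exposed as properties), here is a short sketch of the newer spellings next to the older ones. It assumes current Python 3, where the older camelCase forms are deprecated or removed; the `worker` function and the thread name are placeholders.

```python
import threading


def worker():
    """Placeholder task; a real worker would do something useful."""


t = threading.Thread(target=worker, name="worker-1")
t.daemon = True          # property; older spelling: t.setDaemon(True)
t.start()
t.join()

summary = (
    t.name,              # property; older spelling: t.getName()
    t.daemon,            # property; older spelling: t.isDaemon()
    t.is_alive(),        # older spelling: t.isAlive()
)
main_name = threading.current_thread().name   # older: threading.currentThread()
running = threading.active_count()            # older: threading.activeCount()
```

After the join, `summary` records the thread's name, its daemon flag, and the fact that it is no longer alive.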


You are currently reading What do you want to see in a concurrency talk… at jessenoller.com.
