I saw this on the 'tubes today: Concurrency with Python, Twisted, and Flex by Bruce Eckel. Ever since I read it, I've gotten that tickle again to tackle Twisted. He offers a pretty elegant solution to the problem of chunking up the work and spreading it across CPUs.
The draw for me to his solution is the asynchronous way in which the results are dispatched and captured (I'm going to be doing a lot of async work soon). It's a pretty good mini-intro into the semi-impenetrable fort of Twisted.
Personally, I would have probably approached this using the processing module, à la the following (note: I assume access to Bruce's methods):
from processing import Pool, TimeoutError

pool = Pool(detectCPUs())

# ... skipping building the worklist; in Bruce's case it's cores * steps
TASKS = [step1, step2, step3]

results = [pool.apply_async(t) for t in TASKS]
imap_it = pool.imap(apply, TASKS)                      # calls each task, in order
imap_unordered_it = pool.imap_unordered(apply, TASKS)  # results as they finish

print 'Ordered results using pool.apply_async():'
for r in results:
    print '\t', r.get()
print
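For anyone trying this today: the processing module was folded into the standard library as multiprocessing (in Python 2.6), with essentially the same Pool API. Here's a self-contained sketch of the same idea in modern Python; the step1/step2/step3 functions and the run_task helper are made-up stand-ins for Bruce's per-chunk work, not anything from his article.

```python
# Sketch of the Pool approach using multiprocessing, the stdlib
# descendant of the processing module. The step functions are
# hypothetical stand-ins for the real per-chunk work.
from multiprocessing import Pool, cpu_count

def step1():
    return 'step1 done'

def step2():
    return 'step2 done'

def step3():
    return 'step3 done'

def run_task(task):
    # Helper so imap-style calls can invoke each task function for us.
    return task()

if __name__ == '__main__':
    TASKS = [step1, step2, step3]
    pool = Pool(cpu_count())

    # apply_async returns AsyncResult handles; .get() blocks per result,
    # so this loop prints results in submission order.
    results = [pool.apply_async(t) for t in TASKS]
    print('Ordered results using pool.apply_async():')
    for r in results:
        print('\t' + r.get())

    # imap_unordered yields results as workers finish them, which is
    # the closest match to Twisted's "handle it when it arrives" feel.
    for out in pool.imap_unordered(run_task, TASKS):
        print(out)

    pool.close()
    pool.join()
```

The apply_async/imap_unordered split is the interesting design choice: the former gives you a handle per task, the latter a single stream of whatever finishes next.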
Note, of course, that I lifted the above right from the documentation examples for the module. Yes, I know it lacks the network-ability that Bruce's Twisted-based solution has innately (which does make his solution quite nice). You could do the same with the processing Client/Listeners in theory, but Twisted is doing all of the connection management for him under the covers.
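To make the Client/Listener point concrete, here's a minimal sketch using multiprocessing.connection, the stdlib home of those same primitives. Everything here (the authkey value, the serve_one helper, the "finished ..." reply) is invented for illustration; the real work of managing many connections is exactly what you'd have to build yourself, and what Twisted handles for free.

```python
# Minimal Client/Listener round trip with multiprocessing.connection.
# The helper, authkey, and messages are illustrative, not from Bruce's code.
from multiprocessing.connection import Listener, Client
import threading

AUTHKEY = b'not-so-secret'  # shared secret; value is made up

def serve_one(listener):
    # Accept a single connection, read one task name, acknowledge it.
    with listener.accept() as conn:
        conn.send('finished %s' % conn.recv())

if __name__ == '__main__':
    # Bind first (port 0 asks the OS for a free port), then accept in a
    # thread so the client below can't race the bind.
    listener = Listener(('localhost', 0), authkey=AUTHKEY)
    server = threading.Thread(target=serve_one, args=(listener,))
    server.start()

    with Client(listener.address, authkey=AUTHKEY) as conn:
        conn.send('step1')
        print(conn.recv())  # prints: finished step1

    server.join()
    listener.close()
```

That's one connection, serviced once; multiplexing many of these without blocking is the part where a reactor like Twisted's starts earning its keep.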
Dangit, Bruce gave me the Twisted itch again.
Later I'll do a more complete solution with the processing module, and compare the speeds of the two against one another. I prefer the single-module solution over the twisted.* approach, but I can be swayed. :)