Have GIL: Want Benchmarks.
So, with the recent furor over the GIL, one of the things GvR asked for was for someone to provide some benchmark numbers of Single v. Multi v. Other in various tests.
I started working on this a night or so ago, and a lot of things have fallen out of it - not the least of which is my burning desire to really look at the GIL, threading in python and the ecosystem of alternatives.
A thing of note: I've spawned an initial Google Code project to explore the alternatives/benchmarks. Read about it here.
The first test script I wrote was a simple one: take a function which calculates a number of Fibonacci numbers and run it in a loop with larger and larger sets. I then:
- Ran it twice, synchronously.
- Ran it twice, once in each thread (2 threads).
- Ran it twice, but used the processing module to spawn two processes, running it once in each.
- Ran it twice, but used the parallel python module to spawn a pool of workers for 2 processors, running it once in each.
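For anyone who wants to play along, here is a minimal sketch of the first three runs in the list above. It is not the actual test script: it assumes the standard library `multiprocessing` module (the descendant of the processing module) in place of processing itself, it leaves out the parallel python run, and the names `fib`, `workload`, `timed` and the input sizes are all made up for illustration.

```python
import multiprocessing
import threading
import time


def fib(n):
    """Return the n-th Fibonacci number (iterative, pure CPU work)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def workload(sizes):
    """Compute fib for each size in turn (hypothetical stand-in for the test loop)."""
    return [fib(n) for n in sizes]


def run_sequential(sizes):
    # Run the workload twice, one after the other.
    workload(sizes)
    workload(sizes)


def run_threaded(sizes):
    # Run the workload twice, once in each of two threads.
    threads = [threading.Thread(target=workload, args=(sizes,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_processes(sizes):
    # Run the workload twice, once in each of two separate processes.
    procs = [multiprocessing.Process(target=workload, args=(sizes,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()


def timed(func, sizes):
    """Return the wall-clock seconds taken by func(sizes)."""
    start = time.perf_counter()
    func(sizes)
    return time.perf_counter() - start


if __name__ == "__main__":
    sizes = [1000, 2000, 4000]  # assumed sizes, not the ones from the real test
    for name, func in [
        ("sequential", run_sequential),
        ("2 threads", run_threaded),
        ("2 processes", run_processes),
    ]:
        print(f"{name}: {timed(func, sizes):.4f}s")
```

With sizes this small, the cost of starting threads and processes will likely dominate the timings, so the sizes need to grow before the numbers say anything about the GIL.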
Yes. I know that fib calculations are a processor-only activity, and that's all I wanted to do at first. I want to add tests for file access and network access, and I want to run the tests (as allowed by the extension modules) in PyPy, Jython and IronPython2.
In the long run - I want to make something of a test suite, not only to demo the various methods/alternatives but also to work towards a future where something like parallel python or the processing module can make it into the standard python library.
Not to mention, it should help those looking for alternatives to basic threading in python (or more information on threading in python). Things that have to be taken into account:
- Shared vs. unshared memory and state.
- Ease of use/API
- Amount of Fun.
- Can the solution spread across machines?
- Is it, or can it be, useful for web applications? (i.e: can it be used to spread load across processors)
If you have a test or a suggestion - drop me an email or post a comment. Heck, this might just turn out to be a learning experience for me or it could turn out to be something bigger.
Update: Wow, you guys are awesome. The private and public comments I've gotten on this have left me with a lot to think about - and hell - a lot to learn. I'm still in the toying/planning phase, so please keep feeding me information. When I get more time this weekend (if I get time) I will try to put up something cohesive, including a subversion repository and maybe a wiki.
Update 2: Hello Reddit.
- Yes, this is a gratuitous picture of my daughter
- Yes, Daniel Watkins has started this process as well


September 12th, 2007 at 7:07 pm
Interestingly enough, I’ve just implemented a GUI-based file downloader using both threading and processing on OS X on a G4 (single core). Using urllib2 to download (reading chunks of 1MB at a time) I found that the threading implementation played better with the GUI (with some semi-icky hacks) than the processing approach.
Both approaches had the download worker writing directly to disk, so the traffic back from the worker to the parent was minimal (a “size received” message after each read()).
This was downloading from a LAN server, so the download speed was roughly 10MB/s.
The semi-icky hack in the threading approach: I had to introduce a semaphore in the worker main loop which only allowed the worker to “fire” once every two GUI update loops; otherwise the GUI was starved for CPU.
So for me it wasn’t totally about how to make things go the fastest, but also about how to best share the CPU in an interactive application.
September 13th, 2007 at 4:49 am
Have read blog post: want your results.
September 13th, 2007 at 8:40 am
I will be posting the results + code once I get all my ducks in a row. I also want to pass it by a few people to verify I didn’t do something monumentally screwy.
September 13th, 2007 at 9:37 am
I agree with you that alternatives must be measured and tested.
Check out this other blog post about testing something similar with what you did:
http://blogs.warwick.ac.uk/dwatkins/entry/benchmarking_parallel_python_1_2/
Especially, have a look at what Mr. Guido van Rossum said. (9th comment)
I’ve also sent you an email elaborating more on my thoughts about this whole “thinking, testing and choosing” an alternative “parallel/concurrency” API for python.
Best regards,
Miguel Sousa Filipe
miguel dot-thingie0 filipe at-thingie1 gmail dot-thingie2 com
September 13th, 2007 at 9:40 am
I got the email and replied. I also linked to Watkins’ blog post in mine, and I will be following up with him as soon as I have a moment.
September 13th, 2007 at 3:42 pm
Thanks for taking the time to verify results. This is the kind of thing you (or me, or anyone) could screw up really easily. And it’s important we get solid results on this issue.
September 13th, 2007 at 4:13 pm
Please include Pyro in your tests: http://pyro.sourceforge.net/
Since Parallel Python uses threading inside the module, it really isn’t a pure non-threading solution. If you set up a similar test with Pyro using two worker processes on one machine, and a process on another machine to control it all, that would completely remove threading from the equation. It may turn out to have a minimal difference from Parallel Python but that is the kind of thing we want to learn from benchmarking.
Also, I think that the benchmark should be structured so that the master thread/process does a fair amount of communication, i.e. to supply the parameters for each loop. That way you can run the test in two ways, first with the Fibonacci computations, and second with a null computation. The second style of test will only measure the overhead. Theoretically, the variance in overhead would be the only difference between the Fibonacci runs, but, as always, you need to run the benchmarks to learn this.
Once you have it all running, please distribute it so that people can try it on different CPU and OS types. Benchmarks on a single CPU or OS type can often be deceptive.
September 14th, 2007 at 2:16 pm
I’m really glad you’re doing this. Python definitely needs good libraries for running multiple processes on different cpus, and your benchmarking efforts will definitely help this happen.
September 16th, 2007 at 7:06 pm
[...] last weeks tempest in a blogpot, and my subsequent post “Have GIL: Want Benchmarks” I’ve been doing a lot of reading, planning and discussion with [...]