Suffering from insomnia this morning, I decided to delve into my python-list archive box in Gmail. I normally only scan it once or twice a month because of the signal-to-noise ratio. A post by James Mills caught my eye:
I’ve noticed over the past few weeks lots of questions
asked about multi-processing (including myself).

For those of you new to multi-processing, perhaps this
thread may help you. Some things I want to start off
with to point out are:

“multiprocessing will not always help you get things done faster.”
“be aware of I/O bound applications vs. CPU bound”
“multiple CPUs (cores) can compute multiple concurrent expressions -
not read 2 files concurrently”
“in some cases, you may be after distributed processing rather than
multi or parallel processing”

cheers
James
James is very correct:
James is quite correct, and maybe I need to amend the multiprocessing
documentation to reflect this fact.

While distributed programming and parallel programming may cross paths
in a lot of problems/applications, you have to know when to use one
versus the other. Multiprocessing only provides some basic primitives
to help you get started with distributed programming; it is not its
primary focus, nor is it a complete solution for distributed
applications.

That being said, there is no reason why you could not use it in
conjunction with something like Kamaelia, pyro, $ipc mechanism/etc.

Ultimately, it’s a tool in your toolbox, and you have to judge and
experiment to see which tool is best applied to your problem. In my
own work/code, I use both processes *and* threads; one works better
than the other depending on the problem.

For example, a web testing tool. This is something that needs to
generate hundreds of thousands of HTTP requests: not a problem you
want to use multiprocessing for given that A> It’s primarily I/O bound
and B> You can generate that many threads on a single machine.
However, if I wanted to, say, generate hundreds of threads across
multiple machines, I would (and do) use multiprocessing + paramiko to
construct a grid of machines and coordinate work.

That all being said: multiprocessing isn’t set in stone. There’s room
for improvement in the docs, tests and code, and all patches are
welcome.

–jesse
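To make the threads-versus-processes point in that reply a bit more concrete, here is a minimal sketch. It is not the web testing tool mentioned above; the function names (`fake_http_request`, `sum_of_squares`, `run_io_bound`, `run_cpu_bound`) are made up for illustration, and the "HTTP request" is just a `time.sleep` standing in for network latency. The idea is only that a waiting thread releases the GIL, so threads are a reasonable fit for I/O-bound work, while pure computation tends to benefit from separate processes.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool


def fake_http_request(request_id):
    """Hypothetical stand-in for an I/O-bound call: the thread only waits."""
    time.sleep(0.05)  # simulated network latency (an assumed value)
    return request_id


def sum_of_squares(n):
    """Hypothetical stand-in for a CPU-bound task: pure computation."""
    return sum(i * i for i in range(n))


def run_io_bound(num_requests, num_threads):
    """Run the simulated requests on a thread pool and return results in order."""
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        return list(executor.map(fake_http_request, range(num_requests)))


def run_cpu_bound(inputs, num_processes):
    """Run the computation on a process pool; each worker has its own GIL."""
    with Pool(processes=num_processes) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    start = time.perf_counter()
    run_io_bound(num_requests=20, num_threads=20)
    print(f"20 simulated requests on 20 threads: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    run_cpu_bound([200_000] * 4, num_processes=2)
    print(f"4 CPU-bound tasks on 2 processes: {time.perf_counter() - start:.2f}s")
```

The timings will vary by machine, so treat them as something to experiment with rather than a benchmark. Swapping the pools around (processes for the sleeps, threads for the arithmetic) is a quick way to see which tool fits which problem.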
Like any tool, library, or even language, you have to know when to switch one tool for another. For example, it doesn’t make sense for anyone to use Python 100% of the time; maybe you have some math routine that simply makes more sense written in C (say, a crypto function). Heck, even Java is better suited for some tasks (like making really long lines in source files!).
Yeah, I wrote PEP 371, but even I am not blind to the usefulness of things like Actors, Threads, Coroutines, Stackless Python, etc. There is no single solution to anything; the most we can ever hope for is to have a rich toolbox from which to pick the proper tools.
-
http://Thirdpipe.com JohnMc