Help needed: multiprocessing

by jesse


Originally, this post was going to be very different from what it has become - the original title was "Failing in Public" - but I don't think "failing" is fair to me personally, or to anyone who has ever helped me, or contributed a patch or a fix to the multiprocessing module.

Yesterday, I made a statement on Twitter:

I am officially looking for someone to take over multiprocessing maintenance from me. http://bugs.python.org/issue6721

Ignoring any comments in that bug, I maintain that a later tweet is still true:

Sometimes good points and poignant criticism can be buried in a pile of crap.

In hindsight, I could have worded the original message differently: "taking over maintenance" implies that I am, and always have been, the sole contributor to the multiprocessing code base, which is patently false. Antoine and many other Python core developers, along with people within the community, have submitted bug reports, patches, tests and documentation. My words were intentionally harsh - but that harshness was directed at me; I feel that as the "leader" (for some measurement of "lead") I have been remiss in my responsibilities and leadership.

Sure, I could be less harsh on myself - but the level of expectational debt that I've incurred against myself for the module and its maintenance has grown and grown. Even if I find myself leading PyCon, busy as a PSF Director, pushing the core-mentors program, the sprints program, and a lot of other community projects, I am still responsible for the care and feeding of the creature I helped create and birth. I've committed the sin of "going dark".

For some history, see:

Months ago, I spun up the multiprocessing-sig mailing list in hopes of engaging more people - highly active users, interested people, etc. - to help me pay down the debt. Of course, in retrospect, it's unfair for me to expect anyone but me to pay down the debt I've incurred. On the other hand, the responsible thing for me to do - the mature thing for me to do - is to ask for help - not to "wash my hands" of anything, but rather to take this as an opportunity to look at multiprocessing as something greater than what I originally envisioned and submitted to core.

I hear from people every day who are using the module - every day, something I helped birth helps people get things done. Multiprocessing has grown up by virtue of becoming part of Python core, and daily - despite the bugs, the debt and the quirks - it helps developers achieve something they might otherwise have been unable to do (or at least would have had a more difficult time doing).

The module is expansive - it has pools, tools for distributed programming via managers, and pipes for interprocess communication. Its feature set is both large and ultimately complex in its underpinnings. That complexity - that feature set - is the reason why the debt, the bugs and the quirks have grown over time. If it weren't being used, I wouldn't have so many emails about it - or bugs filed against it.
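For readers who haven't touched the module, here is a minimal sketch of two of those pieces: a pool and a pipe. The function names square and echo_upper are made up for illustration, managers are left out entirely, and this is only meant to give a feel for the API, not to show how anything works internally.

```python
import multiprocessing as mp


def square(n):
    # Hypothetical work function; defined at module level so that
    # worker processes can locate it by name.
    return n * n


def echo_upper(conn):
    # Hypothetical pipe handler: read one message from the connection,
    # send back an upper-cased reply, then close this end.
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


if __name__ == "__main__":
    # Pool: spread calls to square() across two worker processes.
    with mp.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3, 4]))

    # Pipe: two connected endpoints, one of them handed to a child process.
    parent_conn, child_conn = mp.Pipe()
    child = mp.Process(target=echo_upper, args=(child_conn,))
    child.start()
    parent_conn.send("hello")
    print(parent_conn.recv())
    child.join()
```

The main guard matters here: on platforms that start workers by re-importing the main module, unguarded process creation at import time fails.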

So where are we/it today?

Today, multiprocessing has widespread usage. In Python 3, there's actually a new module named concurrent.futures that builds on the building blocks of multiprocessing and threading. Packages like Celery use multiprocessing extensively (and work around internal quirks). For Python 3, the sky is the limit for what multiprocessing could be: additional functionality, moving parts of it (such as the pool abstractions) into the concurrent namespace, extending and improving the Manager classes, etc. For Python 2.7, it's bug fixes and doc fixes only.
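To make the concurrent.futures point concrete, here is a small sketch showing that the thread-backed and process-backed executors share one interface. The names cube and cube_all are hypothetical, chosen only for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cube(n):
    # Hypothetical work function; module-level so a process pool can pickle it.
    return n ** 3


def cube_all(executor_cls, numbers, workers=2):
    # The same code runs on threads or processes depending on which
    # executor class is passed in; map() yields results in input order.
    with executor_cls(max_workers=workers) as executor:
        return list(executor.map(cube, numbers))


if __name__ == "__main__":
    print(cube_all(ThreadPoolExecutor, [1, 2, 3]))
    print(cube_all(ProcessPoolExecutor, [1, 2, 3]))
```

Passing the executor class as a parameter is just a convenience for the demo; real code would usually pick one executor and stick with it.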

If you search the Python bug tracker for the word "multiprocessing" regardless of assignee, you'll get 119 hits. That's right, 119. Not all of them are multiprocessing bugs, and many of them are dupes or already fixed in recent versions. What that query gets you is an idea of the debt that has to be paid down and resolved. Each one of those bugs needs to be looked at, reproduced, de-duped and patched. Some of them may be documentation issues; some are pretty hairy (like the aforementioned http://bugs.python.org/issue6721 as well as http://bugs.python.org/issue4106 and http://bugs.python.org/issue8713).

What I am asking for - rather than washing my hands of anything, or any attempt to absolve myself of responsibility - is help. I am stretched thin - too thin to do this myself, or to be the only person who can maintain, understand or work on this module. It's too big for that, and it's too important for me to be the arbiter of it any longer. It's bigger than me.

Aside from the bug queue, there's a short list of things that need to be done. The docs need a fresh, hard set of eyes on them; there are things (behaviors, features) that are undocumented. The test suite needs a complete overhaul - when I inherited the code, this is the first thing I should have done - but I didn't. The problem is that the test suite is mired in magic and complexity, and without an expansive, maintainable test suite, I'm not confident that the bug list can be addressed safely.

So, I come to you with my hat in my hands, a humbled man. It's unreasonable for me to ask others to "pay down" the debt I've incurred; but it's irresponsible, immature and misguided of me to think that I, or any single person, can go at this alone. So I need your help - and if, in time, someone chooses to be the "leader" for the module, then I will gladly step back. Until then, I will try to continue to be a guiding hand and at least point people in the proper direction, commit patches, etc.

If you are interested, please speak up - or just wade into the bug queue. You can sign up to the multiprocessing-sig list and ask questions there, or if you're new to core Python and want some additional mentorship, check out the Python Core Mentorship program - that list serves as a gentle, polite and welcoming introduction to core development. No question - no matter how green - is off limits, and it's already got an excellent track record of helping people get up to speed.

In closing, I'm going to apologize - we all know lives change, careers change, and interests change. All of these things have happened to me, but in changing so quickly and taking on different roles, I left something important behind. In doing so, I have done a great disservice to you, the community and users.

I will also thank you. Without you - the users, the current and future helpers - multiprocessing wouldn't exist or be relevant in any context. Without you, I wouldn't have the drive to even write this post, to fight to get multiprocessing into the std lib to begin with, or to perform any of the other roles I do.

So, thank you.