Stirred up dem bees: Should BSDDB be removed from Python?

by jesse


This week we've seen a development push to get RC1 completed and ready to go. I've spent some time giving multiprocessing some love (still not done), and a lot of other people have been working around the clock to close out the large number of release blockers. As of last night, though, the trigger was pulled on removing bsddb (the Berkeley DB Python module) from the standard library in the 3.0 timeline (2.6 adds deprecation warnings).

Now, before anyone thinks this is an arbitrary decision, here's the argument (in a nutshell):

  • bsddb has always been painful to maintain
  • Jesus Cea is the only person who has stepped up to maintain it
  • bsddb is "heavyweight": out of everything in the standard library, it has the most dependencies and the most nuances in cross-platform maintenance.
  • Until Jesus Cea stepped up later in the 2.6/3.0 process, it was "one of those packages" that no one wanted to maintain.
  • For most of 2.6 and 3.0 it's been a buildbot fail train.

See PEP 3108:

Maintenance Burden

Over the years, certain modules have become a heavy burden upon python-dev to maintain. In situations like this, it is better for the module to be given to the community to maintain to free python-dev to focus more on language support and other modules in the standard library that do not take up a undue amount of time and effort.

bsddb3

  • Externally maintained at http://www.jcea.es/programacion/pybsddb.htm .
  • Consistent testing instability.
  • Berkeley DB follows a different release schedule than Python, leading to the bindings not necessarily being in sync with what is available.

This thread is where the hammer fell.

Now, note that Jesus Cea has done an amazing amount of work updating/upgrading the bsddb support for 2.6 and 3.0 (see his recent announcement here). I feel for him in a lot of respects: he busted his butt to fix, maintain, and resolve all open issues with bsddb and the buildbots for the release, but the decision to remove/deprecate the bsddb package had already been made back in July (see above).

Now, there is a lot more discussion occurring around the removal:

Edit: I finally got a free moment to do an update. In an email this afternoon on Python 3000, the BDFL (GvR) made the final decision on bsddb. It's out as of py3k:

I am still in favor of removing bsddb from Python 3.0. It depends on a 3rd party library of enormous complexity whose stability cannot always be taken for granted. Arguments about code ownership, release cycles, bugbot stability and more all point towards keeping it separate. I consider it no different in nature than 3rd party UI packages (e.g. wxPython or PyQt) or relational database bindings (e.g. the MySQL or PostgreSQL bindings): very useful to a certain class of users, but outside the scope of the core distribution.

Python 3.0 is a perfect opportunity to say goodbye to bsddb as a standard library component. For apps that depend on it, it is just a download away -- deprecating in 3.0 and removal in 3.1 would actually send the *wrong* message, since it is very much alive! I am grateful for Jesus to have taken over maintenance, and hope that the package blossoms in its newfound freedom.
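
For apps that want to cope with the "just a download away" situation, here is a minimal sketch of an import fallback. It assumes the externally maintained pybsddb package installs under the module name bsddb3, and that the old standard library name bsddb may still exist on 2.x. The helper name load_bsddb and its candidates parameter are made up for illustration; they are not part of either package.

```python
import importlib


def load_bsddb(candidates=("bsddb3", "bsddb")):
    """Return the first importable module from candidates, or None.

    Assumption: "bsddb3" is the externally maintained pybsddb package and
    "bsddb" is the old standard library module (gone in Python 3.0).
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or its C library is missing); try the next name.
            continue
    return None


db_module = load_bsddb()
if db_module is None:
    print("No Berkeley DB bindings found; install pybsddb to get bsddb3.")
else:
    print("Using Berkeley DB bindings from module: " + db_module.__name__)
```

Trying the external package first means the same code keeps working once the standard library copy disappears. What you do when neither import succeeds (fail loudly, or fall back to another storage module) is up to the app.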