A (brief) introduction to Python-Core development | Completely Different

February 4th, 2009

This is a reprint of an arti­cle I wrote for Python Mag­a­zine as a Com­pletely Dif­fer­ent col­umn that was pub­lished in the August 2008 issue.

In the early summer of this year I had the chance to really get started working on and with the core Python source. I had spent some time putting together a Python Enhancement Proposal (PEP), which was accepted. Now I just needed to learn the code base, learn the practices, and buy a helmet. Shortly after getting the initial patch accepted, I ended up breaking the build, breaking the tests, and causing the beta to slip. This article is an introduction to Core development, in which we’ll cover what you need to get started, and where I personally screwed up.

Intro­duc­tion

Core Python devel­op­ment (or, “hack­ing on python-core” as it may be called) is, like all great open-source projects, a highly dis­trib­uted, highly active, and high par­tic­i­pa­tion project. There are devel­op­ers all over the world fil­ing bugs, sub­mit­ting patches for code and doc­u­men­ta­tion, as well as par­tic­i­pat­ing on the python-dev mail­ing list and IRC channel.

Like all other good open source communities, it’s a meritocracy of the technical persuasion. A good idea is simply that: a good idea. If a good idea is the best of breed, it will be adopted or adapted to the language and project. If an idea or a patch is clear, concise, and solves a problem, there is generally no difficulty in getting traction or getting a patch put into the core code base.

Let’s start from the beginning

While Python is a mer­i­toc­racy where any per­son can sub­mit a patch, file a bug, or send emails to python-dev (some­times, that last is more of a curse than a bless­ing), there is a par­tic­u­lar group of peo­ple that has com­mit priv­i­leges. This group is respon­si­ble for judg­ing all patches, pro­posed bugs and asso­ci­ated fixes, and ulti­mately com­mit­ting the actual code to the tree.

Python’s code, doc­u­men­ta­tion, PEPs, and other arti­facts are all hosted within a Sub­ver­sion (svn) repos­i­tory. While the core is in svn, you can also access it via other pop­u­lar ver­sion con­trol tools. There are Bazaar, Git, and Mer­cu­r­ial mir­rors of the svn repos­i­tory. All of the exam­ples in this arti­cle will revolve around sub­ver­sion, though, because the other trees are still experimental.

In order to view the repos­i­tory, you need to check out a read-only ver­sion of the source tree. Write access is only avail­able via svn+ssh authen­ti­cated access, but you can use HTTP for a read-only copy. So, to check it out:

mkdir -p python/trunk
svn co http://svn.python.org/projects/python/trunk python/trunk

This is your own, pris­tine copy: any edits you make in this tree will come up on a ”svn diff” (which you’ll use to make patches). Avoid edit­ing files you don’t need to so you don’t acci­den­tally taint a diff or checkin.

The basic lay­out of the tree is unsur­pris­ingly sim­ple, so I’ll only really cover the impor­tant files/directories:

”Doc/” con­tains all of the doc­u­men­ta­tion for the lan­guage, which will be dis­cussed in more detail later. If you want to see the stan­dard library doc­u­men­ta­tion, look in Doc/library.

You will find the brain-melting gram­mar def­i­n­i­tion for the Python lan­guage in ”Grammar/”.

Header files for C code go in ”Include/”.

Libraries writ­ten in Python are in ”Lib/”. You’ll note a dis­tinct lack of C code in this direc­tory. That’s because C mod­ules go in the ”Mod­ules” direc­tory. Also found in ”Lib/” is the ”test/” direc­tory, which we’ll be focus­ing on later. If you want to see some pretty Python code, read the files in this direc­tory. Except any­thing I’ve done.

C extensions, such as multiprocessing, ctypes, cStringIO, et cetera can be found in ”Modules/”. Generally speaking, these are optimized modules for the standard library. Some of them are in subdirectories for cleanliness, but most of them are in the top level Modules/ directory. (Note that there is a style guide for C code for the standard library, outlined in PEP 7.)

The ”Misc/” direc­tory con­tains things that don’t belong else­where within the tree. This includes the NEWS file, build notes, con­fig­u­ra­tion for val­grind (a code profiling/debugging util­ity), a cheat sheet (some­what dated, but still use­ful), and some edi­tor plu­g­ins. A really good file here is SpecialBuilds.txt, which goes over all the magic flags for Python builds you should know about.

Python objects are defined in ”Objects/”. It con­tains all C code, and is pretty well doc­u­mented. If you sud­denly get the urge to make a new type, start here.

Miscellaneous tools go in ”Tools/”. I haven’t had to use much of anything down here except for the scripts in the ”scripts/” subdirectory. The ”scripts/” directory is just filled with cool things like untabify.py, crlf.py, and google.py.

There are two build files. The main build file, sort of, is ”setup.py”. I list it here because you need to look at this file to understand how things are built. The make steps we cover later are, for the most part, wrappers around this script. The “other” build file is ”Makefile.pre.in”. It works with ”setup.py” to control the entire compilation process and has some nifty targets, like “make tags”. Who knew the build process could spit out a tags file for ”vi”?

It is impor­tant that you pay atten­tion to both ”setup.py” and ”Makefile.pre.in”. When I for­got one line in the Make­file, my exten­sion mod­ule seemed to work, but didn’t really. I could “import mul­ti­pro­cess­ing” from within the svn tree using the local python inter­preter. How­ever, after run­ning “make install” the exten­sion mod­ule was not installed, so it did not work with the installed inter­preter. I finally dis­cov­ered this was due to a sin­gle miss­ing entry in LIBSUBDIRS.
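A slip like this can be checked for mechanically. What follows is a minimal sketch, not something that ships in the tree: it assumes LIBSUBDIRS is assigned in ”Makefile.pre.in” as a make variable that starts at the beginning of a line and continues across lines with trailing backslashes. The function name and the sample text are made up for illustration.

```python
def parse_libsubdirs(makefile_text):
    """Return the directory names assigned to LIBSUBDIRS.

    Assumes the assignment starts a line with 'LIBSUBDIRS=' and is
    continued onto following lines with trailing backslashes.
    """
    names = []
    collecting = False
    for line in makefile_text.splitlines():
        if not collecting:
            if not line.startswith("LIBSUBDIRS="):
                continue
            # Drop the variable name so only the values remain.
            line = line[len("LIBSUBDIRS="):]
            collecting = True
        stripped = line.rstrip()
        continued = stripped.endswith("\\")
        names.extend(stripped.rstrip("\\").split())
        if not continued:
            break
    return names


# Hypothetical excerpt, shaped like a make variable assignment.
SAMPLE = (
    "LIBSUBDIRS=\tlib-tk test \\\n"
    "\t\tjson \\\n"
    "\t\tmultiprocessing\n"
    "OTHER=\tunrelated\n"
)

print("multiprocessing" in parse_libsubdirs(SAMPLE))
```

Pointing the same function at the real file's contents would show whether a newly added package directory made it into the list before you run “make install”.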

Whew. That’s a lot of direc­to­ries. I skipped over the Win­dows build stuff, and I am going to con­tinue to do so, not­ing that I am not a Win­dows expert. I do know that if you are on Win­dows you will need to look in the ”PCBuild/” direc­tory for build infor­ma­tion, Visual Stu­dio projects, etc.

Build­ing

Before we go any fur­ther, let’s walk through the basic build process. Remem­ber, I’m a Linux and OS X guy, so I will be walk­ing you through the steps you would take on a Unix machine. Win­dows users will need to either use Visual Stu­dio, or install Cyg­win (a Unix tool chain for Win­dows). Installing the Cyg­win tool chain means you should be able to com­pile just fine fol­low­ing these directions.

First off, the ./configure step. If you’re familiar with autoconf, automake, and the like, you’re more than familiar with this. For those that aren’t, the configure, make, etc. steps are common to configuring and compiling/installing a given application. See the link to Autoconf in the requirements section for more details. There are some custom options for configure (of course), which you can see with ”./configure --help”. The main one you want to know about and use is ”--with-pydebug”, which enables a special debug build of Python. You are going to want to have the debug build if you start heavily working on the core of the interpreter. The ”--with-pydebug” flag enables, in no particular order, LLTRACE, Py_REF_DEBUG, Py_TRACE_REFS, PYMALLOC_DEBUG, C code assertions, and all code that has ”#ifdef Py_DEBUG” blocks. In other words, it turns on just about every debugging feature you could possibly need or want, short of something that fixes your code for you automatically.

For the exact details on all of the con­fig­ure flags, includ­ing plat­form spe­cific options, see Misc/SpecialBuilds.txt.
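If you are ever unsure which kind of interpreter you are running, you can ask it. This small sketch relies on ”sys.gettotalrefcount”, which only exists when the interpreter was compiled with Py_REF_DEBUG (one of the things a pydebug build turns on). The helper name is made up for illustration.

```python
import sys


def is_debug_build():
    """Return True when running under an interpreter built --with-pydebug.

    sys.gettotalrefcount() is only present when the interpreter was
    compiled with Py_REF_DEBUG, so its presence is used as the signal.
    """
    return hasattr(sys, "gettotalrefcount")


print("debug build" if is_debug_build() else "regular build")
```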

To start a build, just fire off a

$ ./configure --with-pydebug

in ”python/trunk”. Once this is done, unless you really want to twid­dle the options, you shouldn’t need to do this again for a while. Brett Can­non once told me, when talk­ing about some devel­op­ment Text­Mate macros, “I left out con­fig­ure stuff because that becomes rather personal”.

Next up, exe­cute ”make” in the python/trunk direc­tory. You’ll see your nor­mal make out­put, but there are a few caveats to keep in mind.

Here is some exam­ple out­put from the ./configure and make steps:

$ ./configure
checking for --with-universal-archs... 32-bit
checking MACHDEP... darwin
checking EXTRAPLATDIR... $(PLATMACDIRS)
...snip...
creating Modules/Setup
creating Modules/Setup.local
creating Makefile
$ make
... gcc output snipped ...
Failed to find the necessary bits to build
these modules:
_bsddb             gdbm               linuxaudiodev
ossaudiodev        readline           spwd
sunaudiodev
To find the necessary bits, look in setup.py in
detect_modules() for the module's name.

running build_scripts
$

Pay attention to the build output. If you’re working on a module with C extensions or the interpreter itself, what can go wrong here will go wrong. For example, while working on integrating the _multiprocessing library into ”Modules/”, the initial issues around simple compilation were exposed here.

As you can see, there is an impor­tant report at the end of the make step (the log line looks like: “Failed to find the nec­es­sary bits to build these mod­ules:”). The infor­ma­tion given in that report is espe­cially impor­tant if you need access to the skipped mod­ules. For exam­ple, on OS X the ”read­line” mod­ule doesn’t com­pile out of the box. You will need to resolve the depen­den­cies listed in ”trunk/setup.py” in order to get it up and running.
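To double check which of the reported modules your freshly built interpreter can actually import, you can simply try them. This is a rough sketch, assuming a Python new enough to ship ”importlib” (on an older tree, the built-in ”__import__” does the same job). The function name is made up; the module names come from the build report shown earlier.

```python
import importlib


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Module names taken from the build report shown earlier.
reported = ["_bsddb", "gdbm", "readline", "spwd"]
print(missing_modules(reported))
```

The result depends entirely on your platform and on which libraries were found at build time, so treat the printed list as a to-do list rather than an error.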

If you want to “quiet down” the make step, adding the “-s” flag will make it less verbose. Also, if you want to speed it up, consider using the “-j NUM” flag to increase the number of concurrent commands being performed.

Once the build com­pletes suc­cess­fully, you should have a work­ing Python binary in your local direc­tory. On OS X and Win­dows it’s named ”python.exe” and on Unixes it’s named sim­ply ”python”. If you wanted, you could fire this ver­sion up and poke around, but for devel­op­ment your next step should be to run the tests.

Run­ning Tests

The tests in Python’s source tree are primarily executed with the ”Lib/test/regrtest.py” utility (this may change in the future) and ”make test”. If you were to run ”make test” in the ”trunk/” directory right after building, you would run a subset of all of the tests located in ”Lib/test”. Certain tests, such as large file tests and others that take a lot of time or resources, are excluded in favor of brevity.

For details on what a ”make test” step does, open Makefile.pre.in and search for “# Test the inter­preter” (it should be around line 660). You will find the def­i­n­i­tions for what hap­pens dur­ing the ”test*” steps as well as the options that invoke ”regrtest.py”. You can change the test options via the ”TESTOPTS=” flag to ”make test”. For exam­ple, to run a sin­gle test:

$ make test TESTOPTS=test_multiprocessing

The real magic happens in regrtest.py (the Python regression test execution script). You need to run this for any change made to the code, period. A basic run is the same as the basic ”make test” execution. This means that certain tests are excluded, but you can enable those tests (and a lot more) via additional arguments to regrtest.py. There is even an option to enable coverage analysis.

A basic invo­ca­tion of regrtest.py looks like this:

$ ./python.exe Lib/test/regrtest.py
test_grammar
test_opcodes
test_dict
...snip...
test_zlib
327 tests OK.
32 tests skipped:
    test_al test_bsddb test_bsddb3 test_cd test_cl
    ...
    test_winsound test_zipfile64
Those skips are all expected on darwin.

Pretty pain­less, but if some­thing goes wrong, there’s not a lot of infor­ma­tion to go on. A bet­ter way to run it is with the ”-w” option, which will re-run any failed test with addi­tional ver­bosity. For exam­ple, I added a line that would cause one of the tests to crash in List­ing 1.

List­ing 1:

$ ./python.exe Lib/test/regrtest.py test_multiprocessing
test_multiprocessing
test test_multiprocessing crashed -- <type 'exceptions.NameError'>: name 'mportasdl' is not defined
1 test failed:
    test_multiprocessing
$ ./python.exe Lib/test/regrtest.py -w test_multiprocessing
test_multiprocessing
test test_multiprocessing crashed -- <type 'exceptions.NameError'>: name 'mportasdl' is not defined
1 test failed:
    test_multiprocessing
Re-running failed tests in verbose mode
Re-running test 'test_multiprocessing' in verbose mode
test test_multiprocessing crashed -- <type 'exceptions.NameError'>: name 'mportasdl' is not defined
Traceback (most recent call last):
  File "Lib/test/regrtest.py", line 549, in runtest_inner
    the_package = __import__(abstest, globals(), locals(), [])
  File "/Users/jesse/open_source/subversion/python-trunk/Lib/test/test_multiprocessing.py", line 6, in <module>
    mportasdl;fj
NameError: name 'mportasdl' is not defined
$

There’s one more impor­tant flag to regrtest.py you need to know about, and that’s ”-uall”. This option will run all of the tests, and obvi­ously, when you’re chang­ing some­thing really low level, you need to run these tests. They take a long time, so I rec­om­mend run­ning them before going to bed.
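If you find yourself retyping these invocations, a tiny helper can assemble them for you. The sketch below is purely illustrative: the function and its parameter names are made up, and it only knows about the two options covered here (”-w” and ”-uall”).

```python
def regrtest_command(python="./python", tests=(), rerun_verbose=False,
                     all_resources=False):
    """Build an argument list for Lib/test/regrtest.py.

    Only the -w and -uall options described in this article are handled.
    """
    command = [python, "Lib/test/regrtest.py"]
    if rerun_verbose:
        command.append("-w")       # re-run failures with extra verbosity
    if all_resources:
        command.append("-uall")    # enable every optional test resource
    command.extend(tests)
    return command


print(" ".join(regrtest_command("./python.exe", ["test_multiprocessing"],
                                rerun_verbose=True)))
```

The returned list can be handed to ”subprocess.call” from the top of your checkout if you would rather launch the run from a script.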

Doc­u­men­ta­tion

Yes, even doc­u­men­ta­tion has bugs. All of Python’s doc­u­men­ta­tion resides in the ”Doc/” direc­tory, and it has its own build scripts and sys­tem, called Sphinx. The stan­dard library doc­u­men­ta­tion mod­ule overviews we all know and love are located in ”Doc/library/”. When you are mak­ing a change that will be pub­lic in nature (say, adding a method) you need to find and update the asso­ci­ated documentation.

Also, when adding new pack­ages, mod­ules or meth­ods, you should really con­sider adding an exam­ple in the appro­pri­ate sec­tion of the module’s .rst file (not the ”Doc/examples” direc­tory). It is com­mon for new Python users to have dif­fi­culty find­ing clear exam­ples on stan­dard library mod­ule usage, so the more exam­ples the merrier.

If you’re stuck with the doc­u­men­ta­tion, feel free to send an email to docs@python.org and ask for help. There are a lot of good peo­ple signed up for that list and they’re will­ing to help you if you’re stuck.

The documentation is all in ReST (reStructuredText) format and there is some Python-specific syntax that can be of use to you. See the “Documenting Python” page for more information. A nice nugget I found was breaking the bigger examples out of the main ”module.rst” file (the documentation file for a given module, in reStructuredText format) and including them separately with:

.. literalinclude:: ../includes/mp_webserver.py

This means you can drop the python code into the ”Doc/includes” direc­tory and it will be popped in place when the doc­u­men­ta­tion is built.

When you want to try building the docs, simply go into ”trunk/Doc” and type ”make html” to convert all of the documentation into the HTML files you know so well from the Python doc site. Don’t worry about installing Sphinx in advance; the build rules do that for you. Once built, the HTML documents live in ”Doc/build/html”.

At the very least, whenever you make a change to core, you should update the ”Misc/NEWS” file to add a brief description of your change, and also add your name to ”Misc/ACKS”.

Mak­ing a change

Let’s assume for the moment you’re about to pro­vide a patch to fix a bug from the python bug tracker. Most fixes will require the fol­low­ing min­i­mal changes:

  • Updated Python module
  • Updated doc­u­men­ta­tion (At least an entry in the NEWS file)
  • Updated Tests (you will update the tests)

In a few cases you also will need to update the C code. After you’ve done the ini­tial check out of the branch you’ll be work­ing on, and you’ve con­firmed the build and tests pass on your machine, you should be set to make your changes locally, apply any patches you are test­ing, etc.

When you’re updat­ing or adding new tests you need to drop into the ”Lib/test” direc­tory and find the “best place” for the test. Typ­i­cally, if you’re mak­ing a bug fix, you’re sim­ply going to append the test onto the suite for the mod­ule. Larger scale changes, includ­ing cre­at­ing new pack­ages or mod­ules, will need their own ”test_*.py” file in ”Lib/test”.
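For a bug fix, “appending the test onto the suite” usually amounts to one more method on an existing test class. Here is a minimal sketch using plain ”unittest” on a current Python. The class name and the bug scenario are hypothetical, though the ”posixpath.join” behavior being asserted is real. It leaves out whatever hooks the tree's own test files use to plug into regrtest.

```python
import posixpath
import unittest


class JoinTests(unittest.TestCase):
    """Stand-in for an existing test class in a test_*.py file."""

    def test_join_relative_parts(self):
        # A test that was already part of the suite.
        self.assertEqual(posixpath.join("Lib", "test"), "Lib/test")

    # A bug fix appends a focused test next to its siblings. The bug
    # report behind this one is hypothetical.
    def test_absolute_part_discards_earlier_parts(self):
        self.assertEqual(posixpath.join("Lib", "/test"), "/test")


# Run the class directly so the sketch can be tried on its own.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(JoinTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```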

It’s impor­tant when you’re adding tests that your tests are clear, well doc­u­mented, and most of all smart. They will need to know when not to run (say, a net­work test should not run when no net­work is present) and they need to be reli­able (i.e.: they should never just hang). The tests and code you sub­mit will be viewed by many peo­ple, and com­piled and tested on more plat­forms than most of us have ever used. The smarter you make the test, the bet­ter off every­one will be.
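To make “smart” a little more concrete, here is a sketch of a test that skips itself when its network peer is not reachable and uses explicit timeouts so it cannot hang. It is written against a current Python 3 ”unittest” (the ”skipTest” method is newer than the tree described in this article), and the host, port, and names are all made up for illustration.

```python
import socket
import unittest


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, timed out, or not permitted.
        return False
    conn.close()
    return True


class NetworkDependentTest(unittest.TestCase):
    # Hypothetical host and port, chosen only for illustration.
    HOST = "127.0.0.1"
    PORT = 9

    def setUp(self):
        # Know when not to run: skip, rather than fail, without a peer.
        if not can_connect(self.HOST, self.PORT, timeout=0.5):
            self.skipTest("no service reachable at %s:%d"
                          % (self.HOST, self.PORT))

    def test_peer_port(self):
        # The explicit timeout keeps this test from hanging forever.
        conn = socket.create_connection((self.HOST, self.PORT), timeout=0.5)
        self.addCleanup(conn.close)
        self.assertEqual(conn.getpeername()[1], self.PORT)
```

On a machine with nothing listening on that port the test reports a skip rather than a failure, which is exactly what you want on a build machine without network access.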

An impor­tant tool in the test developer’s arse­nal is the ”test_support” library included in ”Lib/test/test_support.py”. In it you will find a vari­ety of func­tions, excep­tions, and tools to help you to write core tests. Most of all, look at the other tests!

Once your changes work, you should run a ”make check” to per­form some house­keep­ing oper­a­tions you want to do prior to gen­er­at­ing the diff. These include fix­ing white­space, check­ing the NEWS/ACKS file for updates, and remind­ing you to run the test suite! See ”Tools/scripts/patchcheck.py” for every­thing ”make check” does.
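To give a feel for the whitespace part of that housekeeping, here is a stand-in written for this article. It is not what ”patchcheck.py” actually runs, just a sketch of the kind of cleanup involved: expanding tabs and trimming trailing whitespace. The function name is hypothetical.

```python
def tidy_whitespace(source, tabsize=8):
    """Expand tabs and strip trailing whitespace from every line."""
    lines = source.splitlines()
    cleaned = [line.expandtabs(tabsize).rstrip() for line in lines]
    # Always end the file with exactly one newline.
    return "\n".join(cleaned) + "\n"


messy = "def f():\n\treturn 1   \n"
print(repr(tidy_whitespace(messy)))
```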

On Code Bombs

It’s impor­tant to avoid mak­ing wide­spread changes in a vac­uum. Large scale refac­tor­ing or changes to an API used by a lot of the stan­dard library should be reviewed care­fully and often. Typ­i­cally, it’s bet­ter to post an ini­tial patch up on the bug tracker and then revise it as other people/contributors make com­ments than to drop a huge patch on every­one and say “it’s done”.

A recent python-dev post from Guido highlighted this issue, the take-away quote (from both his email, and the blog post he linked to) being: “The story’s main moral: submit your code for review early and often; work in a branch if you need to, but don’t hide your code from review in a local repository until it’s ‘perfect’.” For more details, see the “Code Bombs” thread listed in Related Links.

One of the tools at your disposal for publishing patches for review is Rietveld, the review application created by Guido van Rossum. Typically, if you have a small enough change, putting a patch in the bug tracker is sufficient.

How do you gen­er­ate a patch, big or small? It’s easy: cd into your ”trunk/” direc­tory and run ”svn diff >mychange.patch”. This will cre­ate a patch con­tain­ing only your changes which can then be uploaded to the bug tracker, emailed to the com­mu­nity, etc.
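If you have never looked inside a patch file, the body is a unified diff. The sketch below uses the standard library's ”difflib” to produce one from two made-up versions of a file, so you can see its shape without touching your checkout. The file name and contents are hypothetical, and ”svn diff” adds its own header lines around the same format.

```python
import difflib

original = ["first line\n", "secnd line\n", "third line\n"]
modified = ["first line\n", "second line\n", "third line\n"]

# Lines starting with '-' are removed, lines starting with '+' are added,
# and unmarked lines are context.
patch = "".join(
    difflib.unified_diff(
        original,
        modified,
        fromfile="Lib/example.py (revision 1)",
        tofile="Lib/example.py (working copy)",
    )
)
print(patch)
```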

Applying the patch is also easy. Just hop into the ”trunk/” directory and run ”patch -p0” with the patch file supplied on standard input.

Con­clu­sion

A good first step to con­tribut­ing to core is to con­sult the bug tracker. There you can find every­thing from mind-melting inter­preter issues to sim­ple one-line fixes (famous last words). There’s even a query to find “Easy” issues (see the side­bar on bugs.python.org).

One great thing about Python development is that anyone can propose an idea. Should it stand on its own merit, it will probably be accepted. So even if you don’t find a bug in an area you’re passionate about, why not find something you are interested in and make a Python Enhancement Proposal for the change? Publish it to python-dev and put together the patch for the code. You can do this for existing modules or even new ones.

Ulti­mately, Python is your lan­guage. With­out the peo­ple con­stantly con­tribut­ing to core in the form of bug fixes, doc­u­men­ta­tion and new pro­gram­ming con­cepts, Python would sim­ply die on the vine. The more help, the bet­ter the lan­guage becomes, and the wider the appeal and audience.

Related Links

  • http://www.drbrett.ca/ Brett C.

    Holy crap, out of those 13 related links I con­tribute some­how to nine of them! I think my phi­los­o­phy degree has led to a lot of writ­ing for me. =)
