Chewing on Import

October 2nd, 2006

Brett Cannon’s recent posts on rethinking import (here and here) got me thinking about a package I saw a few weeks ago called URLImport.

Looking at the about page, the author mentions PEP 302, which covers (his words):

Basically, python supports what is called a path hook, which enables you to hook a specific path item to an import handler of your choice (an url importer in this case). The PEP mentioned also gives details on the importer protocol, a protocol which all importers must comply to (by defining find_module() and load_module(), among other details).
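To make the path hook idea concrete, here is a minimal sketch of an importer that claims a made-up scm:// path entry. Everything in it is an assumption for illustration: FAKE_REPO, ScmLoader, ScmFinder and scm_path_hook are hypothetical names, and the "repository" is just a dictionary rather than a real SCM. Note also that it uses find_spec() and exec_module(), which in current Python 3 replace the find_module() and load_module() methods described in PEP 302.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for a repository: maps "scm://..." path entries to
# {module name: source text}. A real handler would query the SCM instead.
FAKE_REPO = {
    "scm://head/branch/dir": {
        "baz": "ANSWER = 42\n",
    },
}


class ScmLoader(importlib.abc.Loader):
    """Executes module source that was fetched from the (fake) repository."""

    def __init__(self, source, origin):
        self.source = source
        self.origin = origin

    def create_module(self, spec):
        return None  # let the import system create a default module object

    def exec_module(self, module):
        code = compile(self.source, self.origin, "exec")
        exec(code, module.__dict__)


class ScmFinder(importlib.abc.PathEntryFinder):
    """Finds modules for a single scm:// entry on sys.path."""

    def __init__(self, path_entry):
        self.path_entry = path_entry

    def find_spec(self, fullname, target=None):
        source = FAKE_REPO[self.path_entry].get(fullname)
        if source is None:
            return None  # not ours; the import system tries the next entry
        origin = f"{self.path_entry}/{fullname}.py"
        return importlib.util.spec_from_loader(
            fullname, ScmLoader(source, origin), origin=origin
        )


def scm_path_hook(path_entry):
    """Path hook: claim only the sys.path entries this importer understands."""
    if path_entry not in FAKE_REPO:
        raise ImportError(f"not a known scm:// path entry: {path_entry}")
    return ScmFinder(path_entry)


# Register the hook, then add a path entry that only this hook will accept.
sys.path_hooks.append(scm_path_hook)
sys.path.append("scm://head/branch/dir")
sys.path_importer_cache.pop("scm://head/branch/dir", None)

import baz  # resolved through ScmFinder and ScmLoader, not the filesystem
```

The hook raises ImportError for any path entry it does not recognise, which is how the import system knows to keep using its normal finders for ordinary directories.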

Why is this interesting? Well, the two are really unrelated, but I’ve been pondering the work I have been doing recently at Archivas: basically, building out a Python library that tests and frameworks can hook into. It’s nothing special in the grand scheme of things, just a bunch of code that has to be shared.

Internally, like all companies, we have an SCM (Source Code Management) application, and we produce product builds rapidly and frequently as the code changes (again, nothing new). Part of the build system is running distutils to build a tarball of the shared package for consumption.

The problem is a chicken and egg problem: while our builds and code are distributed to large clusters as part of a normal product deployment, the shared library is meant for and targeted at the clients, not the actual nodes. This means that we have to go “out of band” to grab the distutils package and install it on the test clients.

I plugged that into the framework, and it works well: when you have many branches, with disparate code on each branch for the shared framework, you should always be installing the library for that particular product build branch.

This works “good enough”: test clients are always guaranteed to have the latest drop of the shared library, which is directly tied to the build of the product. But Brett’s posts and the URLImport package got me pondering: Is there a Better Way?

Well, there could be. If you can design a custom import handler, there’s no reason why you cannot build one that ties directly to your SCM (please don’t hit me). This way tests and applications can integrate tightly with your environment, and instead of having a pull-to-client system:

# … download latest build, verify, install
from foo.bar import baz

You could have:

import sys

# Assumes a path hook that understands scm:// entries is already registered.
sys.path.append('scm://head/branch/dir')
try:
    from foo.bar import baz
except ImportError:
    # … download latest build, verify, install
    from foo.bar import baz

This way, the first attempt to import everything comes from the SCM, and the second would only occur if that import failed (which allows the program to be portable to an extent, since you can always fall back to a local import). The serious drawback? Any time your code branches (assume the test code branches with it), your code has to change, or it has to be coded to take this into account from the get-go (i.e., assume the caller is passing in the branch).
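One way to code for branching from the get-go is to have the caller pass the branch in. The sketch below is only that, a sketch: import_from_branch and install_fallback are hypothetical names, the scm://head/<branch>/dir layout is an assumed reading of the path used above, and it presumes a path hook for scm:// entries is registered elsewhere (without one, the first import simply fails and the fallback runs).

```python
import importlib
import sys


def import_from_branch(module_name, branch, install_fallback):
    """Try an SCM-backed import for `branch`, then fall back to a local install.

    `branch` is supplied by the caller, so this code does not change when the
    product branches. `install_fallback` is a hypothetical callable standing in
    for the "download latest build, verify, install" step.
    """
    # Assumed path layout; adjust to however your SCM importer names branches.
    entry = f"scm://head/{branch}/dir"
    if entry not in sys.path:
        sys.path.append(entry)
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # The SCM import failed: install locally, then retry the same import.
        install_fallback()
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
```

A test would then call something like import_from_branch("foo.bar", branch, install_latest_build), where branch comes from the command line or the build environment and install_latest_build is whatever the framework already uses to pull the tarball.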

I guess the benefit to this is that programs will always have the latest version of the dependencies/libraries they need, with little overhead inside the calling application. Changes to shared libraries can go live the second they are checked into the SCM, rather than waiting for a full build-test-publish loop.

It’s not groundbreaking, just a thought. Of course, the QA guy inside of me cringes at this, given you’d be taking potentially untested code. Then again, we’re talking about libraries and modules that are the actual test code.

SCM Python modules:
http://trentm.com/projects/px/
http://subversion.tigris.org/ (Subversion comes with Python bindings)

It’s all navel-gazing. For now I’ll stick to wrapping my head around the various vagaries of import.

You are currently reading Chewing on Import at jessenoller.com.
