Django, mod_wsgi, Apache and OS X — do it.

July 24th, 2009 § 25 comments § permalink

So, I’m one of those people who doesn’t like running things “too far” from what a production setup might look like (I code on OS X, deploy to Linux). This is why I jumped through various hoops on my OS X system to get Apache/Django/mod_wsgi/etc. all up and running and happy (not for serving the site; just developing).

Since I like simple/succinct guides, I thought I’d post what I did so oth­ers can fol­low in my stead.

Note: These instructions work with the Python 2.5 that ships with Leopard, or a self-compiled 2.6 (which is what I prefer); see the InstallationOnMacOSX page in the mod_wsgi documentation. Additionally, see the “Missing Code For Architecture” section there for possible workarounds if you find yourself needing 32-bit execution of Apache; I think the “Forcing 32 Bit Execution” instructions are preferable to “thinning” the Apache binary.

First, download and install mod_wsgi. On Leopard this is as easy as:

curl -o mod_wsgi.tgz http://modwsgi.googlecode.com/files/mod_wsgi-2.5.tar.gz
tar -xzf mod_wsgi.tgz
cd mod_wsgi-2.5
./configure
make
sudo make install

Now, edit (via sudo) /etc/apache2/httpd.conf and add the line:

LoadModule wsgi_module libexec/apache2/mod_wsgi.so

Add it after the rest of the LoadModule lines. Cool.

Invariably, all of my directions use virtualenv/virtualenvwrapper and pip:

mkvirtualenv django
cdvirtualenv
easy_install pip
pip install http://media.djangoproject.com/releases/1.1/Django-1.1-rc-1.tar.gz
django-admin.py startproject mysite
django-admin.py startapp myapp
cd mysite
mkdir apache
mkdir media

Now, that just sets up the skeleton. The meat of the wsgi configuration goes in the mysite/apache directory. The first file is named mysite.wsgi:

import os, sys
 
#Calculate the path based on the location of the WSGI script.
apache_configuration = os.path.dirname(__file__)
project = os.path.dirname(apache_configuration)
workspace = os.path.dirname(project)
sys.path.append(workspace)
 
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

This does the needed wsgi project magic for the Django appli­ca­tion — don’t worry about the inter­preter path; we’ll do that next.
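If the three dirname calls look like line noise, here is a minimal sketch of what they compute. The helper name workspace_for is mine (it is not part of Django or mod_wsgi), and the path is just the example location used elsewhere in this post:

```python
import os


def workspace_for(wsgi_script):
    """Return the directory that has to be on sys.path for `import mysite` to work."""
    apache_configuration = os.path.dirname(wsgi_script)  # .../mysite/apache
    project = os.path.dirname(apache_configuration)      # .../mysite
    workspace = os.path.dirname(project)                 # the parent of mysite
    return workspace


# Hypothetical location, matching the example path used in this post.
print(workspace_for("/Users/jesse/mysite/apache/mysite.wsgi"))
```

With that path it prints /Users/jesse, which is the directory containing the mysite package, and that is why 'mysite.settings' resolves.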

Next up is a file named apache_django_wsgi.conf, which looks like this:

# mod_wsgi configuration directives - I like having stdout access, the other two
# options run mod_wsgi in daemon mode - more on this in a minute.
WSGIPythonHome /<path to virtualenv>
WSGIRestrictStdout Off
WSGIDaemonProcess django
WSGIProcessGroup django

#
# This should be the path of the /mysite/media directory
# for example "/Users/jesse/mysite/media/"
#
Alias /site_media/ "<PATH TO>/mysite/media/"
<Directory "<PATH TO>/mysite/media">
Order allow,deny
Options Indexes
Allow from all
IndexOptions FancyIndexing
</Directory>

#
# Directory path to the admin media, for example:
#

Alias /media/ "<PATH TO>/virtualenv/site-packages/django/contrib/admin/media/"
<Directory "<PATH TO>/virtualenv/site-packages/django/contrib/admin/media">
Order allow,deny
Options Indexes
Allow from all
IndexOptions FancyIndexing
</Directory>

#
# Path to the mysite.wsgi file, for example:
# "/Users/jesse/mysite/apache/mysite.wsgi"
#

WSGIScriptAlias / "<PATH TO>/mysite/apache/mysite.wsgi"

<Directory "<PATH TO>/mysite/apache">
Allow from all
</Directory>

The apache_django_wsgi.conf file is the meat-and-potatoes here. This sets up all the paths/permissions, and is in Apache httpd.conf for­mat. You can pretty much log­jam any apache con­fig­u­ra­tion direc­tive here that you like.

Your final step is to once again edit (via sudo) /etc/apache2/httpd.conf and add a line like this at the ver­rrrry bottom:

Include "/path to/mysite/apache/apache_django_wsgi.conf"

And then run “sudo apachectl restart”

You should now be able to hit http://127.0.0.1/ and see the friendly and inviting Django welcome page. Note that if you are using SQLite as your database, you should chmod a+rw the database file (and make its directory writable too, since SQLite creates its journal file there), so that processes which are not you can mess with it.

There’s a final piece to this though. Nor­mally, if you run mod_wsgi in embed­ded mode, you’re going to need to restart apache every sin­gle time you make a change to your django app.

Ah! But we’re run­ning in dae­mon mode. This means all you need to do when you change a file is:

touch mysite/apache/mysite.wsgi

This will trigger a reload and magic happens. Me being as lazy as I am (ask my wife), I ended up snagging Bruno Bord’s tdaemon script and hacking it up a bit. The tdaemon script will watch a directory and run tests. Well, I wanted it to watch a directory (and let me filter subdirectories) and then run that touch command. So I reused my watcher.py, which I also use to monitor my Sphinx tree and run builds (and other stuff). Here’s how I’d use this:

workon django
cdvirtualenv
cd mysite
python ~/.slash/bin/watcher.py --command "touch apache/mysite.wsgi" -f media

This will auto-fire the touch com­mand when­ever it detects a file change (includ­ing svn updates).
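For the curious, the idea behind the watcher is small enough to sketch. This is not my actual watcher.py or tdaemon; it is a bare-bones polling loop under the assumption that you only care about .py files and want to skip a media directory. The function names are made up for illustration:

```python
import os
import time


def snapshot(root, skip=("media",)):
    """Map every .py file under root to its mtime, skipping the named subdirectories."""
    mtimes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.path.getmtime(path)
    return mtimes


def touch(path):
    """Bump the modification time of an existing file, like the touch command."""
    os.utime(path, None)


def watch(root, wsgi_script, interval=1.0):
    """Poll root forever and touch wsgi_script whenever a .py file changes."""
    before = snapshot(root)
    while True:
        time.sleep(interval)
        after = snapshot(root)
        if after != before:
            touch(wsgi_script)
            before = after
```

From inside mysite you would call something like watch(".", "apache/mysite.wsgi") and leave it running. Since the .wsgi file does not end in .py, touching it does not retrigger the loop.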

You can also do this another way
In my rush to reuse a tool I use a bit (watcher), I skipped past the section of the mod_wsgi documentation on code reloading, which shows how to set up a monitor that watches for .py file changes and kills the wsgi daemon. If you scroll down a bit there, you’ll see the “Monitoring For Code Changes” section. All you need to do is copy the code from the wiki into a module on your PYTHONPATH. In my case, I wrote it to mysite/apache/wsgi_monitor.py (just for this example! you should put it someplace else!) and then changed the mysite.wsgi file to import it and set it up:

import os, sys
 
#Calculate the path based on the location of the WSGI script.
apache_configuration = os.path.dirname(__file__)
project = os.path.dirname(apache_configuration)
workspace = os.path.dirname(project)
sys.path.append(workspace)
 
sys.path.append(apache_configuration) # you probably shouldn't do this.
import wsgi_monitor
wsgi_monitor.start(interval=1.0)
 
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

This method — once you reload apache — will watch the project for changes and then kill the wsgi dae­mon (forces a reload). So there you go — two ways of doing it.

The nice thing about this setup is that I can make production versions of the wsgi scripts and check them in, but keep local “my copies” (à la local_settings.py). Additionally, I don’t have to jump through hoops to get static media and content served up via the Django development server.


PEP 370 — Per user site-packages, and environment stew

July 19th, 2009 § 16 comments § permalink

So, following up from my hard-hitting rant on the subject of packaging a portable Python version (without hardcoded shebang lines) for OS X, and later cutting over to a kickstart-based virtualenv setup, I thought I’d dig into PEP 370 “a bit”, as someone pointed out to me that this might just cure some of the heartburn.

I put “a bit” in quotes for a rea­son — PEP 370 itself was prob­a­bly one of the sim­plest dis­cus­sions around a fea­ture on python-dev. It came in on the 2.6-and-forward boat last year. It’s also only about 2–3 pages long, depend­ing on your font size.

The idea is this — when you run python2.6/3.0 (from now on, I’m stick­ing with 2.6) you will get a ~/.local direc­tory (for those “not in the know” — ~ is your home direc­tory, e.g. /Users/jesse on OS/X).

This direc­tory is laid out like this:

.local/
    bin/
    lib/
        pythonX.X (wherein X.X is the version number)
            site-packages

Distutils was modified to support the --user argument. This means you can run “python setup.py install --user” and your .local directory will get populated with the delicious nougat payload of the app.
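If you want to see where your own interpreter thinks the per-user directories live, the site module will tell you. A quick sketch, assuming Python 2.7 or 3.2 and later, where site.getuserbase() and site.getusersitepackages() exist (on 2.6 you would read site.USER_BASE and site.USER_SITE instead). The exact paths vary by platform and build, so do not expect ~/.local everywhere:

```python
import site

# Per-user base directory from PEP 370.
user_base = site.getuserbase()
# Per-user site-packages directory, which lives somewhere under the user base.
user_site = site.getusersitepackages()

print("user base:", user_base)
print("user site:", user_site)
# True, False, or None depending on interpreter flags and environment.
print("user site enabled:", site.ENABLE_USER_SITE)
```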

pip sup­ports this just fine, for example:

zim:~ jesse$ /Library/Frameworks/Python.framework/Versions/2.6/bin/pip install \
--install-option="--user" yolk

Downloading/unpacking yolk
  Downloading yolk-0.4.1.tar.gz (80Kb): 80Kb downloaded
  Running setup.py egg_info for package yolk
Installing collected packages: setuptools, yolk
  Running setup.py install for yolk
    Installing yolk script to /Users/jesse/.local/bin
Successfully installed yolk

Hooray! Look! Files!

zim:~ jesse$ ls -lr .local/
total 0
drwxr-xr-x@ 6 jesse  jesse  204 Mar 31 18:35 lib
drwxrwxr-x  3 jesse  jesse  102 Jul 18 22:09 bin
zim:~ jesse$ ls -lr .local/lib/python2.6/site-packages/
total 0
drwxrwxr-x   9 jesse  jesse  306 Jul 18 22:09 yolk-0.4.1-py2.6.egg-info
drwxrwxr-x  17 jesse  jesse  578 Jul 18 22:09 yolk
zim:~ jesse$ ls -lr .local/bin/
total 8
-rwxr-xr-x  1 jesse  jesse  323 Jul 18 22:09 yolk
zim:~ jesse$

Yes, this means yolk is now installed into my local direc­tory — not the global direc­tory. I can also add .local/bin to my PATH and gain access to the yolk binary. This is a huge step for­ward. Oh, wait. There’s only one yolk binary:

zim:~ jesse$ cat .local/bin/yolk
#!/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
# EASY-INSTALL-ENTRY-SCRIPT: 'yolk==0.4.1','console_scripts','yolk'
__requires__ = 'yolk==0.4.1'
import sys
from pkg_resources import load_entry_point

sys.exit(
   load_entry_point('yolk==0.4.1', 'console_scripts', 'yolk')()
)

Hmm. As you can see, the hardcoded shebang line is there; it’s a distutils thing. But this means if I have 3.x installed (and 2.7, and 3.1) and I install yolk into any of those, the yolk binary will get overwritten and have a hardcoded shebang line for the last-installed version.

By default, some pack­ages will also lay down scripts which include the ver­sion num­ber, for example:

-rwxr-xr-x   1 jesse  jesse   357B Jul 19 21:44 easy_install
-rwxr-xr-x   1 jesse  jesse   365B Jul 19 21:44 easy_install-2.6
-rwxr-xr-x   1 jesse  jesse   386B Jul 19 21:40 easy_install-3.1

In this example, the unversioned script is last-one-wins: whichever version was installed most recently owns it. Here I installed the Python 3.1 version and then the 2.6 version, so if you look in easy_install, you’ll see that it points to the 2.6 version. Sure, I have version-specific names as well, but good luck remembering they’re there (I always forget), and they’re not symlinks.
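Since I always forget which interpreter owns which script, here is a rough sketch that reads the shebang out of everything in a bin directory. The function names are hypothetical and it assumes plain script files, nothing fancier:

```python
import os


def shebang_of(path):
    """Return the interpreter line of a script, or None if it has no shebang."""
    with open(path, "rb") as handle:
        first = handle.readline()
    if not first.startswith(b"#!"):
        return None
    return first[2:].strip().decode("utf-8", "replace")


def interpreters_in(bin_dir):
    """Map each regular file in bin_dir to the interpreter its shebang names."""
    result = {}
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path):
            result[name] = shebang_of(path)
    return result
```

Pointing interpreters_in at os.path.expanduser("~/.local/bin") shows at a glance which Python the unversioned easy_install currently belongs to.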

I think a bet­ter way of man­ag­ing this (and I’m shoot­ing this to python-ideas) is to move the bin direc­tory under a match­ing python ver­sion direc­tory. So that way it mir­rors .local/lib/pythonx.x. You would get a .local/bin/pythonx.x direc­tory as well, and wouldn’t need to worry about con­flicts. Or we just ditch the ver­sions with­out the ver­sion num­ber in them alto­gether. (link to python-ideas thread)

In any case, this is great for the simple case: you don’t need to install into the global site-packages directory any longer. You just pass in --user to all of the install scripts, for example:

  • python setup.py install --user (run from FooPackage’s source directory)
  • pip install --install-option="--user" FooPackage

Notice easy_install isn’t here: that’s because it doesn’t pass the --user option through to distutils, favoring the setuptools method of doing things. That’s lamesauce, but setuptools/easy_install also pre-dates PEP 370, so we’ll just skip past that.

Alright — so, a per-user site-packages direc­tory, minus some binary issues — well, when pok­ing around I sus­pected there might be some other un-versioned high level direc­to­ries, so I went dig­ging for a pack­age on pypi which had a mil­lion depen­den­cies — or more than one.

zim:~ jesse$ /Library/Frameworks/Python.framework/Versions/2.6/bin/pip install \
--install-option="--user" paver-templates

  Running setup.py install for Sphinx
  Running setup.py install for paver-templates
  Running setup.py install for Paver
  Running setup.py install for PasteDeploy
  Running setup.py install for docutils
  Running setup.py install for Pygments
  Running setup.py install for Jinja2
  Running setup.py install for Cheetah
  Running setup.py install for Paste
Successfully installed paver-templates

I abbre­vi­ated the out­put a bit — so 8 depen­den­cies in total, which resulted in a large increase of “stuff” in the .local/lib/python2.6/site-packages direc­tory — but also in a new .local/docs directory:

zim:~ jesse$ ls -lah .local/docs/
total 1096
drwxrwxr-x  17 jesse  jesse   578B Jul 19 14:12 .
drwxr-xr-x@  5 jesse  jesse   170B Jul 19 14:12 ..
-rw-rw-r--   1 jesse  jesse   125K Jul 19 14:12 api.html
-rw-rw-r--   1 jesse  jesse   7.2K Jul 19 14:12 changelog.html
-rw-rw-r--   1 jesse  jesse    99K Jul 19 14:12 extensions.html
-rw-rw-r--   1 jesse  jesse    13K Jul 19 14:12 faq.html
...snip...

More top-level un-versioned stuff, which will again con­flict if I go and install this in say, python3.1. The same issue could arise with any data files stored in the top-level (although most of the pack­ages plop them into site-packages with the code, which is the cor­rect way to do it).

So where does this leave us? Well, first off, I would say this — this is a huge improve­ment over the old site-packages method. Huge. Mas­sive. Why? Even with the ver­sion­ing issues I’ve sort of harped on above, this is sim­ply a bet­ter way to install and man­age pack­ages a user needs.

That being said, installing into the user’s local site-packages should be the default deployment method in distutils: rather than needing to pass in --user, we should pass in the inverse, --global. I know this is flamebait, but really, in a world where more and more operating-system-critical things are being written in Python and using the installed framework (see Fedora as a prime example), it’s really not smart to go mucking around in the global bin directories, or the global site-packages.

I’d also make the argument that even the .local structure outlined in PEP 370 doesn’t remove/replace the need for something like virtualenv. Here’s why.

Running my experiments for this, I managed to add 38 directories and files into my .local/lib/python2.6 directory. This includes packages, .pth files, egg-info directories, and actual package code directories. What if I just wanted to use it for a single application? How do I deal with some apps or packages which want specific versions? Now, instead of running “sudo rm -rf /Library/…/site-packages/xxx” I can easily run “rm -rf ~/.local/lib/python2.6/xxx”, but that still means treating .local/lib/xxx like a bonsai tree.

I’d rather treat it like my girl­friends used me as a teenager; spin it up and then drop it off in the bad part of town never to be heard from again. Mean­ing, build it, install it, delete it.

Not to mention, something like virtualenv (and its integration with pip, or is it pip’s integration with virtualenv?) offers additional niceties above and beyond the use-it-and-delete-it use case. You can build an isolated environment, and then run pip over it to generate a bundle, or a requirements file, which you can then share with other people (for example).

It also allows me to keep things com­part­men­tal­ized in a near OCD-level. Now, I could do this with the fea­tures in PEP 370, sort of. It sup­ports the PYTHONUSERBASE envi­ron­ment vari­able, which means you could make a tree like this:

.local/
    app1/
        bin/
        lib/
            python2.6/...

And then write a quick bash func­tion to say “switch PYTHONUSERBASE to .local/app1” — if that’s what floats your boat (and swaps scripts-without-versions to sym­links so you can count on it point­ing to the right ver­sion). But why not use some­thing which does this for you, like vir­tualenv? It also iso­lates the inter­preter itself, not just the pack­ages you want.
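To convince yourself that PYTHONUSERBASE really does move the whole tree, you can ask a fresh interpreter what it sees. A small sketch, assuming Python 2.7 or 3.2 and later for site.getuserbase(); the helper name is made up and the app1 directory is the hypothetical one from the tree above:

```python
import os
import subprocess
import sys


def user_base_with(base):
    """Ask a fresh interpreter for its user base with PYTHONUSERBASE set to base."""
    env = dict(os.environ, PYTHONUSERBASE=base)
    out = subprocess.check_output(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
    )
    return out.decode().strip()


# Hypothetical per-application tree, as in the layout above.
app_base = os.path.join(os.path.expanduser("~"), ".local", "app1")
print(user_base_with(app_base))
```

The child process reports whatever directory you handed it, and anything installed with --user under that environment lands in that tree instead of plain ~/.local.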

And it works with the fea­tures of PEP 370. Mean­ing, if you cre­ate a vir­tualenv, it will still load the .local direc­tory when you load that vir­tualenv. How­ever, while some might find this desir­able, I don’t, and not in the “clint-eastwood-in-gran-torino-get-off-my-lawn” way. Also add the fact that if .local is exposed in the vir­tualenv, you’ll still lack access to the scripts out­side the vir­tualenv (more on this in a moment). I end up dis­abling .local load­ing in the inter­preter by export­ing PYTHONNOUSERSITE (see the pep) within vir­tualen­vwrap­per when­ever I call “workon” for a given environment.
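Here is roughly what that PYTHONNOUSERSITE export buys you, sketched as a check you can run yourself. The helper name is hypothetical; site.ENABLE_USER_SITE is the flag from the PEP:

```python
import os
import subprocess
import sys


def user_site_enabled(disable):
    """Report site.ENABLE_USER_SITE from a fresh interpreter, as a string."""
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)
    if disable:
        # Any non-empty value switches the per-user site directory off.
        env["PYTHONNOUSERSITE"] = "1"
    out = subprocess.check_output(
        [sys.executable, "-c", "import site; print(site.ENABLE_USER_SITE)"],
        env=env,
    )
    return out.decode().strip()


print("with PYTHONNOUSERSITE:", user_site_enabled(True))
print("without it:", user_site_enabled(False))
```

With the variable set, the child interpreter reports False. Without it, the answer depends on how and where the interpreter is running, so I would not rely on a particular value there.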

Right now, if you run “virtualenv --no-site-packages flubber” you (purposefully) sandbox yourself away from the global site-packages directory. You do not, however, get the option to omit the .local directory (yes, I’m going to file a bug; I’m up to two or three to file so far). If I want a sandbox, I want a sandbox. It’s like owning cats: you want them to go in the litter box, not the litter box plus a five-foot radius.

Also, using vir­tualenv com­part­men­tal­izes installed bina­ries. Mean­ing if I make “flub­ber” and install say, pylint into it, the pylint bina­ries stick to that vir­tualenv. And therein lies a dif­fer­ent catch.

In my other post, I griped about hardcoded paths in the shebang line (#!). This problem is still here; all I’ve done is outline some of the features of the PEP and virtualenv. Let’s talk scenarios. Let’s say I install pylint into my .local directory. Its shebang will point to the Python 2.6 binary I’ve got installed. If I make a virtualenv and try to run pylint on code which depends on a library I’ve sandboxed, it won’t work. Why? Because pylint runs under the interpreter named in its shebang, which can’t see the sandboxed library; you need to reinstall pylint into that virtualenv so it points to that interpreter.

If the shebang line instead used “/usr/bin/env python”, you could sidestep this, as any packages-with-binaries installed into the user directories or the global dirs could just load the interpreter of the virtualenv instead… except… wait for it… the script wouldn’t have its needed libraries in that virtualenv, which is why it has the hardcoded shebang line in the first place (whee!).

Back to square one.

Using vir­tualenv though, you can make a boot­strap script to install com­mon util­i­ties (such as pylint) into the envi­ron­ment dur­ing cre­ation. Look at the after_install hook. So this works around the entire script-outside the sand­box (but you still get things from the .local direc­tory). You can also use the .local ver­sion of pip (should you have it installed) to install a library into a vir­tualenv sand­box.

Here’s where we are. Installing pack­ages into the global direc­to­ries (/usr, site-packages, etc) is con­sid­ered unsan­i­tary and may lead to bad things. So don’t do it — unless you have to, and the times you have to should be rare.

Installing things into your .local directory makes a lot of sense, and is what you should do, especially for things like libraries you want to use. Scripts get dumped (unversioned) into .local/bin. Using a virtualenv on top of all this is still useful and a good way to manage things: you get (mostly) isolated environments, and you can point it at any interpreter and generate an environment for just that version of Python (which is what I do). You can also use it to make sandboxes within sandboxes. For example, I make a “master” python2.6 one, named “python2.6”; inside its directory, I can make a directory named “sandboxes”, install virtualenv within it, and make sub-sandboxes within that.

So, PEP 370 is a great change, and pretty darned useful. It still has some of the drawbacks of the global directories (but makes your life as a user/consumer much easier), but it’s made better (as in the global case) by adding virtualenv on top of it.

For me, I compile Python into its own directory (/Users/jesse/slash) and then make a “master” virtualenv for each version, and end up using that 95% of the time for experimentation/coding/etc. I made a custom bootstrap environment, and a pip requirements file to manhandle the additional things I want in every environment I make.

None of this (PEP 370, virtualenv, etc.) is without its drawbacks, or things I’d like to improve. They’re an improvement on the status quo, and can definitely be made better. Personally, I can’t live without virtualenv and virtualenvwrapper. I don’t think I’d use virtualenv as much without virtualenvwrapper.

For bonus read­ing, check out this email from Tarek describ­ing the con­sumer use-cases, I think it’s a good, suc­cinct outline.

Trapped in python package; send food.

July 17th, 2009 § 14 comments § permalink

So, I (and many others) have lamented packaging issues in Python. Some people are focused on integrating with vendor systems (such as apt (.deb) and yum (.rpm)), while others are concerned with distutils/setuptools/etc.

Still oth­ers (like me, and maybe I’m alone) are trapped in a tween-state. We’re par­tially using ven­dor sys­tems, and par­tially using self-compiled ver­sions of python.

The cardinal “rule” has been not to “touch” the vendor-specific installations of Python (this includes you, Linux). For example, on OS X, any time you run easy_install or pip you install into the global site-packages directory. The same applies when you do the same on Linux, and when you run apt-get install/yum install. Things go into that global, shared directory.

This sucks. Here’s why:

  • Ver­sions. Some appli­ca­tions depend on very spe­cific ver­sions of libraries. This is because the main­tain­ers of the libraries they depend on are bad, and break back­wards compatibility.
  • site-packages becomes a toi­let. Before my near OCD lev­els of clean­li­ness, I checked my system’s site-packages direc­tory — I think all told I had about 250 dif­fer­ent .eggs/packages/modules/etc all lit­tered in there. And .pth files, and half-exploded things with meta­data direc­to­ries. And I think I found a squir­rel in there.
  • “glob­ally” installing things like nose, pip and setup­tools put the binary scripts in /usr, /usr/local and so on. This again causes those direc­to­ries to become a toilet.
  • In some cases, upgrad­ing some­thing out­side of your ven­dor pack­ages — say, some­thing pre-installed into RedHat’s python ver­sion can in fact, break and side-effect the sys­tem as a whole.

So, I guess you could say “system-level site-packages con­sid­ered harm­ful”. Once I real­ized the hor­ri­ble error of my ways, I switched to vir­tualenv/vir­tualen­vwrap­per. This works great for me. But at least on OS/X — some­thing was lacking.

That some­thing was depen­den­cies needed to com­pile some­thing like read­line into python. I could install the read­line egg from pypi and just “work around it”. Or I could install mac­ports (which is bro­ken in many ways) and install the read­line devel­op­ment libraries in there.

Unfor­tu­nately, mac­ports also side effects your sys­tem in unde­sir­able ways. Sud­denly you’re link­ing to things you don’t real­ize, you’ve got things com­piled in you don’t need/want, and so on.

So, what’s a guy sup­posed to do?

Well, since I’m not afraid of compiling things, I built a mini-MacPorts for myself. I made a directory (named “slash”) in my home directory, and compiled things like readline into it. I then point the Python compiles to that directory and move on with my life (I love you, --prefix). After compiling/installing PIL, readline, etc. into this directory as well as a pile of Python versions, and slapping virtualenv on top of it, I was feeling pretty good. I get only what I need, and virtualenv keeps things out of the global directories.

Well. Minus the fact that it’s huge, non portable and it’s sort of a pain in the ass.

Then, I got an itch — I wanted to build a “python mega­pack” — I lov­ingly named it python-kitchensink. My goal was to repeat what I did above, and then offer it as a down­load for peo­ple who want to avoid this pain them­selves on OS/X.

Easy enough. Minus one nit.

You can’t tar the damned thing up. I don’t know if it’s a side effect of distutils/setuptools, but scripts installed into this root were having their #! lines hardcoded to the exact path of the interpreter. This means if you went through all this compilation, and then installed easy_install, and say you did this in “/Users/jesse/myslash”, easy_install would get “#!/Users/jesse/myslash/bin/python2.6” hardcoded into it.

Instead of kitchensink, I should have named it “jesse cusses a lot”.

So, back to square one. Or rather “think about this in the back of my mind, for­get about it and then change to a new job”.

For­get­ting about try­ing to do this for OS/X, I end up need­ing to do some­thing eerily sim­i­lar on Fedora Core. Now, com­pi­la­tion of python with all the bells and whis­tles on Fedora is sim­ple — “yum install xxx-devel” and then just run the compile.

The goal was to make a fully-featured python 2.6 install on FC10, and then boot­strap the user(s) into a vir­tualenv so that noth­ing got plopped into the global directories.

Well — minus the fact fedora core 10 ships with python 2.5. And tools like virtualenv/etc from the yum repos lag behind the ver­sions I want/need. Damnit. Do I stick to RPMs? Do I boot­strap it enough to “just work” and then pip install the rest? What about python2.6? Where are my pants?

There’s another catch: it has to work on *first boot* and there’s no net­work on that first boot.

So, for­get­ting my expe­ri­ences with com­pil­ing all this stuff myself on OS/X, that’s what I do at first. I install all the devel pack­ages, build an RPM which con­sumes a tar­ball I cre­ate, and add it to a local repo, and throw it in the kick­start file which spews out the images.

Oh but wait. The hardcoded #!’s come back and bite me in the ass. The build server compiles things in a temporary directory, and then installs easy_install and all of the other tools into the --prefix’ed Python install. That temp directory is named something like “–TMPxx1341234DFLKJ1341234.xxx.hahaha”. Soooo, I get “#!/–TMPxx1341234DFLKJ1341234.xxx.hahaha/bin/python”. That’s about as useful as a beehive in my toilet.

Easy fix though: just make sure the build­server doesn’t have any­thing in the even­tual loca­tion of the installed ver­sion from the rpm (/opt/lazercats (ok, not really)) and just com­pile every­thing there.

Suc­cess, and win. Heck, I even get it to boot­strap vir­tualenvs for the users. Then I find out I’ve increased the image size by 40 or so megabytes. This imme­di­ately wipes the grin off my face and makes me real­ize I have again, failed. You see, I can’t freely increase the image size like that.

I need Python 2.6. So, step one is to swap to FC11. Ok, good. I also want to avoid using the lag-behind vendor packages except for the bare minimum footprint I need to bootstrap the environment. This means modifying the kickstart packages list like this (note: I also cannot install a compiler, which is needed for a lot of packages):

# Python utilities
# python-lxml is == 2mb
python-lxml
python-setuptools
python-crypto
python-paramiko
python-pycurl
# Needed for virtualenv < 1.0 mb
python-devel
python-setuptools-devel

Why on earth is python-devel needed for vir­tualenv? Why python-setuptools-devel? Whyyyy??!
Ok, so I’m only going to be stuck with upstream ver­sions of lxml, setup­tools (which hasn’t revved since the earth cooled) and a few oth­ers. Fine.

I then jump into kick­start file and pop in:

%post --nochroot
cp python-dependencies.txt $INSTALL_ROOT/root/python-dependencies.txt
%post
%include post.txt
%end

In post.txt:

# Python environment setup

# Temporarily make DNS work
echo "nameserver 10.1.1.10" >/etc/resolv.conf

# Python environment setup
( cd /root
    /usr/bin/easy_install virtualenv
    /usr/bin/easy_install virtualenvwrapper
    /usr/bin/virtualenv /opt/thatthing
    /opt/thatthing/bin/easy_install pip
    /opt/thatthing/bin/pip -E /opt/thatthing install -r /root/python-dependencies.txt
    rm -rf build/ python-dependencies.txt
    echo "export WORKON_HOME=/opt" >>/home/jnoller/.bash_profile
    echo "source /usr/bin/virtualenvwrapper_bashrc" >>/home/jnoller/.bash_profile
)
rm -f /etc/resolv.conf

# End Python setup

The python-dependencies.txt is a pip require­ments file and looks like this:

# use pip install -r


# http://code.google.com/p/boto/
boto

# http://docs.fabfile.org/0.9/
fabric

# http://ipython.scipy.org/moin/
ipython

# http://tools.assembla.com/yolk
yolk

# http://code.google.com/p/httplib2/
httplib2

# http://ipaddr-py.googlecode.com

http://ipaddr-py.googlecode.com/files/ipaddr-1.1.1.tar.gz

Note, I also can’t plop svn, hg, git, etc. in here, so packages that are not on the Cheeseshop or not packaged right are a no-go.

The trick here is that the %post com­mands in the kick­start envi­ron­ment run in a chroot of the OS being cre­ated. This means, once the new image is loaded (say, in EC2) I can ssh in, and hit “workon thatthing”. In real­ity, the WORKON dir should be else­where, but I’m going to let users over­ride that. As it is, the “one true python” ver­sion is the one in /opt — no one (even me) gets to touch the sys­tem ver­sion of python.

I now have a python envi­ron­ment, avail­able on first boot, iso­lated from the OS-provided one. I can spawn infi­nitely more vir­tualenvs and play all day long. The few global things I have are easy_install and some libraries which I hope I don’t need to rev myself.

I still haven’t licked the OS/X part. I’m prob­a­bly just going to have to com­pile the barest pos­si­ble envi­ron­ment in some­thing like /opt/python-ks and go from there. Given I’d need to com­pile all of the depen­den­cies into it (such as read­line) I may just end up writ­ing a big script to grab all the bits and then com­pile it into a loca­tion the user pro­vides. The nice thing is that once I boot­strap python and vir­tualenv into the basic tree, I can use pip bundles/requirements files to pull in the rest.

All told, I sit here look­ing at the mess I’ve slogged through — and then I real­ize the entire python-packaging dis­cus­sion on python-dev just exposes a whole ‘nother can of worms. Ver­sion­ing in a sin­gle site-packages direc­tory, how app devel­op­ers con­flict with OS ven­dors, etc. It’s a mess. OS Ven­dors lag behind devel­oper released ver­sions, and come to depend on what’s installed there (have you ever bro­ken yum on a Fedora box? I have.).

I hope Tarek gets a chance to clean a lot of this up — and while I’m against “every­thing and the kitchen sink” in the stdlib — hav­ing some method/API of build­ing out “an official-like” vir­tualenv setup (maybe mak­ing virtualenv’s life eas­ier) would be nice.

Edit to add: I real­ize that hard­cod­ing the she­bang line is desir­able in many cases, the obvi­ous rea­son is that you need to be pointed at the inter­preter which has your dependencies/libraries in it. Not hav­ing a clear way of alter­ing that behav­ior (other than a “clever” sed script) is unfortunate.
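For what it’s worth, the “clever” sed script is only a few lines of Python too. This is a sketch of the idea and not something distutils or setuptools provides; the function name is made up, and it only touches files that already start with a shebang:

```python
def rewrite_shebang(path, interpreter):
    """Point the script at `interpreter`; return True if the file was changed."""
    with open(path, "rb") as handle:
        lines = handle.readlines()
    if not lines or not lines[0].startswith(b"#!"):
        return False  # not a script with a shebang, so leave it alone
    lines[0] = b"#!" + interpreter.encode("utf-8") + b"\n"
    # Rewriting in place keeps the existing permissions (the executable bit).
    with open(path, "wb") as handle:
        handle.writelines(lines)
    return True
```

Keep the caveat above in mind: repointing a script only helps if the interpreter you point it at can actually import the script’s dependencies.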

See this fol­lowup as well

Where am I?

You are currently viewing the archives for July, 2009 at jessenoller.com.