July 24th, 2009 § § permalink
So, I’m one of those people who doesn’t like running things “too far” from what a production setup might look like (I code on OS X, deploy to Linux). This is why I jump(ed) through various hoops on my OS X system to get Apache/Django/mod_wsgi/etc. all up and running and happy (not for serving the site; just developing).
Since I like simple/succinct guides, I thought I’d post what I did so others can follow in my footsteps.
Note: These instructions work with the Python 2.5 version which ships with Leopard, or a self-compiled version of 2.6 (which is what I prefer); see the InstallationOnMacOSX mod_wsgi page. Additionally, see the “Missing Code For Architecture” section for possible work-arounds if you find yourself needing 32 bit execution of Apache; I think the “Forcing 32 Bit Execution” approach is preferable to “thinning” the Apache binary.
First, download and install mod_wsgi. On Leopard, this is as easy as:
curl -o mod_wsgi.tgz http://modwsgi.googlecode.com/files/mod_wsgi-2.5.tar.gz
tar -xzf mod_wsgi.tgz
cd mod_wsgi-2.5
./configure
make
sudo make install
Now, edit (via sudo) /etc/apache2/httpd.conf and add the line:
LoadModule wsgi_module libexec/apache2/mod_wsgi.so
After the rest of the LoadModule lines. Cool.
Invariably all of my directions play with virtualenv/virtualenvwrapper and pip:
mkvirtualenv django
cdvirtualenv
easy_install pip
pip install http://media.djangoproject.com/releases/1.1/Django-1.1-rc-1.tar.gz
django-admin.py startproject mysite
django-admin.py startapp myapp
cd mysite
mkdir apache
mkdir media
Now, that just sets up the skeleton. The meat of the wsgi configuration goes in the mysite/apache directory. The first file is named mysite.wsgi:
import os, sys

# Calculate the path based on the location of the WSGI script.
apache_configuration = os.path.dirname(__file__)
project = os.path.dirname(apache_configuration)
workspace = os.path.dirname(project)
sys.path.append(workspace)

os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
This does the needed wsgi project magic for the Django application — don’t worry about the interpreter path; we’ll do that next.
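If the three os.path.dirname() calls look like hand-waving, here is a minimal sketch of what they resolve to. The helper name workspace_for and the example path are made up for illustration; the logic is just the path calculation from mysite.wsgi pulled into a function.

```python
import os


def workspace_for(wsgi_script):
    """Return the directory that must be on sys.path for `import mysite` to work.

    Mirrors the three dirname() calls in mysite.wsgi:
    script -> apache/ -> mysite/ -> parent of mysite/.
    """
    apache_configuration = os.path.dirname(wsgi_script)
    project = os.path.dirname(apache_configuration)
    return os.path.dirname(project)


# Hypothetical location, matching the example paths used in the Apache config.
print(workspace_for("/Users/jesse/mysite/apache/mysite.wsgi"))  # /Users/jesse
```

The point is that the parent of mysite/ goes on sys.path, which is what lets 'mysite.settings' be imported as a dotted module name.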
Next up is a file named apache_django_wsgi.conf, which looks like this:
# mod_wsgi configuration directives - I like having stdout access, the other two
# options run mod_wsgi in daemon mode - more on this in a minute.
WSGIPythonHome /<path to virtualenv>
WSGIRestrictStdout Off
WSGIDaemonProcess django
WSGIProcessGroup django
#
# This should be the path of the /mysite/media directory
# for example "/Users/jesse/mysite/media/"
#
Alias /site_media/ "<PATH TO>/mysite/media/"
<Directory "<PATH TO>/mysite/media">
Order allow,deny
Options Indexes
Allow from all
IndexOptions FancyIndexing
</Directory>
#
# Directory path to the admin media, for example:
#
Alias /media/ "<PATH TO>/virtualenv/site-packages/django/contrib/admin/media/"
<Directory "<PATH TO>/virtualenv/site-packages/django/contrib/admin/media">
Order allow,deny
Options Indexes
Allow from all
IndexOptions FancyIndexing
</Directory>
#
# Path to the mysite.wsgi file, for example:
# "/Users/jesse/mysite/apache/mysite.wsgi"
#
WSGIScriptAlias / "<PATH TO>/mysite/apache/mysite.wsgi"
<Directory "<PATH TO>/mysite/apache">
Allow from all
</Directory>
The apache_django_wsgi.conf file is the meat-and-potatoes here. It sets up all the paths/permissions, and is in Apache httpd.conf format. You can pretty much jam any Apache configuration directive you like in here.
Your final step is to once again edit (via sudo) /etc/apache2/httpd.conf and add a line like this at the verrrrry bottom:
Include "/path to/mysite/apache/apache_django_wsgi.conf"
And then run “sudo apachectl restart”
You should now be able to hit http://127.0.0.1/ and see the friendly and inviting Django welcome page. Note that if you are using sqlite as your database, you should chmod a+rw the database file, so that processes which are not you can mess with it.
There’s a final piece to this though. Normally, if you run mod_wsgi in embedded mode, you’re going to need to restart apache every single time you make a change to your django app.
Ah! But we’re running in daemon mode. This means all you need to do when you change a file is:
touch mysite/apache/mysite.wsgi
This will trigger a reload and magic happens. Being as lazy as I am (ask my wife), I ended up snagging Bruno Bord’s tdaemon script and hacking it up a bit. The tdaemon script watches a directory and runs tests. I wanted it to watch a directory (and let me filter subdirectories) and then run that touch command, so I reused my watcher.py (here), which I also use to monitor my sphinx tree and run builds (and other stuff). Here’s how I’d use it:
workon django
cdvirtualenv
cd mysite
python ~/.slash/bin/watcher.py --command "touch apache/mysite.wsgi" -f media
This will auto-fire the touch command whenever it detects a file change (including svn updates).
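If you don’t have a watcher script lying around, the idea is simple enough to sketch. This is not my watcher.py and not tdaemon; it is a minimal polling loop, and the names snapshot, touch and watch (and the default of skipping "media") are assumptions made up for this example.

```python
import os
import time


def snapshot(root, skip_dirs=("media",)):
    """Map every .py file under root to its last-modified time."""
    mtimes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune filtered sub-directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.path.getmtime(path)
    return mtimes


def touch(path):
    """Update an existing file's modification time, like the touch command."""
    os.utime(path, None)


def watch(root, wsgi_script, interval=1.0):
    """Poll root forever and touch wsgi_script whenever a .py file changes."""
    before = snapshot(root)
    while True:
        time.sleep(interval)
        after = snapshot(root)
        if after != before:
            touch(wsgi_script)
            before = after
```

Calling watch("mysite", "mysite/apache/mysite.wsgi") from the virtualenv root would then do roughly what the command line above does. Polling once a second is crude, but for a development box it is plenty.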
You can also do this another way
In my rush to reuse a tool I use a bit (watcher) I skipped past the section of the mod_wsgi documentation on code reloading (here), which shows how to set up a monitor that watches for .py file changes and kills the wsgi daemon. If you scroll down a bit, you’ll see the “Monitoring For Code Changes” section. All you need to do is copy the code from the wiki into a module on your PYTHONPATH. In my case, I wrote it to mysite/apache/wsgi_monitor.py (just for this example! you should put it someplace else!) and then changed the mysite.wsgi file to import it and set it up:
import os, sys

# Calculate the path based on the location of the WSGI script.
apache_configuration = os.path.dirname(__file__)
project = os.path.dirname(apache_configuration)
workspace = os.path.dirname(project)
sys.path.append(workspace)
sys.path.append(apache_configuration)  # you probably shouldn't do this.

import wsgi_monitor
wsgi_monitor.start(interval=1.0)

os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
This method — once you reload apache — will watch the project for changes and then kill the wsgi daemon (forces a reload). So there you go — two ways of doing it.
The nice thing about this setup is that I can make production versions of the wsgi scripts and check them in, but keep local “my copies” (à la local_settings.py). Additionally, I don’t have to jump through hoops to get static media and content served up via the Django development server.
July 19th, 2009 § § permalink
So, following up from my hard-hitting rant on the subject of dealing with packaging a portable python version (without hardcoded shebang lines) for OS/X, and later cutting over to a kickstart based virtualenv setup, I thought I’d dig into PEP 370 “a bit” as someone pointed out to me this might just cure some of the heart burn.
I put “a bit” in quotes for a reason — PEP 370 itself was probably one of the simplest discussions around a feature on python-dev. It came in on the 2.6-and-forward boat last year. It’s also only about 2–3 pages long, depending on your font size.
The idea is this — when you run python2.6/3.0 (from now on, I’m sticking with 2.6) you will get a ~/.local directory (for those “not in the know” — ~ is your home directory, e.g. /Users/jesse on OS/X).
This directory is laid out like this:
.local/
    bin/
    lib/
        pythonX.X/ (wherein X.X is the version number)
            site-packages/
Distutils was modified to support the --user argument. This means you can run “python setup.py install --user” and your .local directory will get populated with the delicious nougat payload of the app.
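You can also ask the interpreter where it thinks these directories live. A minimal sketch using the standard site module follows; the helper name user_install_dirs is made up here. Note that site.getuserbase() and site.getusersitepackages() arrived in Python 2.7, so on 2.6 you would read site.USER_BASE and site.USER_SITE instead.

```python
import site
import sys


def user_install_dirs():
    """Return the per-user base and site-packages directories (PEP 370)."""
    return {
        "base": site.getuserbase(),
        "site_packages": site.getusersitepackages(),
        # True only when the interpreter will actually load the user site.
        "enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in sorted(user_install_dirs().items()):
        sys.stdout.write("%s: %s\n" % (key, value))
```

The exact paths depend on the platform and on how your Python was built, so treat the layout above as the common case rather than a guarantee.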
pip supports this just fine, for example:
zim:~ jesse$ /Library/Frameworks/Python.framework/Versions/2.6/bin/pip install \
--install-option="--user" yolk
Downloading/unpacking yolk
Downloading yolk-0.4.1.tar.gz (80Kb): 80Kb downloaded
Running setup.py egg_info for package yolk
Installing collected packages: setuptools, yolk
Running setup.py install for yolk
Installing yolk script to /Users/jesse/.local/bin
Successfully installed yolk
Hooray! Look! Files!
zim:~ jesse$ ls -lr .local/
total 0
drwxr-xr-x@ 6 jesse jesse 204 Mar 31 18:35 lib
drwxrwxr-x 3 jesse jesse 102 Jul 18 22:09 bin
zim:~ jesse$ ls -lr .local/lib/python2.6/site-packages/
total 0
drwxrwxr-x 9 jesse jesse 306 Jul 18 22:09 yolk-0.4.1-py2.6.egg-info
drwxrwxr-x 17 jesse jesse 578 Jul 18 22:09 yolk
zim:~ jesse$ ls -lr .local/bin/
total 8
-rwxr-xr-x 1 jesse jesse 323 Jul 18 22:09 yolk
zim:~ jesse$
Yes, this means yolk is now installed into my local directory — not the global directory. I can also add .local/bin to my PATH and gain access to the yolk binary. This is a huge step forward. Oh, wait. There’s only one yolk binary:
zim:~ jesse$ cat .local/bin/yolk
#!/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
# EASY-INSTALL-ENTRY-SCRIPT: 'yolk==0.4.1','console_scripts','yolk'
__requires__ = 'yolk==0.4.1'
import sys
from pkg_resources import load_entry_point
sys.exit(
load_entry_point('yolk==0.4.1', 'console_scripts', 'yolk')()
)
Hmm. As you can see, the hardcoded shebang line is there; it’s a distutils thing. But this means if I have 3.x installed (and 2.7, and 3.1) and I install yolk into any of those, the yolk binary will get overwritten and have a hardcoded shebang line for the last-installed version.
By default, some packages will also lay down scripts which include the version number, for example:
-rwxr-xr-x 1 jesse jesse 357B Jul 19 21:44 easy_install
-rwxr-xr-x 1 jesse jesse 365B Jul 19 21:44 easy_install-2.6
-rwxr-xr-x 1 jesse jesse 386B Jul 19 21:40 easy_install-3.1
In this example, the last install wins: whichever Python version installed the package most recently owns the unversioned script. I installed the Python 3.1 version first, and then the 2.6 version. If you look in easy_install, you’ll see that it points to the 2.6 version. Sure, I have version-specific names as well, but good luck remembering they’re there (I always forget), and they’re not symlinks.
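If you want to see which interpreter each script is pinned to without cat-ing them one by one, a few lines of Python will do it. This is only a sketch; shebang_interpreter and report are names I made up for the example.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter named on a script's #! line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].strip().decode("utf-8")


def report(bin_dir):
    """Print each script in bin_dir next to the interpreter it is pinned to."""
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path):
            print("%-20s %s" % (name, shebang_interpreter(path)))
```

Running report(os.path.expanduser("~/.local/bin")) should make it obvious which Python version installed each unversioned script last.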
I think a better way of managing this (and I’m shooting this to python-ideas) is to move the bin directory under a matching python version directory. So that way it mirrors .local/lib/pythonx.x. You would get a .local/bin/pythonx.x directory as well, and wouldn’t need to worry about conflicts. Or we just ditch the versions without the version number in them altogether. (link to python-ideas thread)
In any case, this is great for the simple case: you don’t need to install into the global site-packages directory any longer. You just pass in --user to all of the install scripts, for example:
- python setup.py install --user (run from the FooPackage source directory)
- pip install --install-option="--user" FooPackage
Notice easy_install isn’t here: that’s because it doesn’t allow the pass-through of the --user option to distutils, favoring setuptools’ method of doing things. That’s lamesauce, but setuptools/easy_install also pre-dates PEP 370, so we’ll just skip past that.
Alright, so we have a per-user site-packages directory, minus some binary issues. When poking around I suspected there might be some other un-versioned top-level directories, so I went digging for a package on PyPI which had a million dependencies (or at least more than one).
zim:~ jesse$ /Library/Frameworks/Python.framework/Versions/2.6/bin/pip install \
--install-option="--user" paver-templates
Running setup.py install for Sphinx
Running setup.py install for paver-templates
Running setup.py install for Paver
Running setup.py install for PasteDeploy
Running setup.py install for docutils
Running setup.py install for Pygments
Running setup.py install for Jinja2
Running setup.py install for Cheetah
Running setup.py install for Paste
Successfully installed paver-templates
I abbreviated the output a bit — so 8 dependencies in total, which resulted in a large increase of “stuff” in the .local/lib/python2.6/site-packages directory — but also in a new .local/docs directory:
zim:~ jesse$ ls -lah .local/docs/
total 1096
drwxrwxr-x 17 jesse jesse 578B Jul 19 14:12 .
drwxr-xr-x@ 5 jesse jesse 170B Jul 19 14:12 ..
-rw-rw-r-- 1 jesse jesse 125K Jul 19 14:12 api.html
-rw-rw-r-- 1 jesse jesse 7.2K Jul 19 14:12 changelog.html
-rw-rw-r-- 1 jesse jesse 99K Jul 19 14:12 extensions.html
-rw-rw-r-- 1 jesse jesse 13K Jul 19 14:12 faq.html
...snip...

More top-level un-versioned stuff, which will again conflict if I go and install this in say, python3.1. The same issue could arise with any data files stored in the top-level (although most of the packages plop them into site-packages with the code, which is the correct way to do it).
So where does this leave us? Well, first off, I would say this — this is a huge improvement over the old site-packages method. Huge. Massive. Why? Even with the versioning issues I’ve sort of harped on above, this is simply a better way to install and manage packages a user needs.
That being said, installing into the user’s local site-packages should be the default deployment method in distutils: rather than needing to pass in --user, we should pass in the inverse, --global. I know this is flamebait, but really, in a world where more and more operating system critical things are being written in Python and using the installed framework (see Fedora as a prime example), it’s really not smart to go mucking around in the global bin directories, or the global site-packages.
I’d also make the argument that even the .local structure outlined in PEP 370 doesn’t remove/replace the need for something like virtualenv. Here’s why.
Running my experiments for this, I managed to add 38 directories and files into my .local/lib/python2.6 directory. This includes packages, .pth files, egg-info directories, and actual package code directories. What if I just wanted to use it for a single application? How do I deal with some apps or packages which want specific versions? Now, instead of running “sudo rm -rf /Library/…/site-packages/xxx” I can easily run “rm -rf ~/.local/lib/python2.6/xxx”, but that’s still equivalent to needing to treat .local/lib/xxx like a bonsai tree.
I’d rather treat it like my girlfriends used me as a teenager; spin it up and then drop it off in the bad part of town never to be heard from again. Meaning, build it, install it, delete it.
Not to mention, something like virtualenv (and its integration with pip, or is it pip’s integration with virtualenv?) offers additional niceties above and beyond the use-it-and-delete-it use case. You can build an isolated environment, and then run pip over it to generate a bundle, or requirements file, which you can then share with other people (for example).
It also allows me to keep things compartmentalized in a near OCD-level. Now, I could do this with the features in PEP 370, sort of. It supports the PYTHONUSERBASE environment variable, which means you could make a tree like this:
.local/
    app1/
        bin/
        lib/
            python2.6/...
And then write a quick bash function to say “switch PYTHONUSERBASE to .local/app1” — if that’s what floats your boat (and swaps scripts-without-versions to symlinks so you can count on it pointing to the right version). But why not use something which does this for you, like virtualenv? It also isolates the interpreter itself, not just the packages you want.
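To convince yourself that PYTHONUSERBASE really does move the user base, here is a small sketch that starts a child interpreter with the variable set and asks it what it sees. The helper name user_base_under and the app1 path are assumptions for illustration; site.getuserbase() is the standard library call (Python 2.7 and later).

```python
import os
import subprocess
import sys


def user_base_under(base_dir):
    """Ask a child interpreter which user base it sees with PYTHONUSERBASE set."""
    env = dict(os.environ, PYTHONUSERBASE=base_dir)
    output = subprocess.check_output(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
    )
    return output.decode().strip()


if __name__ == "__main__":
    # Hypothetical per-application tree, as in the layout sketched above.
    print(user_base_under(os.path.expanduser("~/.local/app1")))
```

This only shows that the switch works; you would still be on your own for managing the scripts in each tree's bin directory.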

And it works with the features of PEP 370. Meaning, if you create a virtualenv, it will still load the .local directory when you load that virtualenv. However, while some might find this desirable, I don’t, and not in the “clint-eastwood-in-gran-torino-get-off-my-lawn” way. Also add the fact that if .local is exposed in the virtualenv, you’ll still lack access to the scripts outside the virtualenv (more on this in a moment). I end up disabling .local loading in the interpreter by exporting PYTHONNOUSERSITE (see the pep) within virtualenvwrapper whenever I call “workon” for a given environment.
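For the curious, here is a rough sketch for checking that exporting PYTHONNOUSERSITE actually turns off .local loading. It starts a child interpreter and reads sys.flags.no_user_site; the helper name user_site_disabled is made up for this example.

```python
import os
import subprocess
import sys


def user_site_disabled(extra_env):
    """Return True if a child interpreter started with extra_env skips the user site."""
    env = dict(os.environ)
    env.update(extra_env)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.flags.no_user_site)"],
        env=env,
    )
    return output.decode().strip() == "1"


if __name__ == "__main__":
    print(user_site_disabled({"PYTHONNOUSERSITE": "1"}))
```

Any non-empty value works for the variable; "1" is just a convention.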
Right now, if you run “virtualenv --no-site-packages flubber” you (purposefully) sandbox yourself away from the global site-packages directory. You do not, however, get the option to omit the .local directory (yes, I’m going to file a bug; I’m up to two or three to file so far). If I want a sandbox, I want a sandbox. It’s like owning cats: you want them to go in the litter box, not the litter box + a five foot radius.
Also, using virtualenv compartmentalizes installed binaries. Meaning if I make “flubber” and install say, pylint into it, the pylint binaries stick to that virtualenv. And therein lies a different catch.
In my other post, I griped about hardcoded paths in the shebang line (#!). This problem is still here; all I’ve done is outline some of the features of the PEP and virtualenv. Let’s talk scenarios. Let’s say I install pylint into my .local directory. Its shebang will point to the version of the Python 2.6 binary I’ve got installed. If I make a virtualenv and try to run pylint on code which depends on a library I’ve sandboxed, it won’t work. Why? Because you need to reinstall it into that virtualenv, so it can point to that interpreter.
If the shebang line instead used “/usr/bin/env python”, you could sidestep this, as any packages-with-binaries installed into the user directories or the global dirs could just load the interpreter of the virtualenv instead… except… wait for it… it wouldn’t have its needed libraries in that virtualenv, which is why it has the hardcoded shebang line in the first place (whee!).
Back to square one.
Using virtualenv though, you can make a bootstrap script to install common utilities (such as pylint) into the environment during creation. Look at the after_install hook. This works around the whole script-outside-the-sandbox problem (but you still get things from the .local directory). You can also use the .local version of pip (should you have it installed) to install a library into a virtualenv sandbox.
Here’s where we are. Installing packages into the global directories (/usr, site-packages, etc) is considered unsanitary and may lead to bad things. So don’t do it — unless you have to, and the times you have to should be rare.
Installing things into your .local directory makes a lot of sense, and is what you should do, especially for things like libraries you want to use. Scripts get dumped (unversioned) into .local/bin. Using a virtualenv on top of all this is still useful and a good way to manage things: you get (mostly) isolated environments, and you can point it at any interpreter and generate an environment for just that version of Python (which is what I do). You can also use it to make sandboxes within sandboxes. For example, I make a “master” python2.6 one, named “python2.6”; inside its directory, I can make a directory named “sandboxes”, install virtualenv within it, and make sub-sandboxes within that.
So, PEP 370 is a great change, and pretty darned useful. It still has some of the drawbacks of the global directories (but makes your life as a user/consumer much easier), but it’s made better (as in the global case) by adding virtualenv on top of it.
For me, I compile Python into its own directory (/Users/jesse/slash) and then make a “master” virtualenv for each version, and end up using that 95% of the time for experimentation/coding/etc. I made a custom bootstrap environment, and a pip requirements file to manhandle the additional things I want in every environment I make.
None of this (PEP 370, virtualenv, etc.) is without drawbacks or things I’d like to improve, but it’s all an improvement on the status quo, and can definitely be made better. Personally, I can’t live without virtualenv and virtualenvwrapper. I don’t think I’d use virtualenv as much without virtualenvwrapper.
For bonus reading, check out this email from Tarek describing the consumer use-cases; I think it’s a good, succinct outline.
July 17th, 2009 § § permalink
So, I (and many others) have lamented packaging issues in Python. Some people are focused on integrating with vendor systems (such as apt (.deb) and yum (rpm)), while others are concerned with distutils/setuptools/etc.
Still others (like me, and maybe I’m alone) are trapped in a tween-state. We’re partially using vendor systems, and partially using self-compiled versions of python.
The cardinal “rule” has been not to “touch” the vendor-specific installations of Python (this includes you, Linux). For example, on OS X, any time you run easy_install or pip you install into the global site-packages directory. The same applies on Linux, both when you run easy_install or pip and when you run apt-get install or yum install. Things go into that global, shared directory.
This sucks. Here’s why:
- Versions. Some applications depend on very specific versions of libraries. This is because the maintainers of the libraries they depend on are bad, and break backwards compatibility.
- site-packages becomes a toilet. Before my near OCD levels of cleanliness, I checked my system’s site-packages directory — I think all told I had about 250 different .eggs/packages/modules/etc all littered in there. And .pth files, and half-exploded things with metadata directories. And I think I found a squirrel in there.
- “Globally” installing things like nose, pip and setuptools puts the binary scripts in /usr, /usr/local and so on. This again causes those directories to become a toilet.
- In some cases, upgrading something outside of your vendor packages (say, something pre-installed into RedHat’s Python version) can, in fact, break and side-effect the system as a whole.
So, I guess you could say “system-level site-packages considered harmful”. Once I realized the horrible error of my ways, I switched to virtualenv/virtualenvwrapper. This works great for me. But at least on OS/X — something was lacking.
That something was dependencies needed to compile something like readline into python. I could install the readline egg from pypi and just “work around it”. Or I could install macports (which is broken in many ways) and install the readline development libraries in there.
Unfortunately, macports also side effects your system in undesirable ways. Suddenly you’re linking to things you don’t realize, you’ve got things compiled in you don’t need/want, and so on.
So, what’s a guy supposed to do?
Well, since I’m not afraid of compiling things, I built a mini-macports for myself. I made a directory (named “slash”) in my home directory, and compiled things like readline into it. I then point the Python compiles to that directory and move on with my life (I love you, --prefix). After compiling/installing PIL, readline, etc. into this directory as well as a pile of Python versions, and slapping virtualenv on top of it, I was feeling pretty good. I get only what I need, and virtualenv keeps things out of the global directories.
Well. Minus the fact that it’s huge, non portable and it’s sort of a pain in the ass.
Then, I got an itch — I wanted to build a “python megapack” — I lovingly named it python-kitchensink. My goal was to repeat what I did above, and then offer it as a download for people who want to avoid this pain themselves on OS/X.
Easy enough. Minus one nit.
You can’t tar the damned thing up. I don’t know if it’s a side effect of distutils/setuptools, but scripts installed into this root were having their #! lines hardcoded to the exact path of the interpreter. This means if you went through all this compilation and then installed easy_install, and say you did this in “/Users/jesse/myslash”, easy_install would get “#!/Users/jesse/myslash/bin/python2.6” hardcoded into it.
Instead of kitchensink, I should have named it “jesse cusses a lot”.
So, back to square one. Or rather “think about this in the back of my mind, forget about it and then change to a new job”.
Forgetting about trying to do this for OS/X, I end up needing to do something eerily similar on Fedora Core. Now, compilation of python with all the bells and whistles on Fedora is simple — “yum install xxx-devel” and then just run the compile.
The goal was to make a fully-featured python 2.6 install on FC10, and then bootstrap the user(s) into a virtualenv so that nothing got plopped into the global directories.
Well — minus the fact fedora core 10 ships with python 2.5. And tools like virtualenv/etc from the yum repos lag behind the versions I want/need. Damnit. Do I stick to RPMs? Do I bootstrap it enough to “just work” and then pip install the rest? What about python2.6? Where are my pants?
There’s another catch: it has to work on *first boot* and there’s no network on that first boot.
So, forgetting my experiences with compiling all this stuff myself on OS/X, that’s what I do at first. I install all the devel packages, build an RPM which consumes a tarball I create, and add it to a local repo, and throw it in the kickstart file which spews out the images.
Oh but wait. The hardcoded #! lines come back and bite me in the ass. The build server compiles things in a temporary directory, and then installs easy_install and all of the other tools into the --prefix’ed Python install. That temp directory is named something like “–TMPxx1341234DFLKJ1341234.xxx.hahaha”. Soooo, I get “#!/–TMPxx1341234DFLKJ1341234.xxx.hahaha/bin/python”. That’s about as useful as a beehive in my toilet.
Easy fix though: just make sure the buildserver doesn’t have anything in the eventual location of the installed version from the rpm (/opt/lazercats (ok, not really)) and just compile everything there.
Success, and win. Heck, I even get it to bootstrap virtualenvs for the users. Then I find out I’ve increased the image size by 40 or so megabytes. This immediately wipes the grin off my face and makes me realize I have again, failed. You see, I can’t freely increase the image size like that.
I need Python 2.6. So, step one is to swap to FC11. Ok, good. I also want to avoid using the lag-behind vendor packages except for the bare minimum footprint I need to bootstrap the environment. This means modifying the kickstart packages list like this (note: I also cannot install a compiler, which is needed for a lot of packages):
# Python utilities
# python-lxml is == 2mb
python-lxml
python-setuptools
python-crypto
python-paramiko
python-pycurl
# Needed for virtualenv < 1.0 mb
python-devel
python-setuptools-devel
Why on earth is python-devel needed for virtualenv? Why python-setuptools-devel? Whyyyy??!
Ok, so I’m only going to be stuck with upstream versions of lxml, setuptools (which hasn’t revved since the earth cooled) and a few others. Fine.
I then jump into kickstart file and pop in:
%post --nochroot
cp python-dependencies.txt $INSTALL_ROOT/root/python-dependencies.txt
%post
%include post.txt
%end
In post.txt:
# Python environment setup
# Temporarily make DNS work
echo "nameserver 10.1.1.10" >/etc/resolv.conf
# Python environment setup
( cd /root
/usr/bin/easy_install virtualenv
/usr/bin/easy_install virtualenvwrapper
/usr/bin/virtualenv /opt/thatthing
/opt/thatthing/bin/easy_install pip
/opt/thatthing/bin/pip -E /opt/thatthing install -r /root/python-dependencies.txt
rm -rf build/ python-dependencies
echo "export WORKON_HOME=/opt" >>/home/jnoller/.bash_profile
echo "source /usr/bin/virtualenvwrapper_bashrc" >>/home/jnoller/.bash_profile
)
rm -f /etc/resolv.conf
# End Python setup
The python-dependencies.txt is a pip requirements file and looks like this:
# use pip install -r
# http://code.google.com/p/boto/
boto
# http://docs.fabfile.org/0.9/
fabric
# http://ipython.scipy.org/moin/
ipython
# http://tools.assembla.com/yolk
yolk
# http://code.google.com/p/httplib2/
httplib2
# http://ipaddr-py.googlecode.com
http://ipaddr-py.googlecode.com/files/ipaddr-1.1.1.tar.gz
Note, I also can’t plop svn, hg, git, etc. in here, so packages that aren’t on the Cheeseshop or aren’t packaged right are a no-go.
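The requirements file format really is that simple: one requirement (or direct URL) per line, with # comments ignored. As a rough illustration only, and definitely not pip’s actual parser, here is how you might list what such a file would pull in. The function name requirement_lines is made up for this sketch.

```python
def requirement_lines(text):
    """Yield the installable entries of a requirements file, skipping comments."""
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            yield line


sample = """\
# use pip install -r
# http://code.google.com/p/boto/
boto
# http://docs.fabfile.org/0.9/
fabric
http://ipaddr-py.googlecode.com/files/ipaddr-1.1.1.tar.gz
"""

print(list(requirement_lines(sample)))
```

Handy as a sanity check in a build script before the image gets baked with no network to fall back on.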
The trick here is that the %post commands in the kickstart environment run in a chroot of the OS being created. This means, once the new image is loaded (say, in EC2) I can ssh in, and hit “workon thatthing”. In reality, the WORKON dir should be elsewhere, but I’m going to let users override that. As it is, the “one true python” version is the one in /opt — no one (even me) gets to touch the system version of python.
I now have a python environment, available on first boot, isolated from the OS-provided one. I can spawn infinitely more virtualenvs and play all day long. The few global things I have are easy_install and some libraries which I hope I don’t need to rev myself.
I still haven’t licked the OS/X part. I’m probably just going to have to compile the barest possible environment in something like /opt/python-ks and go from there. Given I’d need to compile all of the dependencies into it (such as readline) I may just end up writing a big script to grab all the bits and then compile it into a location the user provides. The nice thing is that once I bootstrap python and virtualenv into the basic tree, I can use pip bundles/requirements files to pull in the rest.
All told, I sit here looking at the mess I’ve slogged through — and then I realize the entire python-packaging discussion on python-dev just exposes a whole ‘nother can of worms. Versioning in a single site-packages directory, how app developers conflict with OS vendors, etc. It’s a mess. OS Vendors lag behind developer released versions, and come to depend on what’s installed there (have you ever broken yum on a Fedora box? I have.).
I hope Tarek gets a chance to clean a lot of this up — and while I’m against “everything and the kitchen sink” in the stdlib — having some method/API of building out “an official-like” virtualenv setup (maybe making virtualenv’s life easier) would be nice.
Edit to add: I realize that hardcoding the shebang line is desirable in many cases, the obvious reason is that you need to be pointed at the interpreter which has your dependencies/libraries in it. Not having a clear way of altering that behavior (other than a “clever” sed script) is unfortunate.
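For what it’s worth, the “clever” sed script can be written in Python too. The sketch below rewrites scripts in a bin directory whose first line is exactly the old hardcoded interpreter. The name rewrite_shebangs and both interpreter paths are assumptions for illustration, and it does nothing about compiled extensions or other places a prefix can get baked in.

```python
import os


def rewrite_shebangs(bin_dir, old_interpreter, new_interpreter):
    """Point scripts pinned to old_interpreter at new_interpreter instead.

    Returns the names of the scripts that were rewritten.
    """
    old_line = ("#!" + old_interpreter).encode("utf-8")
    new_line = ("#!" + new_interpreter).encode("utf-8")
    changed = []
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as handle:
            content = handle.read()
        first, sep, rest = content.partition(b"\n")
        # Only touch files whose first line is exactly the old shebang.
        if first.rstrip() != old_line:
            continue
        with open(path, "wb") as handle:
            handle.write(new_line + sep + rest)
        changed.append(name)
    return changed
```

Something like rewrite_shebangs("/opt/python-ks/bin", "/Users/jesse/myslash/bin/python2.6", "/opt/python-ks/bin/python2.6") is the kind of call I have in mind after untarring a relocated tree.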
See this followup as well