CheeseShopping: python-application, mglob and pickleshare

September 8th, 2007 § 2 comments

Well, it’s been a while since I’ve done one of these run-throughs. One of the RSS feeds I watch (out of 120) is for the Python CheeseShop, which is where a lot of very interesting modules are uploaded by community authors, some of more interest than others.

When looking at modules on the CheeseShop I always keep an eye out for code examples: finding a particularly interesting implementation of something (say, the debug/memory.py module in python-application) always helps me improve my own code and applications.

I like to check out (albeit briefly) and write down notes about modules of interest that I see; I have a backlog of around forty modules I have notes on. This morning, I saw three that piqued my interest. (Note: I started writing this earlier this week and am only now finishing it.)

As a side note: many of these modules can be installed via easy_install.py. I tend not to randomly install modules (and pollute my path!), preferring instead to grab the tarball and poke around in a sandbox/workingenv.py-style environment.

First up is python-application (v 1.0.9) which is, to quote:

This package is a collection of modules that are useful when building python applications. Their purpose is to eliminate the need to divert resources into implementing the small tasks that every application needs to do in order to run successfully and focus instead on the application logic itself.

I snagged this, and there are some excellent code examples and useful tidbits in the package, though I don’t know if I would use the entire thing in a given application. Of particular note was this snippet from application/debug/memory.py:

import gc
def memory_dump():
    print "\nGARBAGE:"
    gc.collect()
    print "\nGARBAGE OBJECTS:"
    for x in gc.garbage:
        s = str(x)
        if len(s) > 80:
            s = s[:77] + '...'
        print "%s\n  %s" % (type(x), s)
gc.enable()
gc.collect() ## Ignore collectable garbage up to this point
gc.set_debug(gc.DEBUG_LEAK)

The module is documented well: all you have to do is import * from the module and then call memory_dump() later. The datatypes.py module in the configuration directory was also a very nice example, as was the process.py module at the top level. I’d suggest taking a look at the package just to learn more. Everyone has their own mise en place, or toolbox so to speak; we all have our own little bits of code we carry from application to application. This package is a good example of the simple, useful things that we always end up doing (and I learned a few tricks too).
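
The snippet above is Python 2 code. For anyone on Python 3, here is a minimal sketch of how I would adapt it. This is my own adaptation, not the package’s code: the limit parameter and the returned list are my additions, and I assume gc.DEBUG_SAVEALL is enough here (instead of gc.DEBUG_LEAK), which keeps the collector from printing its own debug output to stderr.

```python
import gc


def memory_dump(limit=80):
    """Run a collection, print what landed in gc.garbage, and return it.

    `limit` and the return value are additions for this sketch; the
    original snippet only prints.
    """
    print("\nGARBAGE:")
    gc.collect()
    print("\nGARBAGE OBJECTS:")
    report = []
    for obj in gc.garbage:
        summary = str(obj)
        if len(summary) > limit:
            # Truncate long representations so each entry stays readable.
            summary = summary[:limit - 3] + "..."
        print("%s\n  %s" % (type(obj), summary))
        report.append((type(obj), summary))
    return report


gc.enable()
gc.collect()  # Ignore collectable garbage up to this point
# Assumption: DEBUG_SAVEALL keeps every unreachable object in gc.garbage
# instead of freeing it, so memory_dump() has something to show.
gc.set_debug(gc.DEBUG_SAVEALL)
```

Calling memory_dump() after dropping a pair of objects that reference each other should list them, since the collector keeps them in gc.garbage rather than freeing them.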

As a side note, my toolbox code has evolved so rapidly and so drastically since I first started hacking Python that I sometimes wonder if one day my basic if __name__ == "__main__": setup will gain sentience. I’m glad that most programmers like to share, though; if we guarded our tools as closely as BBQ masters guard their rubs and sauces, we’d be in trouble.

Update: The author of both pickleshare and mglob added a comment to this post with some good information, including the fact that both tools are or will be in IPython. He also notes that mglob’s syntax was intentional: “it’s optimized for brevity and convenience” (which is why I found it cryptic). For this particular tool (mglob), Path would not have helped him.

The next one is pickleshare (v0.3). To quote:

PickleShare - a small ‘shelve’ like datastore with concurrency support
Like shelve, a PickleShareDB object acts like a normal dictionary. Unlike shelve, many processes can access the database simultaneously. Changing a value in database is immediately visible to other processes accessing the same database. Concurrency is possible because the values are stored in separate files. Hence the “database” is a directory where all files are governed by PickleShare.

Another quote from the readme:

Version note: this is an early beta version of the module. It has been tested (and works) in both Linux and Windows. This will probably end up as the interactive persistence system for IPython 0.7.2+, to make inter-ipython-session data sharing possible in real time.

This is an interesting module: shared objects or databases in a concurrent system run the risk of deadlocks, data syncing issues and so on. This module aims to sidestep that with a simple file-based workaround. In my (admittedly small) testing it seems to get the job done just fine. The fact that the “database” is written to disk (and is therefore accessible without the pickleshare module itself, and persists across app runs) is quite nice.

Cracking open the pickleshare.py module itself showed some very interesting code (again, teaching more tricks); for more enlightenment, read the test() method [1]. This is a very interesting module, and the class style, PickleShareDB(UserDict.DictMixin), was very useful.
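
To make the “one file per value” idea concrete, here is a rough sketch of a dict-like store backed by a directory. It is not pickleshare’s implementation: DirStore is a hypothetical name, I am assuming keys are plain strings that are valid file names, and I use collections.abc.MutableMapping, which plays the role in Python 3 that UserDict.DictMixin played in Python 2. There is no locking or caching here.

```python
import pickle
from collections.abc import MutableMapping
from pathlib import Path


class DirStore(MutableMapping):
    """Hypothetical dict-like store: one pickle file per key in a directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Assumption: keys are simple strings usable as file names.
        return self.root / key

    def __getitem__(self, key):
        try:
            with open(self._path(key), "rb") as fh:
                return pickle.load(fh)
        except FileNotFoundError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # Each value lives in its own file, so writers of different
        # keys never touch the same file.
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)

    def __delitem__(self, key):
        try:
            self._path(key).unlink()
        except FileNotFoundError:
            raise KeyError(key) from None

    def __iter__(self):
        return (p.name for p in self.root.iterdir() if p.is_file())

    def __len__(self):
        return sum(1 for _ in self)
```

Because every read goes back to disk, a second DirStore pointed at the same directory (even from another process) sees a new value as soon as it has been written. MutableMapping fills in the rest of the dict interface (get, items, update, in and so on) from the five methods above.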

I’d like to play with this module together with the processing module.

And finally, mglob (v0.4), which is:

Usable as stand-alone utility (for xargs, backticks etc.), or as a globbing library for own python programs. Globbing the sys.argv is something that almost every Windows script has to perform manually, and this module is here to help with that task. Also Unix users will benefit from enhanced features such as recursion, exclusion, and directory omission.

I put this in my little ~/toolbox binary dir as soon as I started playing with it. As a command line utility, it’s insanely useful (yes, I also know there are other tools out there like this). The command line syntax is sort of counter-intuitive at first (for example, I wanted to find all of my mp3s):

woot:~/Desktop/Downloads/tmp/mglob-0.4 jesse$ python mglob.py rec:/Users/=*.mp3

The syntax is rec: (recursive), /Users/ (directory to traverse), =*.mp3 (files to find). This is of course explained in the help functions of the script. Using this in a Python application is also sort of cryptic, but matches the command line:

from mglob import expand
expand("rec:/Users/=*.mp3")

You get a full list back as the result of the glob, which would be a serious problem for massive recursive globs (an option to use a generator and yield results would be nice). I like the command-line usage, but for pure in-code globbing, things like Jason Orendorff’s Path module and other alternatives just feel better API-wise. [2]
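
As an illustration of the generator idea, here is a small sketch built only on the standard library (os.walk and fnmatch). It is not part of mglob and does not understand mglob’s rec:dir=pattern syntax; iter_glob is a hypothetical helper name.

```python
import fnmatch
import os


def iter_glob(root, pattern):
    """Lazily yield paths under `root` whose file name matches `pattern`.

    Hypothetical helper, not mglob's API: results are produced one at a
    time while the tree is walked, so no full list is built in memory.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

Something like for path in iter_glob("/Users", "*.mp3"): ... would then start handing back matches while the walk is still in progress, and the caller can stop early without traversing the rest of the tree.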

  1. I’ve gotten into the habit of TDD and of reading tests to determine functionality more and more lately while learning Java.
  2. Jason’s site seems down, so I put a copy of path.py I had here.


  • Ville Vainio

    I’m the author of ‘pickleshare’ and ‘mglob’, and I’d like to add some notes:

    - Pickleshare is currently integrated into IPython: all the persistent information is stored in a pickleshare db. You can find out what’s there by going to ~/ipython/db and looking around. Likewise, you can easily add information to the IPython db by doing

    _ip.db['mykey'] = ['some',['data','i want to',12], 'store']

    - About mglob syntax: actually, it’s optimized for brevity and convenience rather than clarity (i.e. for lazy people). Typically, you wouldn’t write rec:/users=*.py, but rather rec:*.py (which recurses from the current dir). Jason Orendorff’s path module does not really help here, I know, I checked :-P

    BTW, mglob is also available as a magic function in IPython 0.8.2 (to be released in the next week or so). Now, you can do cool stuff like:


    [q:/ipython]|114> mglob rec:plat*py
    SList (.p, .n, .l, .s, .grep() available). Value:
    0: .\build\lib\IPython\platutils.py
    1: .\build\lib\IPython\platutils_dummy.py
    2: .\build\lib\IPython\platutils_posix.py
    3: .\build\lib\IPython\platutils_win32.py
    4: .\IPython\platutils.py
    5: .\IPython\platutils_dummy.py
    6: .\IPython\platutils_posix.py
    7: .\IPython\platutils_win32.py
    [q:/ipython]|115> _.grep "dummy",prune=1
    ----------------> _.grep("dummy",prune=1)
    SList (.p, .n, .l, .s, .grep() available). Value:
    0: .\build\lib\IPython\platutils.py
    1: .\build\lib\IPython\platutils_posix.py
    2: .\build\lib\IPython\platutils_win32.py
    3: .\IPython\platutils.py
    4: .\IPython\platutils_posix.py
    5: .\IPython\platutils_win32.py
    [q:/ipython]|117> ls -s $_[3]
    4 .\IPython\platutils.py

  • http://www.jessenoller.com jesse

    I updated the original post, Ville. Thank you for the information and thank you again for the tools.
