Cheese shopping: Modules to follow up on: Hive.

by jesse


My RSS reader betrays me each week. Every day I get notifications about new items on the Python Cheeseshop, and every day I "mark as flagged" (inside NetNewsWire) for future follow-up. Each week my "flagged items" queue (and the bookmark folder labeled "TO READ") grows larger. Maybe blogging about some of the ones that have caught my eye would help.

So here is an interesting piece of cheese:

Hive:

This is a basic concurrency module that uses only dependencies available in the Python 2.5 standard library. It allows the creation of a jobfile for users to queue work, which any number of worker processes with access to the jobfile can pull from the queue and run.

Hive looks interesting (except that it's 2.5 only, so I can't touch it; I have to keep my head in 2.4 compat mode). Calvin has created a little module that accepts a text file of jobs and then loops back on itself, spawning the subprocesses:

    def worker(self, globalcallable=None):
        """Create a worker process. Optionally associate it with a global
        callable, which it will process and ignore others.
        """
        worker = subprocess.Popen(['python', __file__, '-j', self.filename], shell=False,
            stdout=subprocess.PIPE,
            stdin=subprocess.PIPE,
            stderr=subprocess.PIPE)
        self._workermap.setdefault(globalcallable, []).append(worker)
        return worker.pid
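
To see the "loops back on itself" trick in isolation, here is a minimal sketch of a script launching a copy of a worker script with a jobfile argument. This is not Hive's code: `build_worker_command` and `spawn_worker` are names I made up, the `-j` flag is copied from the snippet above, and using `sys.executable` instead of the literal `'python'` is my own substitution. It is also written for a current Python rather than 2.5.

```python
import subprocess
import sys


def build_worker_command(script_path, jobfile):
    """Return the argv used to run script_path as a worker on jobfile."""
    # sys.executable is the interpreter running this process, so the child
    # does not depend on whichever 'python' happens to be first on PATH.
    return [sys.executable, script_path, "-j", jobfile]


def spawn_worker(script_path, jobfile):
    """Start one worker process and return its Popen handle."""
    return subprocess.Popen(
        build_worker_command(script_path, jobfile),
        shell=False,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
```

A script that wants to respawn itself would pass `__file__` as `script_path`. Since stdout and stderr are pipes here, the parent should eventually call `communicate()` on the returned handle so a chatty worker cannot block on a full pipe.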

Calvin also made Hive aware of the platform (Unix vs. win32), which is nice. The one thing I still need to figure out is the format of the job file (I think it's "functionName *args **kwargs"). Calvin parses each job and inserts it into a SQLite database, which is actually pretty smart, given that you can then pass the DB around to the various subprocesses.
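
To make the SQLite idea concrete, here is a rough sketch of how a job file could be loaded into a database that several workers pull from. Everything in it is an assumption on my part rather than Hive's actual code: the line format ("name arg1 arg2 key=value") is just my guess from above, and the `jobs` table, its columns, and the function names `parse_job`, `load_jobs`, and `claim_next_job` are hypothetical.

```python
import json
import shlex
import sqlite3


def parse_job(line):
    """Split a guessed 'name arg1 arg2 key=value' line into (name, args, kwargs)."""
    tokens = shlex.split(line)
    name, rest = tokens[0], tokens[1:]
    args = [t for t in rest if "=" not in t]
    kwargs = dict(t.split("=", 1) for t in rest if "=" in t)
    return name, args, kwargs


def load_jobs(conn, lines):
    """Insert every non-blank job line into a (hypothetical) jobs table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, name TEXT, args TEXT, kwargs TEXT, "
        "status TEXT DEFAULT 'pending')"
    )
    for line in lines:
        if line.strip():
            name, args, kwargs = parse_job(line)
            conn.execute(
                "INSERT INTO jobs (name, args, kwargs) VALUES (?, ?, ?)",
                (name, json.dumps(args), json.dumps(kwargs)),
            )
    conn.commit()


def claim_next_job(conn):
    """Claim the oldest pending job and return it, or None if none are left."""
    while True:
        row = conn.execute(
            "SELECT id, name, args, kwargs FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:  # commits the UPDATE on success
            cur = conn.execute(
                "UPDATE jobs SET status = 'running' "
                "WHERE id = ? AND status = 'pending'",
                (row[0],),
            )
        # rowcount is 0 if another worker claimed this row first; try again.
        if cur.rowcount == 1:
            return row[1], json.loads(row[2]), json.loads(row[3])
```

Each worker would open its own connection to the same database file and call `claim_next_job` in a loop. The `status = 'pending'` guard on the UPDATE is what keeps two workers from running the same job.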