Talking about PyProcPs

by jesse


(I know, it's been a little bit since my last update - been crazy busy!) I spend most of my time writing tools and software that run against large numbers of Linux servers. I've sunk a great deal of time into writing multithreaded, multi-node code. One thing I've been noticing recently is that I need to start tracking the performance of the individual nodes in the clusters.

Typically, on a Linux system, you'd use some combination of top and vmstat to track memory usage and swap activity over time. This doesn't scale well: you end up spending a good deal of time looking at the individual statistics of each machine after a given run or date/time period, and it gets worse if you have to selectively track attributes on a per-process basis.

When we're talking about tracking things across 100 or more nodes in a given cluster, any interactive program (or any application that isn't network-enabled with built-in aggregation) falls on its face.

In any case, yes, there are packages out there that support this type of tracking. But that wouldn't be fun, now would it? Not to mention, I need to selectively track statistics on a per-process basis.

Instead, I ran across PyProcPs a few weeks ago, but didn't get a chance to come back to it until the last couple of days. It's a really cool package: it gives you a handy module that exposes the /proc filesystem to Python. This allows you to attach to a given PID and track that process's information.

For example:

[jesse@wasabi /tmp/pyprocps-0.2]# python
Python 2.4.1 (#1, May 16 2005, 15:19:29)
[GCC 4.0.0 20050512 (Red Hat 4.0.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyprocps
>>> jvmpid = open('/var/run/guard/jvm.pid', 'r').read().rstrip('\n')
>>> jvminfo = pyprocps.pidinfo(jvmpid)
>>> jvminfo.vsize
'311906304'
>>> jvminfo.rss
'39113'
>>> jvminfo.nswap
'0'
>>> jvminfo.cnswap
'0'
>>> jvminfo.size
'76149
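For anyone curious what a module like this is roughly doing under the hood, here's a minimal sketch that reads the same kind of numbers straight out of /proc/<pid>/stat using only the standard library. To be clear, this is not PyProcPs's code: the function names parse_stat and read_pid_stat are made up for illustration, and the field order is my reading of the proc(5) man page, so treat it as an assumption to check against your kernel's docs.

```python
# Field names in /proc/<pid>/stat that follow the "(comm)" field,
# in the order given by proc(5). This list is an assumption taken
# from the man page, not from PyProcPs.
_FIELDS_AFTER_COMM = [
    "state", "ppid", "pgrp", "session", "tty_nr", "tpgid", "flags",
    "minflt", "cminflt", "majflt", "cmajflt", "utime", "stime",
    "cutime", "cstime", "priority", "nice", "num_threads",
    "itrealvalue", "starttime", "vsize", "rss",
]


def parse_stat(text):
    """Parse the contents of a /proc/<pid>/stat file into a dict."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    info = {"pid": int(pid_str), "comm": comm}
    # zip() stops at the shorter sequence, so any trailing fields we
    # have not named are simply ignored.
    for name, value in zip(_FIELDS_AFTER_COMM, tail.split()):
        info[name] = value if name == "state" else int(value)
    return info


def read_pid_stat(pid):
    """Read and parse /proc/<pid>/stat for a running process (Linux only)."""
    with open("/proc/%d/stat" % int(pid)) as handle:
        return parse_stat(handle.read())
```

On a Linux box, read_pid_stat(os.getpid()) (after an import os) hands back a dict with vsize in bytes and rss in pages. Unlike the session above, the values come back as integers rather than strings, which is a choice I made in this sketch.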

This is fairly neat. Remember to read the docs. Obviously, you can leverage a tool like sar on Linux hosts, but again, I enjoy doing things via Python 99% of the time.

Hopefully, more to come later. I'd be interested to see if setting up something like this with Pyro or a similar Python package would work well.