<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Chroot and Python discussion and random pyc thoughts</title>
	<atom:link href="http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/feed/" rel="self" type="application/rss+xml" />
	<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/</link>
	<description>python, programming and other things</description>
	<pubDate>Thu, 24 Jul 2008 05:35:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Doug Napoleone</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-134</link>
		<dc:creator>Doug Napoleone</dc:creator>
		<pubDate>Thu, 28 Jun 2007 22:15:12 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-134</guid>
		<description>Sylvain,

It's on my list. I am hoping to submit a talk proposal on it for PyCon 2008, but that is so far in the future as to be never.
I will add it to my list of things to blog about. I hope to get Pygments integration before then. There are some NDA concerns, but not too many, as long as I stay away from copyrighted code and actual grid configurations. Something geared towards Amazon's S3 should work well.

NOTE: that should read 256 bits, not 256 bytes, above. The math does not work otherwise.</description>
		<content:encoded><![CDATA[<p>Sylvain,</p>
<p>It&#8217;s on my list. I am hoping to submit a talk proposal on it for PyCon 2008, but that is so far in the future as to be never.<br />
I will add it to my list of things to blog about. I hope to get Pygments integration before then. There are some NDA concerns, but not too many, as long as I stay away from copyrighted code and actual grid configurations. Something geared towards Amazon&#8217;s S3 should work well.</p>
<p>NOTE: that should read 256 bits, not 256 bytes, above. The math does not work otherwise.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jesse</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-133</link>
		<dc:creator>jesse</dc:creator>
		<pubDate>Thu, 28 Jun 2007 21:23:13 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-133</guid>
		<description>&lt;b&gt;This is even more interesting considering the large number of projects relying on eggs and setuptools that pollute sys.path with each single egg directory in the path.&lt;/b&gt;

Ugh. Don't remind me about that. Avoiding putting/installing things in the main system library is one of my goals. I've started using a sitecustomize.py file that points to /Users/jesse/python/modules and installing everything I can there.</description>
		<content:encoded><![CDATA[<p><b>This is even more interesting considering the large number of projects relying on eggs and setuptools that pollute sys.path with each single egg directory in the path.</b></p>
<p>Ugh. Don&#8217;t remind me about that. Avoiding putting/installing things in the main system library is one of my goals. I&#8217;ve started using a sitecustomize.py file that points to /Users/jesse/python/modules and installing everything I can there.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sylvain</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-132</link>
		<dc:creator>Sylvain</dc:creator>
		<pubDate>Thu, 28 Jun 2007 21:19:59 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-132</guid>
		<description>Douglas,

This is really interesting. Is there a chance you could discuss it more on your blog or somewhere public? I'm looking at potentially large grids like that in the future and would definitely be interested in your experience (modulo any NDA).

This is even more interesting considering the large number of projects relying on eggs and setuptools that pollute sys.path with each single egg directory in the path.

Thanks for sharing, anyhow.</description>
		<content:encoded><![CDATA[<p>Douglas,</p>
<p>This is really interesting. Is there a chance you could discuss it more on your blog or somewhere public? I&#8217;m looking at potentially large grids like that in the future and would definitely be interested in your experience (modulo any NDA).</p>
<p>This is even more interesting considering the large number of projects relying on eggs and setuptools that pollute sys.path with each single egg directory in the path.</p>
<p>Thanks for sharing, anyhow.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Napoleone</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-131</link>
		<dc:creator>Doug Napoleone</dc:creator>
		<pubDate>Thu, 28 Jun 2007 19:45:54 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-131</guid>
		<description>Yes I do want to help, but I have some serious backlog I need to resolve first. I don't want to commit to something until I am sure I can keep that commitment. I will be sending you an e-mail with more details soon.

Thanks for the award, but I don't think I deserve it :-)</description>
		<content:encoded><![CDATA[<p>Yes I do want to help, but I have some serious backlog I need to resolve first. I don&#8217;t want to commit to something until I am sure I can keep that commitment. I will be sending you an e-mail with more details soon.</p>
<p>Thanks for the award, but I don&#8217;t think I deserve it :-)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jesse</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-130</link>
		<dc:creator>jesse</dc:creator>
		<pubDate>Thu, 28 Jun 2007 19:24:25 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-130</guid>
		<description>I had not even thought about the grid ramifications of this - I'm in a distributed system, but the cluster is comprised of individual nodes with no shared back end (and even if it *is* shared, it's fiber to a SAN).

Wanna help with driving 304 forward?

Also, you win for "best comment anywhere ever" award.</description>
		<content:encoded><![CDATA[<p>I had not even thought about the grid ramifications of this - I&#8217;m in a distributed system, but the cluster is comprised of individual nodes with no shared back end (and even if it *is* shared, it&#8217;s fiber to a SAN).</p>
<p>Wanna help with driving 304 forward?</p>
<p>Also, you win for &#8220;best comment anywhere ever&#8221; award.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sylvain</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-129</link>
		<dc:creator>Sylvain</dc:creator>
		<pubDate>Thu, 28 Jun 2007 19:15:22 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-129</guid>
		<description>Nice article.</description>
		<content:encoded><![CDATA[<p>Nice article.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Napoleone</title>
		<link>http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-128</link>
		<dc:creator>Doug Napoleone</dc:creator>
		<pubDate>Thu, 28 Jun 2007 19:12:23 +0000</pubDate>
		<guid isPermaLink="false">http://jessenoller.com/2007/06/28/chroot-and-python-discussion-and-random-pyc-thoughts/#comment-128</guid>
		<description>'Could' is the operative word here. This is a huge issue for grid environments where multiple versions of python access the same 'release' of python code.

The network traffic just to find out that there are no +w permissions on a directory can be huge for micro-tasks on a grid. If there is +w access (as there often needs to be for sandbox development), then you have all those writes and cross writes, lock misses, race conditions, etc. This is why SOE (Sony Online Entertainment) has their own custom python which does not do .pyc or .pyo at all. We do our own custom hacks, but being able to specify the 'build' directory at runtime is a feature us grid folks would love.

As a workaround we have our own special python compile code which ensures the full path to the .py is compiled into the .pyc, then we move the .pyc/.pyo off to a special build directory, and run directly from the .pyc/.pyo's. This means that the exception stacks are correct, but no .pyc/.pyo building occurred for 'released' versions. This can also be done for development, but it means that an extra 'build' step is required. There are other extensive hacks done with custom import hooks to reduce the pythonpath searching, which is also network intensive.
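
A minimal sketch of that kind of build step, using the py_compile module from the Python 3 standard library rather than the custom compile code described above. The helper name compile_to_build_dir and the build directory layout are hypothetical, for illustration only:

```python
import py_compile
from pathlib import Path


def compile_to_build_dir(source, build_dir):
    """Compile one .py file into build_dir, recording the full source path."""
    source = Path(source).resolve()
    build_dir = Path(build_dir)
    build_dir.mkdir(parents=True, exist_ok=True)
    target = build_dir / (source.stem + ".pyc")
    # dfile is the file name stored in the bytecode and shown in tracebacks;
    # passing the absolute .py path keeps exception stacks pointing at the source.
    py_compile.compile(
        str(source), cfile=str(target), dfile=str(source), doraise=True
    )
    return target
```

Because the .pyc is written to a separate directory ahead of time, nothing needs write access next to the released .py files at import time.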

Theoretical example: 1000 machines, 8 python processes each, just one network drive on the python path (yeah, right), and say only 25 modules. Searching for .so, .pyd, .pyc, .pyw, .py = 1 million network lookups in under a second. Each lookup can actually be up to 12 network operations/transactions, using over 256 bytes each. Total network traffic for just FINDING the .py files (not loading them) in this modest example would be 256 Meg/sec.

Just looking for .py files is one quarter of your theoretical max gigabit backbone. Yes, there are ways around this (some described above), but they are not simple or elegant.
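
The arithmetic can be checked with a short sketch. It assumes 256 bits per lookup, as in the correction posted in a follow-up comment, and does not count the "up to 12 transactions" factor; the variable names are illustrative only:

```python
machines = 1000
processes_per_machine = 8
modules = 25
suffixes = [".so", ".pyd", ".pyc", ".pyw", ".py"]

# One lookup per suffix, per module, per process.
lookups = machines * processes_per_machine * modules * len(suffixes)

# Assumption: 256 bits per lookup, per the follow-up correction.
bits_per_lookup = 256
# Assumes every lookup lands within the same second.
bits_per_second = lookups * bits_per_lookup

gigabit = 10**9
share_of_backbone = bits_per_second / gigabit

print(lookups, bits_per_second, share_of_backbone)
```

Under these assumptions the lookups come to one million and the traffic to 256 million bits per second, roughly a quarter of a gigabit link.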

So yes, this is a 'hot' issue for some people :-)</description>
		<content:encoded><![CDATA[<p>&#8216;Could&#8217; is the operative word here. This is a huge issue for grid environments where multiple versions of python access the same &#8216;release&#8217; of python code.</p>
<p>The network traffic just to find out that there are no +w permissions on a directory can be huge for micro-tasks on a grid. If there is +w access (as there often needs to be for sandbox development), then you have all those writes and cross writes, lock misses, race conditions, etc. This is why SOE (Sony Online Entertainment) has their own custom python which does not do .pyc or .pyo at all. We do our own custom hacks, but being able to specify the &#8216;build&#8217; directory at runtime is a feature us grid folks would love.</p>
<p>As a workaround we have our own special python compile code which ensures the full path to the .py is compiled into the .pyc, then we move the .pyc/.pyo off to a special build directory, and run directly from the .pyc/.pyo&#8217;s. This means that the exception stacks are correct, but no .pyc/.pyo building occurred for &#8216;released&#8217; versions. This can also be done for development, but it means that an extra &#8216;build&#8217; step is required. There are other extensive hacks done with custom import hooks to reduce the pythonpath searching, which is also network intensive.</p>
<p>Theoretical example: 1000 machines, 8 python processes each, just one network drive on the python path (yeah, right), and say only 25 modules. Searching for .so, .pyd, .pyc, .pyw, .py = 1 million network lookups in under a second. Each lookup can actually be up to 12 network operations/transactions, using over 256 bytes each. Total network traffic for just FINDING the .py files (not loading them) in this modest example would be 256 Meg/sec.</p>
<p>Just looking for .py files is one quarter of your theoretical max gigabit backbone. Yes, there are ways around this (some described above), but they are not simple or elegant.</p>
<p>So yes, this is a &#8216;hot&#8217; issue for some people :-)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
