July 30th, 2010
I’ve obviously been quiet here on my personal blog. As everyone who reads regularly knows, I’m neck-deep in a pretty exciting startup called Nasuni, as well as doing other projects like the PSF Sponsored Sprints thing. That, combined with Twitter, means my time for additional long-form content is minimal. So here’s a small roundup of interesting things:
Nasuni
Yup, still running Python and Django! We’re actually pretty proud to be a sponsor for DjangoCon 2010 coming up in September — I’ll be attending, so I hope to see all the familiar Django faces I know, and meet some new ones.
I’ve been blogging semi-regularly for the Nasuni blog itself — my posts are focused on product-things more than anything else. Here’s a small list of posts which I’ve done:
- The Road to Release — Feature Previews — this is actually my latest one, and the first in a series where I’ll be showing off some of the new features we’re adding in the latest release.
- Looking at OpenStack, a Rackspace and NASA initiative - For those of you who don’t know, Rackspace and NASA announced OpenStack. The awesome part? It’s all Python. I had the Swift component (which powers Rackspace’s Cloud Files system) of OpenStack running pretty quickly. I’d recommend snagging the code from Launchpad and taking a look. Swift (the storage component) uses eventlet, and Nova (the compute part) uses Tornado and Twisted.
- Storage Switzerland Test Drives the Filer — This is a response to an article written about the product — I actually used it to preview some of the work going into the next release of the Filer.
- Thanks to Django - This piece goes into some detail about our use of Django; it’s one of our ways of saying thanks. I still need to rework it so we can send it over for the Django Success Stories page.
- Thanks to the Supporting Cast — This is an earlier thank you post — but to the other people who have helped out a ton, including Greg Newman, Lincoln Loop, and Revsys.
- The Donut Solution — This was a fun one, mainly to show that yes — we’re listening hard to customer feedback, and we’re improving/iterating quickly. Also, I get to show off UI improvements.
- Finally — The Nasuni Blog team — this is the rosetta stone for the authors of the blog, describing who we are. I didn’t write this piece, but it’s good reading to figure out who is who.
If you’re interested in Nasuni, or cloud storage in general, I’d encourage you to sign up for the RSS feed. We’re trying to keep the information useful outside of “just us” (despite my urge and predilection for churning out completely product-related posts), and if you ever have feedback, drop us a line.
PSF Sponsored Sprints
The project continues on — we’ve funded two sprints so far, and have several more coming down the pike. We’re always in need of volunteers to help us do things like the manuals and site maintenance/content authoring. Here’s some highlights:
Please! If you’re thinking about holding a sprint - send us an application! Heck, even if we’re not sponsoring it, we’ll help promote you via the blog, and the sprint calendar we have up. A little fact? The sprints we’ve funded so far, and that are on deck for funding are all outside of the US, which is both awesome, and surprising!
PSF Board
Some of you probably know that I’m currently on the board of directors for the PSF — things progress well here, but I mainly wanted to call out the excellent blog Doug Hellmann has been authoring for PSF news. You should really be watching that because yes — we do do things, and hopefully over the next year, we’ll be doing more awesome things.
I’ve actually got a bigger post in the works on what I think the ultimate mission of the PSF is/should be, as well as “how do you get money from us”. Must find the time!
February 9th, 2010
The company I’ve worked for since July of last year, Nasuni Corporation (a startup in Massachusetts), has gone live! This is the culmination of a lot of hard, but exceedingly fun and exciting, work over the past months.
The Nasuni team is an excellent one — and one I am very, very proud to be a part of. Our product is called the Nasuni Filer — a simple-to-use, versioned, encrypted and cloud-storage backed virtual NAS (network attached storage) server (click here for more information).
Without going into all of the features, our goal in making this was to make cloud storage simple, accessible and secure — and I know we’ve accomplished all three. All you do is download it, boot it and start using it — once you do so you have access to truly unlimited storage. It’s an unlimited filesystem for the cloud. Here’s the elevator pitch:
Nasuni has developed a virtual file server, called the Nasuni Filer, that delivers unlimited file storage and complete file protection for businesses. Working in partnership with leading cloud storage vendors, the Nasuni Filer leverages the vast capacity of the cloud to store and protect company files offsite, while retaining the local functionality and performance of a traditional NAS.
This technology allows businesses to use the cloud provider of their choice as a replacement for traditional primary storage. Snapshots, file versioning, and offsite storage are integrated into the file server itself, ensuring business files are safe and secure at all times. No need to manage complex backup and DR schemes: if the file server is running, files are protected.
We’ve launched the Beta of the product today — anyone can sign up, download and use it. Anyone can give us feedback and suggestions — I encourage all of you who might need something like this to download and give it a try. If you want — go check out the videos we’ve put together showcasing the Filer (and better yet — check out the awesome animated cartoon we have on the front page).
Most of you know that my blog is mainly Python oriented. Suffice it to say, Nasuni and the Nasuni Filer make use of Python for a wide range of tasks. We use Python, Django and as much of the Python ecosystem as we can to drive everything from the website to the GUI on the appliance itself. Python is part of the DNA of the company, and it has served us well. Without open source and Python, I don’t think it would have been possible to build what we have built in as little time as we have.
We have a strong dedication to not just Python, but open source in general (and a fair number of us will be at PyCon this month). As time progresses, now that we’re exiting stealth mode we plan on possibly open sourcing stuff we feel would benefit the community. Some of us already push patches back where and when we can, but as I said — as time progresses this involvement will only increase.
So not only am I proud to announce the product, be part of this team and see what we’ve made, I’m also happy to thank the many people in the Python and OSS community who have helped us reach this point.
So go — check it out, let us know what you think.
August 22nd, 2007
Before I give the links to the videos, I want to give the typical disclaimer:
Disclaimer: The opinions expressed here are my personal opinions, views, discussions, etc. Content published here is not read or approved in advance by HDS, my wife or anyone else for that matter and does not, in any way, reflect the views, opinions, positions, etc. of my employer. This is my personal, largely Python-related blog. Not my employer’s.
That being said: A few months ago, I discovered (much by accident) that HDS (Hitachi Data Systems) has started a viral marketing campaign involving Mr. T. Yes, the man from the A-Team (whose face graced my lunchbox as a child). Note that massive “lulz” were attained when watching these.
Without passing judgement or in any way stating a direct opinion, here are the videos, in order of creation:
For additional amusement, I will direct you to the Archivas (before we were bought by HDS) viral/spoof/etc video that made it to youtube, here.
February 19th, 2007

I saw this post on Slashdot the other day. It’s a paper called “Failure Trends in a Large Disk Drive Population”. It’s a good read for anyone in the storage business; hell, it’s a good read for anyone interested in computers. In section 5, under conclusions, they state:
In this study we report on the failure characteristics of consumer-grade disk drives. To our knowledge, the study is unprecedented in that it uses a much larger population size than has been previously reported and presents a comprehensive analysis of the correlation between failures and several parameters that are believed to affect disk lifetime. Such analysis is made possible by a new highly parallel health data collection and analysis infrastructure, and by the sheer size of our computing deployment.
One of our key findings has been the lack of a consistent pattern of higher failure rates for higher temperature drives or for those drives at higher utilization levels. Such correlations have been repeatedly highlighted by previous studies, but we are unable to confirm them by observing our population. Although our data do not allow us to conclude that there is no such correlation, it provides strong evidence to suggest that other effects may be more prominent in affecting disk drive reliability in the context of a professionally managed data center deployment.
These two points are interesting. In some of the labs I’ve worked in, an astonishing number of drives die regularly. The manufacturer/distributor excuse has always been “heat issues” or “use cases”. Admittedly, the temp. range Google capped at was 50 Celsius (122 Fahrenheit). In a rack densely stacked with servers (1-2U machines, rack filled), with those machines running at 75% CPU load and above, with non-stop disk I/O (read, write, delete/format) and constant machine power cycles, the temp. inside the racks could spike far past the 122 mark, at which point the failure trend Google charts starts to spike again.
Of course, in the labs I’ve been in, we were using these as test bed machines — total/high reliability was not something direly important for the simple fact that these machines were disposable.
Even with that in mind: You should always assume your disk drives are going to fail sooner than you expect. That goes double for a large enough pool of disks not configured in a “smart” configuration (i.e. RAID, arrays, etc.), where the MTBF of a single drive says little about how often something in the pool will fail. I’m not talking about consumer-use patterns (although, I just had a drive go south on my laptop). I’m talking about datacenter/IT/etc. use cases.
The Google paper is a good reference case, but you should remember that all use patterns are different. An application/test or system that really puts the disks to use can cause drive failures much earlier than you (or any paper) might assume. A good chunk of the “storage industry” realized this long ago — this is why companies (cough) work on software applications and intelligent hardware “wrappers” (arrays, raids, etc) to work around the basic assumption that in a large enough pool of drives, you’re going to have near constant drive failure. People might disagree with the prices or methodology, but the fact remains that the basic assumption is true.
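To put rough numbers on the “large enough pool” assumption, here is a minimal sketch of the arithmetic. It assumes drives fail independently at a constant rate (the textbook exponential model) and that failed drives are replaced, which is exactly the kind of tidy assumption real use patterns break. The MTBF figure is hypothetical, picked only for illustration; it does not come from the Google paper or from any vendor.

```python
import math

HOURS_PER_YEAR = 24 * 365


def annual_failure_probability(mtbf_hours):
    """Chance that a single drive fails within one year (constant failure rate)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours, drive_count):
    """Expected failures per year across the pool, assuming failed drives are replaced."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours


def hours_between_pool_failures(mtbf_hours, drive_count):
    """Average gap between failures somewhere in the pool."""
    return mtbf_hours / drive_count


if __name__ == "__main__":
    mtbf = 500_000  # hypothetical per-drive MTBF in hours, for illustration only
    for drives in (1, 100, 10_000):
        print(
            "{:>6} drives: {:8.2f} failures/year, one every {:10.1f} hours".format(
                drives,
                expected_failures_per_year(mtbf, drives),
                hours_between_pool_failures(mtbf, drives),
            )
        )
```

Even under these generous assumptions the per-drive number looks harmless while the pool-wide number does not, and heavier workloads only push it in one direction.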
Of course, that reasoning can be held for any piece of hardware in the typical data center. Apply too much heat/load to a pool of machines and your failure rate is going to be high unless the machines were designed with high reliability in mind (which normally indicates RAID/Fiber/etc. storage).
In any case, the paper is a good read. I’ve gone and started rambling. If you’re looking for some tools to test drives/filesystems in general, I’d take a look at the standard Bonnie/Bonnie++ and other tools, but also take a look at Rugg (built in Python), and remember that it’s important to stress a drive below the filesystem layer. Typically, this means raw-writing to the device. If your job is to test drive speed/reliability or test the reliability of drive drivers for your operating system, that’s a step you can’t forget.
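As a rough illustration of the raw-writing idea, here is a small Python sketch that writes a deterministic pattern to a target path, forces it out with fsync, and reads it back to check it. Everything in it (the function names, the block size, the pattern scheme) is made up for this example; it is not how Bonnie++ or Rugg work. It assumes a POSIX-style system, and it does not bypass the OS page cache, so a serious device test would need direct I/O on top of this.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block; an arbitrary choice for this sketch


def make_block(index, size=BLOCK_SIZE):
    """Build a deterministic block so a read-back can be checked without storing it."""
    seed = hashlib.sha256(str(index).encode("ascii")).digest()
    return (seed * (size // len(seed) + 1))[:size]


def write_pattern(path, block_count, size=BLOCK_SIZE):
    """Write block_count pattern blocks to path, then ask the OS to flush them."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o600)
    try:
        for index in range(block_count):
            block = make_block(index, size)
            written = 0
            while written < size:  # os.write may write fewer bytes than asked
                written += os.write(fd, block[written:])
        os.fsync(fd)
    finally:
        os.close(fd)


def verify_pattern(path, block_count, size=BLOCK_SIZE):
    """Return the indexes of blocks that do not match what was written."""
    bad_blocks = []
    with open(path, "rb") as handle:
        for index in range(block_count):
            if handle.read(size) != make_block(index, size):
                bad_blocks.append(index)
    return bad_blocks
```

Pointed at a scratch file, this only exercises the filesystem. Pointed at a raw device node, it goes underneath the filesystem, but it also destroys whatever is on that device and normally needs root, so only aim it at disposable test hardware.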
Update: StorageMojo has a more detailed breakdown.
February 6th, 2007

That’s right. My awesome employer, a storage startup called Archivas, is being purchased by Hitachi Data Systems, a wholly owned subsidiary of Hitachi Limited.
This is, of course, awesome news for us as a company, but on a personal level this purchase shows a belief and commitment on the part of Hitachi when it comes to the product and technology I’ve put blood and sweat and tears into for the past 3 years.
Some of the news links:
From the HDS site
Search Storage
eWeek Coverage
MarketWatch
When I go to PyCon this month, I get to go as an HDS employee. This is truly awesome.
February 27th, 2006
January 16th, 2006
You should read this: The essence of a long-term digital archive
An excellent article describing digital archives in the modern IT field. For most people, this article might seem irrelevant (in the scope of things I have talked about previously — it is). However, given that this is the field I work in, it’s a subject near and dear to me.
Of course, in the interest of full disclosure, I work in this field, and I also happen to know the article’s author.
October 4th, 2005
First, I saw this: Question: “What exactly is python used for…
The question itself is inane, but the answer is what caught my eye.
To Quote:
Too busy to answer properly, but I’ll just mention that Fermilab uses a home-grown python package called Enstore to manage their data store of 3 Petabytes of physics data, growing at 1PB/year. The transfers of ~25TB/day to and from that system is what keeps me busy.
And then he provides this link: Presentation about the Enstore System
The answer was from a Mailing List.
The full response is at that link. Googling Enstore yields a few links:
Access to Mass Storage at FNAL
And: Dcache.Org
This is some code I have got to see.
And to add: Information on the Enstore Project