July 17th, 2010
The Google Testing Blog has a good post up right now by James Whittaker called “There, but for the grace of testing, go I” — it’s a good read, and a pertinent one for any of you/us who feel strongly about quality.
Even though I’ve spent more time than not on “the other side” of the table (Developer, noun: “focus on making software (ergo, bugs)”), I find that James’ words still ring pretty loudly for me, especially his part on risk analysis:
I am thankful that the vast majority of bugs that affect entire user populations are generally nuisance-class issues. These are typically bugs concerning awkward UI elements or the occasional misfiring of some feature or another where workarounds and alternatives will suffice until a minor update can be made. Serious bugs tend to have a more localized effect. True recall class bugs, serious failures that affect large populations of users, are far less common. Testers can take advantage of the fact that not all bugs are equally damaging and prioritize their effort to find bugs in the order of their seriousness. The futility of finding every bug can be replaced by an investigation based on risk.
I’d recommend James’ post amongst the others there on that blog — it reminded me of an old rant of mine “The cost of (not) testing software”. Anyone in the business of making something is also in the business of making bugs. It’s important for us to keep that in mind when we deal with our day to day job — and when we think about our customers. It’s also important for us to keep that in mind when criticizing or dragging any person or company or code through the muck.
February 27th, 2009
… And the case of obsessive optimization. A little while ago, I posted a small snippet of code that was designed to very quickly generate data files of a given size from a seed (article here). The goals of this code are/were the following:
- Generate large amounts of semi-random data quickly
- Data generation can not use /dev/urandom or other system entropy buckets. These are too slow, and having hundreds of threads pulling from these buckets is a bad idea. Oh, and it needs to work on Windows.
- The data must never be sync’ed to disk: when you’re generating a large data set, on the scale of hundreds of millions of files, storing it on disk sucks, and the disk becomes the bottleneck.
- Creation of the files must be at least 1 gigabit/second — this means a single thread passing one of these generators to say, a pycurl handle could “in theory” hit line speed: the generator can not be the bottleneck
- The data in these files must be able to be recreated at any time provided you have the seed.
- Setting a seed in python’s random() has side-effect issues, and can not be used. Besides, lots of random calls are expensive.
- I need the ability to swap out the data source, I use a lorem file here, but a different type will be needed later.
- The data source should only be parsed once for the import (singleton, ho!)
- The name, and the file data must be unique — they must hash differently (to prevent de-dupers from, well, de duping them)
I am revisiting this code as we found out the original version could only generate file data at around 500 megabits/second. This is much too slow for my tastes, as I might as well be reading it from disk. We can make it faster.
After cleaning things up, removing some overly complex logic (and several moments of “what the hell was I thinking”), I came up with this:
import collections
import os

LOREM = os.path.join(os.path.dirname(__file__), "datafiles", "lorem.txt")
# Parse the data source once, at import time.
with open(LOREM, "r") as lorem_file:
    WORDS = lorem_file.read().split()


def chunker(size, seed, chunksize=1000):
    """ Yield ``size`` bytes of seed-dependent text, ``chunksize`` at a time.
    A consumer may send() a new chunk size between reads. """
    word_q = collections.deque(WORDS)
    seed_q = collections.deque(int(i) for i in str(seed))
    # Rotate the word_q by the seed so that small files are unique.
    word_q.rotate(seed)
    current_size = size
    while current_size > 0:
        data = ' '.join(word_q)
        if chunksize > current_size:
            chunksize = current_size
        chunk = data[0:chunksize]
        requested = yield chunk
        # Count what was actually handed out, not the newly requested size.
        current_size -= len(chunk)
        chunksize = requested or chunksize
        word_q.rotate(seed_q[0])
        seed_q.rotate(1)


class SyntheticFile(object):
    """ File-Like object backed by the ``chunker`` function. Allows the
    construction of an object which can be passed to something like a pycurl
    handle streaming data to a server """

    def __init__(self, size, seed):
        """
        **size**: integer, bytes
        **seed**: integer
        """
        self.chunker = None
        self.size = size
        self.seed = seed

    def write(self):
        """ unsupported, throw an error if called """
        raise Exception('not supported')

    def read(self, readsize):
        """ Support read() - **readsize** is in bytes. """
        try:
            if not self.chunker:
                # First read: create the generator and prime it.
                self.chunker = chunker(self.size, self.seed, readsize)
                return next(self.chunker)
            return self.chunker.send(readsize or 1000)
        except StopIteration:
            # Generator exhausted: signal end-of-file like a real file.
            return ""
This version hit around 618 megabits/second and it used the generator’s send() capability to allow readers using the SyntheticFile implementation to alter the chunk size they’re reading on the fly, which is important if you have a consumer that wants the ability to read small/read big/read small. Well, that’s fine and all, but I was stymied: I wanted to make this thing fly. I want to be able to generate this data at 1 gigabit/second or faster.
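For anyone who hasn’t used send() on a generator, here is a minimal, self-contained sketch of the read small/read big pattern. The sized_chunks name and the sample string are made up for illustration; this is not the SyntheticFile code itself:

```python
def sized_chunks(data, chunksize=4):
    """Yield slices of data; a value passed to send() becomes the new chunk size."""
    pos = 0
    while pos < len(data):
        chunk = data[pos:pos + chunksize]
        pos += len(chunk)
        # send(n) makes the yield expression evaluate to n; next() gives None.
        requested = yield chunk
        chunksize = requested or chunksize


gen = sized_chunks("abcdefghijklmnop")
first = next(gen)      # primes the generator, using the default size
second = gen.send(2)   # read small
third = gen.send(6)    # read big
print(first, second, third)
```

The first call has to be next() (or send(None)), because a generator that has not started yet has no yield expression waiting to receive a value.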
Astute readers may point out that there are other ways of doing this (mmap, simply embedding the unique seed or a uuid). Well, this story isn’t about that, is it?
In any case, I suspected the “data = ' '.join(word_q)” line was the culprit: deque is pretty optimized, and I had removed a massive chunk of code which didn’t make sense. In fact, cProfile showed I was right:
woot:synthfiles jesse$ python -m cProfile synthfilegen.py
2441475 function calls in 203.791 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
...snip...
610352 180.282 0.000 180.282 0.000 {method 'join' of 'str' objects}
...snip...
180 out of 203 CPU seconds, on the join alone. Curses! So this is when I really went mental (this is what happens when you’re too close to something). I decided that I needed to find some magical way of skipping the join and only reading what I needed. I ran down that rathole for a bit, until a friend of mine pointed out: “just make the words bigger”.
Full stop. I initially discounted it; I was zeroed in on that join. Oh wait. The text in the lorem file, when split on whitespace, is 4368 words. Joining those back together within the loop is expensive; that much I knew. I hit on the idea of treating them not as words but as chunks (which is how I was using them anyway): fewer, bigger pieces mean far less work for each join.
I added a method (process_chunks) which treated the data source as chunks of bytes and made the WORDS variable a list of those chunks. Initially, I set the chunk size to 100 (bytes) and here’s the cProfile output:
woot:synthfiles jesse$ python -m cProfile synthfilegen.py
2441766 function calls in 51.549 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
...snip...
610352 29.712 0.000 29.712 0.000 {method 'join' of 'str' objects}
...snip...
And now the generator is kicking data out at 2.34 gigabits/second. Huge success. Obviously, if you increase the chunk size, it speeds up a bit more (e.g. 300 byte chunks is about 2.5 gigabits/second). I cleaned it up a bit and here is the code:
(thanks! bitbucket.org).
Note that the speeds I’m discussing are passing the SynthFileObject to a pycurl handle and streaming it across the wire: not to disk.
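A minimal sketch of the “make the words bigger” idea, assuming a helper shaped like the process_chunks function mentioned above. Its real signature isn’t shown in this post, so the chunk_bytes parameter and the stand-in text are hypothetical:

```python
def process_chunks(text, chunk_bytes=100):
    """Split text into consecutive pieces of at most chunk_bytes characters."""
    return [text[i:i + chunk_bytes] for i in range(0, len(text), chunk_bytes)]


# Stand-in for the lorem file: 4368 whitespace-separated words.
text = " ".join(["lorem"] * 4368)
words = text.split()
chunks = process_chunks(text, 100)

# Both lists rebuild the same data, but the chunk list is far shorter,
# so a join over it has far fewer pieces to stitch together.
assert " ".join(words) == text
assert "".join(chunks) == text
print(len(words), "words vs", len(chunks), "chunks")
```

The number of join calls stays the same; what drops is the number of pieces each call has to walk, which matches the unchanged ncalls and the lower tottime in the profile output.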
All told, it was a fun little jaunt, and I’ve succeeded in turning something which I considered “throwaway” into something that’s a lot more useful, clean and fast. I’ve added a handful of unit tests to my sandbox, and I might make this a real module if anyone wants it. I want to rework the _process_chunk/globals stuff, but I’ve farted around with this long enough for now. I also want to add the ability to remove the chunking altogether and simply insert the seed into the data response, and not mess with the lorem text.
edit: I just checked in a new version which removes the _process_chunks function and other globals and moves them into a class. I hate globals.
October 23rd, 2008
Fixes a minor issue with python config file parsing.
Next up, hierarchical YAML files!
October 10th, 2008
With the much-appreciated help of Brandon Barry (with whom I just happen to work), the testbutler code base has received a needed update that I couldn’t get to. Some highlights:
- Cleaned up the CSS, moved to blueprint for the larger portion of the CSS and started using jQuery for the JavaScript portions
- Templates have been cleaned up/gotten a major facelift
- site-media has been cleaned up
- Deleted unused code I had in the prototype
- Models cleaned up
Overall, it’s looking much better. Yes, it’s in production use now, and I love markdown syntax. You can see some screen shots here.
We ditched the roomba-picture, I need to find someone handy with artwork to maybe make some custom icons/pics for us (I really want a cartoony-robot-butler)
There’s a lot I’d like to do, obviously — but first I have to get started on the results trackers and the corresponding nose plugin to feed the results to the system. I figure I am going to use the nose-xunit plugin and some custom XSLT for now.
One thing I need to figure out is whether the django-markdown plugin allows for relative %url% links within a block of text so we can cross-link testcases; if not, I may write a custom template tag.
Edit: Additionally, I just committed a change to remove all notion of “component” from the system. We decided that a test case could have any number of components, or none at all, and that it was more logically consistent to track those as tags-in-the-cloud. For example, a given test case might be tagged “gui, regression, smoketest, performance” or “smoke, storage, gui” etc. Being more flexible with sorting and organization was our goal.
September 17th, 2008
As a long-time automation-engineer/test-focused guy, I’ve pondered the great existential question of “how much testing is enough” for a while.
More recently, I’ve started focusing on the cost of not testing a product.
Take for example, Figure 1:
Let’s take a second for terminology:
- (A) Unit tests: These are tests focused on developer and maintainer productivity. These are “close to the code” tests that run in mostly simulated environments. Unit tests are a cornerstone of Agile methodology — generally speaking, you make these before your code.
- (B) Smoke/Simulation: These are the “next layer up”: they use partial systems (e.g. your code + the module of the guy next to you) to run more integration-style testing. Smokes are normally run on every compilation of the product along with unit tests. They do not require a fully deployed, functioning system, only a small group of parts.
- (C) Acceptance/Functional/Regression:
  - Acceptance Tests: These normally comprise a large number of your tests in an organization. Acceptance tests prove that the specific component/feature is sane in the context of the fully deployed product; you might require these to be fully developed, executed and passing before a specific component or feature is merged to trunk. Acceptance tests prove that the feature/component works as intended (not programmed). They should be short in execution time.
  - Functional Tests: Functional tests are “larger” and should test as much of the functionality of the feature/component as possible. They should also test with an eye towards other parts of the product and system (e.g. integration). Functional tests should be as expansive and detailed as possible. These can also be called Regression tests.
- (D) Stress/Scalability Tests: This should be self-evident. Stress tests build on functional areas to push the product to its limits: how many files can it hold, how many connections can it withstand, etc.
- (E) Performance Tests: Characterization of key performance stats: objects/second, records parsed/second, and so on.
Now, I want to point out: these definitions are part-agile and part-continuous integration. They may not wholly mesh with the terminology used in your workplace, or in agile. I also know definitions are a holy war, but the definitions are secondary to what I want to talk about. I also excluded specifically calling out exploratory testing.
What the hell *am* I talking about?
If you look at Figure 1, you’ll note I put “Test” (test engineering) off to the side to represent their particular ownership in this model. Unit Tests (and by most measures, smoke and simulation tests) are under the ownership of the core developers.
The other test areas are the ownership of test engineering — obviously they would not exclude Dev from helping though (after all, they win as a team, and fail as a team) but Test is focused on verification that the product is as tested-as-possible before it gets into stage F — the hands of the user.
Ok, this is all fine and good — but hear me out.
This diagram is about cost — for each layer the code/feature passes through emanating from the developer, the cost to the team, and the difficulty in identification and resolution climbs.
This is why Developers write a lot of unit tests and check them in so they run with every check in. Right? You’re doing that, right?! The cost for a developer to find a bug with a unit test, and the cost to fix that bug introduced through new code/refactoring/etc, is essentially 1.
Here’s a new diagram with some straw-man costs:
Essentially, it is in your best interest, as a developer, as a team, to encourage lots and lots of tests lower in the stacks shown here. It starts with comprehensive, checked in unit tests. It continues with having a strong, repeatable testing discipline (for which I recommend test automation).
Why? Because — as you move higher in the stack, that damned bug someone checked in is hidden behind layer upon layer of code. The further from the unit level a bug gets, the more components and environment variables get involved. The more of these that get involved, the harder it is to identify and fix, and the higher the cost.
Now your bug (our bug) has not only wasted your time, it’s holding up a release and wasting test engineers’ time (albeit, this is our job). The higher in the stack a bug gets, the higher the cost in wasted man, release and test hours.
For example — your typo in some messaging code manages to sneak its way through to the (E) Performance level. Let’s say your performance tests take, oh, a week to run to completion. For some reason, this sneaky beast only pops up when your system’s clocks resync after 6 days of runtime.
So, 6 days into a 7 day test — ding fries are done — the entire system poops itself. You now have to triage the crash, you have to fix it after you identify it (which is probably going to be hard — given it’s a performance test, you shut off non essential logging) and then you need to re run the test.
You lost 6 days. More than likely, those are 6 days of lost time you didn’t allocate for when you promised the fruits of this iteration/release to those wealthy Swedish bankers, eh?
God help you if your bug gets to level (F). This is called the “aversion level” because after a few of these sneak out, and the CEO of the company starts getting phone calls at 4am from those Swedish bankers, you’re either going to get a stern talking to, or some time in “the box” (all CEOs have a punishment box).
Your goal is to avert bugs from reaching Level F. F stands for F’ed in the literal sense.
My point isn’t just about cost. Given this tiered approach, and the need to find as many bugs as possible, you’re going to end up having some amount of code duplication between the higher levels of testing and the unit/smoke level — after all, most of the tests above that level are external-system level tests.
Some code — or logic — duplication on a higher level isn’t always bad, given the context of where the code is running. Not to mention, frequently, the code within the product may not be in the same language as the code that’s automating the tests. Duplication of unit test logic on a system-test-level is always going to happen.
Yes, you can and should reuse code as much as possible, but you can also do this through grey-box testing approaches (e.g. exposing APIs into system internals you would not normally have access to).
Also — this means you have to give your teams time to test. You need to give them ample time to automate what is reasonable, and you need to be willing to not ship a component or feature that simply isn’t ready. Much less one that hasn’t been tested.
The last thing you want is to have a bug, no matter what it is, hit level F. Your job, our job, on a software engineering team is to put out the absolute best product possible, and you can’t do that without filling in all of the magical testing boxes. You need to understand that with every step away from the code, the cost gets higher.
Letting bugs get into the hands of users is not entirely avoidable, but the risk can be mitigated, and many bugs that do end up in the hands of users are avoidable. The more (and sooner) you test, the less wealth you expend, and the happier you will be. And the more profits you will reap. We like money.
September 12th, 2008
… Or, learn to laugh at my total inability to do web design, and lack of django-fu
So, following up (albeit slowly) on my “Decent test case tracking/registration” post, I’ve actually managed to cobble together a google code project, and a rudimentary django application.
Right now, it’s in sub-prototype stages. I’ve done a semi-production deployment internally to get feedback/usage information and suggestions. All the code is checked in and now I need to begin cleaning things up from my rather random “pooping of code”.
Not only am I learning Django while I am doing this — I’m catching up on 6+ years of changes in the web development community. The last time I was involved in any sort of web-work was when I worked for Allaire/Macromedia — and even then that was primarily on the back end to ColdFusion, not end-user interfaces.
Writing user-interfaces above a command-line utility is not exactly my strong suit. But hell, Django made it wicked easy to start hacking things together. I had the rough-backend done in less than 2 hours, which let me spend the next few days pondering schemas, mucking with many-to-many fields and other django plugins.
If you go and look at the Google Code site, you’ll see I’ve started fleshing out the bits needed to outline the path of the project, and the general reasoning behind it.
Not only do I want feedback — I want to let anyone who wants to join, to join. Contribute ideas, tell me I’m doing it wrong. I already know my django code is messy (I’m working on it) — but most of all I want to help build something useful for the testing community, so if something doesn’t mesh, I want to know.
Now, I just need to read my copy of James Bennett’s “Practical Django Projects” book. And make a vector-image of a cartoony roomba, or find a better image of a robot butler.
September 8th, 2008
Peer-to-Peer systems aren’t something new. Things like Bittorrent, AllMyData Tahoe, and others have been using it for file storage for some time.
Still others use the distributed-worker methodologies to do work parceling — they register with the system, and the system hands out chunks of work without factoring in client speed/etc (e.g. distributed.net).
What if you combined the two — you used something like Bittorrent which does peer-selection and allocation intelligently, with a large distributed architecture to manage large scale test execution?
Let’s think about a common problem with test engineering. Start with a simple version — you’re designing a load test app, this app needs to generate large amounts of load against a target system.
In a normal test environment in a lab — this is “easy” — you simply make sure you have a lab with a bunch of clients, all on the same LAN and you run a test client from all of them that generate load against the system under test.
Now, let’s complicate the problem: You don’t have enough “same same” test clients. You may have some “close enough” but dang — they’re not on the same subnet, or you don’t know about them. Not having enough clients in a lab is more common than you’d think.
So how do you make a test that can take advantage of those test clients, factor in their “differences” and still make a relevant test?
Next problem. You have an application you want to run a battery of tests against. You don’t have a dedicated client, but you have the possibility of “borrowing time” from some idle machines to run those tests.
The “idle machines” all have different RAM and CPU, and are varying distances from the system under test on the network. You need to (1) find them, (2) figure out which of the available test clients is the most desirable, and (3) be able to figure out the main differences between the clients to factor them into results.
You simply want the more capable clients to get more of the “important” tests, and the less capable ones to run the lesser tests. Just to add to it, you want them to possibly be capable of being slaved to a given test to help it along (i.e. a performance or generalized load generation test).
Getting back to the original thought about peer-to-peer systems, I started considering the possibility of applying the peer to peer paradigm/weighted selection to test distribution.
You have a series of clients who volunteer to participate in the swarm. The client responsible for submitting the job (a test) to the swarm would use a Weighted Voting algorithm to rank, sort and choose the “most desirable” clients to distribute a test to.
Each client would respond to a submitted request with various attributes (weights) based on OS Type, number of hops from the client submitting the job and the system-under-test, amount of ram, network speed and so on.
In the case of performance based tests, you would be able to factor these attributes into the results of the test (e.g. latency) — in other tests, you only need to gather the results.
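One way the ranking step could look, sketched as a plain weighted sum rather than any specific weighted voting algorithm. The attribute names, weights and client records below are all made up for illustration:

```python
def score(client, weights):
    """Weighted sum of the attributes a client reported about itself."""
    return sum(weight * client[name] for name, weight in weights.items())


def rank_clients(clients, weights):
    """Return clients ordered from most to least desirable."""
    return sorted(clients, key=lambda c: score(c, weights), reverse=True)


# Hypothetical weights: reward RAM and bandwidth, penalize network distance.
WEIGHTS = {"ram_gb": 1.0, "net_mbps": 0.01, "hops": -2.0}

clients = [
    {"name": "desktop", "ram_gb": 4, "net_mbps": 100, "hops": 1},
    {"name": "server", "ram_gb": 16, "net_mbps": 1000, "hops": 3},
    {"name": "remote", "ram_gb": 8, "net_mbps": 100, "hops": 12},
]

ranked = rank_clients(clients, WEIGHTS)
print([c["name"] for c in ranked])
```

The same reported attributes can be stored alongside each test result, so latency-sensitive numbers can later be normalized by hop count or link speed.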
Of course, the concept of a “use idle machines to do something” isn’t exactly new — things like distributed.net, seti@home and others do this all the time as I mentioned before.
Then you have things like buildbot — buildbot uses a dedicated (or partially dedicated) pool of machines to compile a target and execute the local unit tests against the compiled thing.
Why not make the two go hand in hand and make an intelligent weighted selection for test distribution? Let’s go back to the localized example. You have a continuous build system which compiles and runs the unit tests. It then looks at a pool of test-peers who have volunteered to be part of the test-swarm and fires off the functional/regression tests (as needed, it can locally deploy or remotely deploy to a test-server).
The buildbot reports the steps as compile: pass, units: pass, and then regression: pending — the buildbot passes out the various tests to the swarm which can be executed asynchronously until all tests are completed (or error’d at which point they’re passed back to another client in the swarm).
The nice thing is that this works on both a local LAN, and a globally distributed series of test swarm participants. All you do is weight in favor of the closer clients. (oh, and your application has to be available on the network).
Over time, peers participating in the swarm can be “pushed out”, meaning they have error’d out too many times, have been caught “lying” and so on. The swarm can adapt: clients can come and go as long as a given passed-out suite eventually completes. If a client fails/drops, the test is simply passed out again.
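A toy, in-process sketch of that adapt-and-retry loop. Everything here is hypothetical: peers are plain callables standing in for remote clients, selection is simple round-robin, and max_errors is an invented threshold:

```python
import collections


def run_suite(tests, peers, max_errors=2):
    """Hand each test to a peer; on failure, re-queue it and count the error.

    peers maps a peer name to a callable(test) -> result that may raise.
    A peer that reaches max_errors is pushed out of the swarm.
    """
    pending = collections.deque(tests)
    active = collections.deque(peers)   # round-robin order of peer names
    errors = collections.Counter()
    results, evicted = {}, []
    while pending:
        if not active:
            raise RuntimeError("no peers left in the swarm")
        test = pending.popleft()
        peer = active[0]
        active.rotate(-1)
        try:
            results[test] = peers[peer](test)
        except Exception:
            errors[peer] += 1
            pending.append(test)        # pass the test back out to the swarm
            if errors[peer] >= max_errors:
                active.remove(peer)     # push the unreliable peer out
                evicted.append(peer)
    return results, evicted


def good(test):
    return test + ":pass"


def flaky(test):
    raise IOError("peer dropped")


results, evicted = run_suite(["t1", "t2", "t3"], {"a": good, "b": flaky})
print(results, evicted)
```

A real swarm would also need timeouts and some way to catch “lying” peers, for example by running the same test on two clients and comparing results.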
On a localized (meaning, internal-to-your-company) level, this means you can make any client on your network a peer on the system, and the weight-based selection system still applies and you can use any type of system on your LAN — desktops, servers, highly intelligent coffee makers — anything with a network drop.
Additionally, you could point test slaves at a cluster of installed systems under test: individual nodes in a web farm, or your application installed on various web hosts. Or a larger system installed in various data centers. This removes the bottleneck of a singular system being tested at once (but requires a lot of intelligence on the managerial level).
It’s an idea. Something of a disconnected series of thoughts — maybe it’s silly. I like the idea of being able to intelligently leverage a series of test peers distributed anywhere and everywhere. Having a peer-to-peer testing system would be neat-o.
It’s a zombie army used for testing –Anon :)
edit: Yes, a loosely coupled, highly distributed load test could be construed as a DDoS… But that’s semantics, right?
August 29th, 2008
So, I find myself using more and more YAML lately via the pyyaml package. When I was writing nose-testconfig my “preferred” format was/is YAML.
Now, an interesting thing I’ve noticed about all of the test configurations I am developing/working with is that they have a lot of “shared” attributes (that change infrequently) and a good number of things which change all the time.
This is the perfect spot for something like a dictionary merge. If you have a test config like this:
application:
    capability: 1
    url: http://foo
subsystem:
    max_users: 20
For each of your configuration files, you might only override something like max_users. For cases like this, it makes sense to load the template document (the file above), then load the second document and perform a recursive dictionary merge in which the second document’s values override the first’s, or something akin to that.
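A minimal sketch of such a merge, working on plain dictionaries of the kind a YAML loader hands back. The merge_config name is hypothetical, and the sample dictionaries assume one possible layout of the config above (subsystem as a top-level key):

```python
import copy


def merge_config(parent, child):
    """Return a new dict: parent values, recursively overridden by child."""
    merged = copy.deepcopy(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists and new keys: the child simply wins.
            merged[key] = copy.deepcopy(value)
    return merged


parent = {
    "application": {"capability": 1, "url": "http://foo"},
    "subsystem": {"max_users": 20},
}
child = {"subsystem": {"max_users": 500}}

config = merge_config(parent, child)
print(config)
```

Because the merge only sees dictionaries, it does not care whether the documents came from YAML, JSON or anything else, which keeps the loader decoupled from the file format.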
This is where my mental dilemma comes in. I could in theory, add a custom !!tag to the yaml which would take a /path/to/file.yaml and load it first, then load the second document or I could do it within nose-testconfig where you might run:
nosetests . --tc-file=myconfig.yaml --tc-rootconfig=parent.yaml
And then I would jump through the hoops (with a merge probably) within the plugin. The problem with that is that I’m worried about coupling the plugin too closely to yaml.
Now, the plugin already supports overriding multiple values. However, this doesn’t scale if you have to override a lot of them.
The most common reason I’ve found for this so far is adding new parameters and values to the YAML files — not all child configurations need to override/define the new values, instead they could just inherit from the parent.
So, the question is: how do (would) you do this in a way that:
- Doesn’t sacrifice clarity/readability
- Scales
- Doesn’t require the root document to be in the same location or have a hard coded path in the child document
- Doesn’t couple the loader (nose-testconfig) tightly with the file format
Right now, it’s copy, paste, edit all configuration files I know about, etc.
August 22nd, 2008
Recently, there was a thread on the testing-in-python mailing list around a proposal for a new tool called “Pythoscope” (discussion here).
Pythoscope’s mission — from the website is: “To create an easily customizable and extensible open source tool that will automatically, or semi-automatically, generate unit tests for legacy systems written in Python.” To which my general response is “woop”.
The initial version was released earlier this week. It has a launchpad site, and a detailed website.
This is pretty awesome. Just on a lark — I decided I’d run it against Python-trunk (what will become 2.6) — unfortunately, trying to generate tests for both the multiprocessing module and the threading module worked not. This is quite probably due to the fact I was not running it under the py2.6 binary on my machine, but rather the default python 2.5 — there’s some confusion about the “with” keyword ;)
I’ll unscrew my environment and get back to you on that one.
Otherwise, I ran it on some personal code, and it came up with a pretty decent series of test stubs. Then I decided to run it on svnmerge.py:
class TestGetRepoRoot(unittest.TestCase):
    def test_get_repo_root(self):
        assert False # TODO: implement your test here

class TestTargetToPathid(unittest.TestCase):
    def test_target_to_pathid(self):
        assert False # TODO: implement your test here

class TestSvnLogParser(unittest.TestCase):
    def test_object_initialization(self):
        assert False # TODO: implement your test here

    def test_object_initialization(self):
        assert False # TODO: implement your test here

    def test_revision(self):
        assert False # TODO: implement your test here

    def test_author(self):
        assert False # TODO: implement your test here

    def test_paths(self):
        assert False # TODO: implement your test here
Pretty neat — it generated all the stubs you could possibly think of. I am going to keep monkeying with it — and possibly contributing as it will save me a ton of time in the long run.
August 1st, 2008
Following up on my “Finding Python people is hard” I figured I’d send the call out again.
We’re looking for local-to-Massachusetts (we’re in Acton, MA) people who are interested in joining a dynamic, quality-focused test/automation team. Ideally, candidates are fluent in both testing (areas may include: performance, regression, web, streaming video, storage) and Python programming.
If you’re a strong testing person with some programming (maybe you’re not fluent in Python), we’d still be interested: we have no problem teaching you Python. If you’re a strong Python person, but maybe without a testing background, you’re also welcome. Internally, we use Java/C++ and Python; experience with all, or some, of those languages is great.
Even if you’re just starting out (perhaps you’ve just graduated college), we’re looking for people that want to be great engineers. We look for strong engineering skills and contributions to open-source work, and we’re looking for people of many skill levels to join the team.
The role is for someone to join the Engineering team with a focus on Automated test engineering. We don’t slot people into “just testing” or “just dev” — we hire great engineers, and people who want to be great engineers. The core developers help drive testing, and the testing-focused people help drive core development. The entire company is focused on providing the highest quality product to our customers.
As part of this role, you will be developing everything from simple unit tests to highly complex functional level tests. Some of the more challenging aspects include the fact that the product itself is very performance-driven, so the tests we develop (in Python) have to be able to drive a product capable of pushing tens of gigabits of video data across the wire to its very limits. The product uses distributed technologies and is a loosely-coupled system, and we have to test and prove that as well.
Internally, we’re using such tools as the processing library for concurrency, Nose, YAML, etc. We encourage open-source contributions and community involvement (see the nose plugin I recently open sourced, and the PEP 371 work I’ve been able to do) and exploration of new technology that might help us devise more efficient testing strategies. If you like pushing boundaries — this is the place for you.
A great example of one of the challenges is a test I’ve worked on for some time — I’ve had to design a highly concurrent test that can leverage a single test client’s resources to the max to drive load against the system, while also generating statistical anomaly events to trigger internal behavior to the system. Of course — just designing it for one test client won’t scale: This test has to be locally concurrent as well as have the ability to spread out to multiple testing clients. Oh — and it has to generate hundreds of gigabytes of data as fast as it can to push the system.
Some of the technologies I’ve been personally exploring are the Actor-Model approach to concurrent programming, Twisted for asynchronous/concurrent testing, etc. No technology or approach is excluded — we approach all of the test development with the zen of python in mind:
There should be one– and preferably only one –obvious way to do it.
And just to add to that: There is only one way to do it: The way that works. If we find that an old approach doesn’t scale or do what we need it to do, and we have a new approach that can do it better, faster, cheaper, etc — we’re not afraid to adopt it.
This is a startup: and we’re really ramping up on the automation of the tests — so nothing is set in stone. We use Ubuntu, OS/X and even Windows for the development environments. Pick your editor, pick your machine — the team is focused on making us, and you successful. Every engineer in this organization is empowered to do what it takes to get the job done.
If what I am saying sounds interesting — send me an email, or post a comment. I’m very interested in talking to you.