Schrödinger's Type (is a namespace a box?)

by jesse


I chose the title for today's foray into Duck/Latent typing from the (in)famous thought experiment Schrödinger's Cat, wherein a cat is placed in a box with a radioactive isotope and one cannot observe the state (type) of the cat without irrevocably altering said state. I felt it was a dutifully ironic view of Python's lack of "contractual enforcement" of types except at runtime. My "type(Duck)’ing: On Duck vs. Static Typing" post received some minor attention, but as with all things of an obsessive nature, I wanted to follow up and delve more into the Python(ic) aspects of typing as I understand them ((Disclaimer: I am not always correct; I too am constantly learning and adapting. What may seem clear now may change as technology changes or as I learn more)).

Henry Story, the author of the article which triggered my last post, made several comments, the last of which stuck out the most for me:

...snip "But in fact these languages don’t do this. they just look to see that there is a method that is named “Quack”. If there is it gets called as if it was clear that was it means was the sound. But why could a dog not have a “quack” method that meant to kill a cat? There would be no interface broken here. Just a clash of words, and we have those a lot in the english language. Things such as “bank” (the place you deposit your money at) and “bank” of a river… We disambiguate english because we take context into account, just as a programmer makes sure that when he gives objects to a method that will duck type on “quack” he makes sure never to give the method objects where “quack” means something else. Notice how this has trouble scaling though.

Anyway, I would be glad to be shown to be wrong. I just looked up a book on Ruby, and that confirmed my thinking. Do you have a pointer to a piece of code that could resolve the issue?"

As others pointed out, semantics and context matter when programming. There is simply no way around context being relevant: much as in language ((If you want more on Language vs. Programming, speak to "r0ml" Lefkowitz)), context and intent give relevance and inflection. Yes - a word can mean many different things - hell, some people make stuff up.

Again ignoring the fact that Henry is trying to address the semantic web with the concept of a URI/hierarchy based "typing enforcement scheme", I figured I would delve into, well, Python's faculties in this area (hint: namespaces and usage give context, and ergo you can derive type). But first, a digression into namespaces.

I could speak to a person and say "How was your Day". In their crazy moon-speak, maybe they redefined "Day" to mean "Underpants". So, I just asked them how their underpants are doing - while amusing, in most places I would get slapped or fired. This is how internet flamewars break out - someone has a broken understanding of the word/world - or they don't have context and/or relevance. Who is in fundamental violation of the contract of language? Me or the person who redefined the interface (word)?

Welcome to Static Typing - because in the static world we do not trust that other programmers, people, scripts, applications or for that matter - ourselves - understand intent or context ((I have a feeling this is where the verbosity critique of Java comes into play)).

Duck typing allows for fudge ((with the implementation of ABCs, you can make the fudge more solid)) - sure, crazy McCrazerton thinks that Day means pants, Duck means Dog and all sorts of wrong things - but what if his definitions weren't explicitly "wrong"? What if they were "close enough"?

By "close enough" I mean that given the context of what I was saying - he could figure out I wasn't addressing his private parts - but rather I was inquiring as to his current state. Sure! His word has all sorts of weight behind it. When he says "Day" or "Dog" - the word can bring all sorts of interesting interfaces with it - but who cares what he tacked onto the damned word: to me, if it's "Day" - then I can at least have an idea of what the hell is going on.

If it walks like a duck, quacks like a duck, and has wings, then it could be a Dog in a Duck costume. Who cares?

In our case - the Type is the Cat, the Namespace is the Box, and the isotope in the box is the inferred context of the type when we "open the box".

Code time! Yay!

At Python's very core is the concept of the namespace. That's important, as everything within Python lives in a namespace and has scope and "privacy ((see __methodName))" inherent in that design.

For instance - when you perform an import, the package loader walks into the target package's namespace and works on finding the target - for example:

import animal.species.dog.quack

This means that the interpreter walks into animal, then species and dog, and snags the quack.py module contained there. It then compiles the associated Python code to byte-code and binds the top-level name animal in the global scope, through which the imported module can be reached. For example, given this module "structure":

woot:~/tmp/ jesse$ mkdir -p animal/species/dog
woot:~/tmp/ jesse$ cd animal/
woot:~/tmp/animal jesse$ touch __init__.py
woot:~/tmp/animal jesse$ cd species/
woot:~/tmp/animal/species jesse$ touch __init__.py
woot:~/tmp/animal/species jesse$ cd dog/
woot:~/tmp/animal/species/dog jesse$ touch __init__.py
woot:~/tmp/animal/species/dog jesse$ touch quack.py
woot:~/tmp/animal/species/dog jesse$ vi quack.py

And within quack.py, all I put is:

def quack():
	return "Woof"

I pop into the interpreter and do this:

woot:~/tmp/duck jesse$ python
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import animal.species.dog.quack
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, 'animal': <module 'animal' from '...'>}
>>> quack
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'quack' is not defined
>>> animal.species.dog.quack
<module 'animal.species.dog.quack' from '...'>
>>> animal.species.dog.quack()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
>>> animal.species.dog.quack.quack()
'Woof'
>>>

As you can see: I told the interpreter to import ((see also: Importing Python modules)) the full "path"/name of the module I wanted from dog, quack. Of course you can do things like this:

>>> from animal.species.dog.quack import quack
>>> quack()
'Woof'
>>>

But given that import statements are in the first few lines of your script ((see PEP 8)), or immediately near the code which is referenced, the scope and intent of quack within this session/application is clear. Most of the time people will pull in the top-level package, or the package right above the foo.py they wish to reference; in my case, it would be dog. I would then reference dog.quack.quack() when I needed to call the quack function in quack.py.
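To make the name-binding behaviour concrete, here is a minimal sketch, assuming a current Python 3 interpreter rather than the 2.5 session above. It rebuilds the same animal/species/dog/quack.py layout in a temporary directory (the temp directory and the `namespace` dict are my own scaffolding, not part of the original setup) and checks which names the dotted import actually binds.

```python
import importlib
import os
import sys
import tempfile

# Rebuild the animal/species/dog package layout in a throwaway directory.
root = tempfile.mkdtemp()
package_dir = root
for part in ("animal", "species", "dog"):
    package_dir = os.path.join(package_dir, part)
    os.mkdir(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()

with open(os.path.join(package_dir, "quack.py"), "w") as handle:
    handle.write('def quack():\n    return "Woof"\n')

# Make the temporary directory importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

# Run the import inside a fresh namespace so we can see what it binds.
namespace = {}
exec("import animal.species.dog.quack", namespace)

bound_names = sorted(name for name in namespace if not name.startswith("__"))
print(bound_names)  # only the top-level package name ends up bound
print(namespace["animal"].species.dog.quack.quack())
```

The dotted import binds only `animal`; everything else is reached by walking the namespace, which is exactly where the context comes from.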

Now, import tricks, relative imports, etc. are all interesting - and by far they give some of the best context that you could want about the initial intent of a method, class or object - but import renaming is another fantastic thing:

>>> from animal.species.dog.quack import quack as dogQuack

I've just saved myself some heartburn. I don't like calling foo.bar.baz.yourmom - I like calling function() or method() without the import/namespace foreplay. But I have the decency to rename the import to dogQuack so that when I reference it, I am always reminded about whose quack I'm quacking ((duck analogy: officially beaten to death)).
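The same renaming trick works on any pair of identically named functions. As a small sketch using only the standard library (the `json_dumps` and `pickle_dumps` aliases are names I picked for illustration), two unrelated functions both called `dumps` can live side by side once each alias carries its namespace with it:

```python
from json import dumps as json_dumps
from pickle import dumps as pickle_dumps

payload = {"sound": "Woof"}

# Same word, two meanings; the alias keeps the context attached.
as_text = json_dumps(payload)     # a JSON string
as_bytes = pickle_dumps(payload)  # a pickled byte string

print(type(as_text).__name__, type(as_bytes).__name__)
```

This is the "bank" vs. "bank" problem from the quote, resolved by the import line rather than by a type declaration.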

You should read the import section in PEP 8. No, seriously.

So now that we've gone and rambled on about the context of where a Class/Object or Method might come from, or be inferred from, we can move on to real things, like types!

Guido van Rossum: In Python, you have an argument passed to a method. You don't know what your argument is. You're assuming that it supports the readline method, so you call readline. Now, it could be that the object doesn't support the readline method.

Bill Venners: And then I'll get an exception.

Guido van Rossum: You'll get an exception, which is probably OK. If this is a mainline piece of code and something could possibly be passed to you that doesn't have a readline method, you'll discover that early on during testing. Just as much as in a typed language when you have an interface and you know you're getting something that has the right interface but doesn't implement the right thing, or it throws an unexpected exception. You'll hopefully find that during testing.

In addition in Python, because there aren't fixed protocols, something else can be passed that also supports readline and doesn't happen to be a file, but does exactly what you need. All you need at that point is something that returns lines.

This quote hints at one of the key things about Python: Exceptions should not pass silently. More on that later.
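The readline scenario from the interview can be sketched roughly like this. `first_line` and `CannedLines` are hypothetical names of mine, written for a current Python 3 interpreter; the point is only that the function never asks what type it was handed.

```python
import io


class CannedLines:
    """Not a file, but it answers readline() the way a file would."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        # Return the next line, or an empty string when exhausted.
        return self._lines.pop(0) if self._lines else ""


def first_line(source):
    # No type check: we only assume source supports readline().
    return source.readline().rstrip("\n")


print(first_line(io.StringIO("quack\nwoof\n")))
print(first_line(CannedLines(["moo\n"])))

try:
    first_line(42)  # an int has no readline(), so this blows up loudly
except AttributeError as exc:
    print("caught:", exc)
```

Both the real file-like object and the stand-in work; the integer fails immediately with an exception instead of quietly doing the wrong thing.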

Python has some basic object types - you know, your run of the mill int(), float(), str(). One of the first things you learn in Python is what each one means. For the sake of this discussion, we'll focus on dict() (dictionary) - Python's hash/map type.

If we have an object - let's say baz - and we declare it to be a dict, we can politely inquire as to its type:

>>> baz = {}
>>> type(baz)
<type 'dict'>
>>>

The builtin type() function returns an object representing the type of the object you passed to it. Not the object itself, and not the string representation of the object's type. If we wanted to be banal about this, we could do this:

>>> baz = {} # assume someone passed us baz
>>> foo = {} # we make foo into an empty dict...
>>> type(baz) == type(foo)
True
>>>

And yes, this sort of check works if one actually has something in it:

>>> foo = {'hi':'mom'}
>>> type(baz) == type(foo) # yes I know, isinstance() - I'll get to that.
True
>>>

But if we compare the two objects directly (instead of the type objects returned by type()) we see that they are different:

>>> foo == baz
False
>>>

So - back to the topic of ducks - what makes our dict quack()? The methods it supports. Again - if it looks like a dict, smells like a dict and works like a dict, then it is a dict. Right ((see Library Reference for Mapping Types))?
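Since isinstance() keeps coming up, here is a short sketch of how the three kinds of comparison differ, assuming a current Python 3 interpreter. `OrderedDict` is my choice of example for "a dict that is not exactly a dict"; it is not something the session above uses.

```python
from collections import OrderedDict

baz = {}
foo = {"hi": "mom"}
ordered = OrderedDict(hi="mom")

same_type = type(baz) == type(foo)          # compares the type objects
same_value = foo == baz                     # compares the contents
exact_match = type(ordered) == dict         # exact type comparison rejects subclasses
instance_match = isinstance(ordered, dict)  # isinstance() accepts subclasses

print(same_type, same_value, exact_match, instance_match)
# → True False False True
```

An exact type() comparison turns away a perfectly usable mapping, which is part of why it is the least duck-friendly check of the three.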

Let's look at the methods a dict has, shall we?

>>> foo = {}
>>> dir(foo)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>>

Now, I want to make my own object - say, MyDuck ((yes, I like BouncyCase classes and methods, leave me alone.)):

>>> class MyDuck(dict):
...     def __init__(self):
...         pass
...
>>> dir(MyDuck)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', '__weakref__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']

Behold: MyDuck supports all of the same attributes as dict() - and this brings us full circle.
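Rather than eyeballing two long dir() listings, the comparison can be done with sets. This is a sketch for a current Python 3 interpreter (so the attribute names differ from the 2.5 output above), and the `quack` method is a hypothetical addition of mine so that the subclass has something of its own to show.

```python
class MyDuck(dict):
    def quack(self):
        # Hypothetical extra method, not part of the original MyDuck.
        return "Quack"


dict_attributes = set(dir(dict))
duck_attributes = set(dir(MyDuck))

# Anything dict offers that MyDuck lacks (there should be nothing).
missing = dict_attributes - duck_attributes
# Anything MyDuck adds on top of dict.
extra = duck_attributes - dict_attributes

print("missing from MyDuck:", sorted(missing))
print("quack added by MyDuck:", "quack" in extra)
```

An empty `missing` set is the programmatic version of "it has everything a dict has".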

If I create a Quack object in animal.species.duck or animal.species.duck.quack - then you can be sure that what you are looking at is a duck's Quack - you have the context - you can look at the attributes of the Quack() to ensure that it is, in fact, Quack.quack-able.

If some genius makes an animal.species.dog Quack object - then if you want the duck's quack - why are you looking at a dog to supply your much needed quack? Maybe you want the special dog quack that uses the duck's quack:

>>> class MyDog(MyDuck):
...     def __init__(self):
...         pass
...
>>> dir(MyDog)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', '__weakref__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>>

We now have MyDog, which has all of the attributes of MyDuck, and all of the attributes of a dict object. This is fantastic as we can remove/override/extend things from MyDuck or dict inside of MyDog and move on with our lives.

What if the methods clash, you say? What if MyDuck and MyDog both have the .quack() interface, but in the case of MyDog, a woof is returned instead of a quack? The answer: why are you using Dog's quack when you know it returns something other than what you want, i.e. a duck's quack?

But what if you don't know that MyDog's quack is different from MyDuck's ((why are you coding?))? Easy - when you foolishly pass MyDog() into your script's makeQuackingNoises() function, and you look for the returned 'quack' value - you're going to get an exception as soon as that check fails - and exceptions should not pass silently in the night.
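Here is one way that scenario might look, as a sketch only. The bodies of `MyDuck`, `MyDog` and `makeQuackingNoises()` are my guesses at what such code would contain (the post names them but never shows them), and the explicit check on the returned value is an assumption about how the caller "looks for" the quack.

```python
class MyDuck(dict):
    def quack(self):
        return "Quack"


class MyDog(MyDuck):
    def quack(self):
        # Same interface, different meaning.
        return "Woof"


def makeQuackingNoises(animal):
    noise = animal.quack()
    # The caller cares about the result, so it checks it and fails loudly.
    if noise != "Quack":
        raise ValueError(
            "%s.quack() returned %r, expected 'Quack'"
            % (type(animal).__name__, noise)
        )
    return noise


print(makeQuackingNoises(MyDuck()))

try:
    makeQuackingNoises(MyDog())
except ValueError as exc:
    print("caught:", exc)
```

Note that the exception only shows up because the caller inspects what came back; a caller that ignores the return value would never notice the dog.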

But you say: A compiler would have told me this ahead of time! To which my reply would be: A compiler protects you from the most base version of human error - the typo. Whether the typo is intentional (you changed the method without watching what someone was passing you) or unintentional (you meant Dog but typed Duck), the compiler can't protect you from more serious "pilot errors" (you pass in something that works, sort of). Ergo - Testing!

I know Python is not perfect ((see: python pitfalls, python warts, python gotchas)) and as I have said before, there is something nice about static contractual enforcement - that's why I like Python 3000's ABCs (or even the Roles implementation).
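For a taste of what ABCs buy you, here is a minimal sketch using the standard `abc` module on a current Python 3 interpreter. `Quacker`, `Duck` and `Mute` are hypothetical classes of mine, not anything from the packages above.

```python
from abc import ABC, abstractmethod


class Quacker(ABC):
    @abstractmethod
    def quack(self):
        """Return the sound this animal makes."""


class Duck(Quacker):
    def quack(self):
        return "Quack"


class Mute(Quacker):
    pass  # forgot to implement quack()


print(isinstance(Duck(), Quacker))

try:
    Mute()  # the contract is enforced at instantiation time
except TypeError as exc:
    print("refused:", exc)
```

The contract is still checked at runtime, but it is checked when the object is created rather than when someone finally calls the missing method.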

Also, yes - method signature enforcement can get confusing sometimes - I had to chase down a bug I checked in yesterday that was the direct result of me playing fast and loose with the rules around method arguments ((See: Method signature checking decorators)).
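A signature-checking decorator can be fairly small. The sketch below is my own illustration for a current Python 3 interpreter, not the decorator referenced in the footnote: `check_annotations` and `repeat_quack` are hypothetical names, and it only handles annotations that are plain classes.

```python
import functools
import inspect


def check_annotations(func):
    """Raise TypeError when an argument does not match its annotation."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bind() itself raises TypeError on missing or unexpected arguments.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # unannotated parameters are not checked
            if not isinstance(value, expected):
                raise TypeError(
                    "%s must be %s, got %s"
                    % (name, expected.__name__, type(value).__name__)
                )
        return func(*args, **kwargs)

    return wrapper


@check_annotations
def repeat_quack(sound: str, times: int):
    return " ".join([sound] * times)


print(repeat_quack("Quack", 2))

try:
    repeat_quack("Quack", "2")
except TypeError as exc:
    print("caught:", exc)
```

You opt in function by function, which keeps the enforcement exactly as strict as you decide it needs to be.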

I'd also like to point out Collin Winter's typecheck module, which you can add in as well.

Going back to the original points, however: duck typing is flexible, powerful and yes, like all things involving those, it can be dangerous. But namespaces provide your context, the inquisition of objects at runtime is easy to do, and you can enforce the required contractual obligations as much as you need or want to.
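As a rough sketch of that runtime inquisition, using only builtins (`can_quack`, `Duck` and `Rock` are hypothetical names of mine):

```python
def can_quack(candidate):
    """True when candidate exposes a callable attribute named quack."""
    return callable(getattr(candidate, "quack", None))


class Duck:
    def quack(self):
        return "Quack"


class Rock:
    quack = "not a method"  # right name, wrong kind of thing


print(can_quack(Duck()), can_quack(Rock()), can_quack(object()))
# → True False False
```

This checks the interface and nothing more; whether the quack means what you think it means is still down to the namespace it came from.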

Python's power comes from its flexibility - and believe it or not, duck typing. Without it, we would lack some of the grace and power that comes with the language. A dynamic language should be exactly that: Dynamic. Static type enforcement/interfaces have benefits, but they only protect you against the simple bugs that a compiler can be smart enough to test for.

For more thoughts/information on this, also see isinstance() considered harmful, as well as the links in my previous post.