A short list of things I don’t like about Python


Yeah, I haven’t posted in a while – since PyCon I’ve been sick off and on, working my butt off at the place which allows me to purchase ice cream for my kid, and so on. Busy busy. Not to mention, I’ve been suffering a slight case of burnout – long story.

That all being said, I think it was last week when I twittered a minor philosophical point which was picked up and run with by pydanny. The little point I made was something like:

I don’t think it’s unreasonable to be able to name at least 5 things you don’t like/would change about something you love. Implementation details are fair game too

Now, before I delve into my personal list, I want to provide some context to this comment. It actually has some history, and it’s not an original thought – I think Titus Brown started a meme around this last year. In his case, it was purely based around Python.

In my case, I’ve long maintained that if you cannot name things that irk you, that you would change, or that you generally dislike about something you supposedly love (not just a language, but a tool, an OS, etc.) – then it shows you have a certain lack of self-awareness or pragmatism (there is another word I’m grasping for here, but it escapes me).

Historically, I’ll ask this in interview situations, whether I’m speaking with a test engineer (name and explain 5 things you love/hate about automated testing) or a programmer (name and explain 5 things you love/hate about language $FOO). Generally speaking, this is great discussion fodder, and allows you to probe the thought process of the candidate.

For example, if someone says “I hate Python’s whitespace” and they’re interviewing for a Python coding position, I think it fair game to dig into that a bit and see if it’s rational, and ultimately ask the question: If you hate something so fundamental, why do you use it/why do you want to program in it for the foreseeable future?

In any case, I promised a few people I’d give them the shortlist of nits (I don’t hate these things, I simply dislike them) I have with Python. It’s important to remember that:

  1. I contribute work to python-core (see the multiprocessing module)
  2. I program in Python daily for work, and in my free time too.
  3. I participate (when I’m not on a self-imposed exile) on the mailing lists and discussions (see python-dev, etc.)
  4. I, too, am a strong believer in “put up or shut up”

Now, part of me is sad that I have to preface my being critical with a disclaimer like the above; but alas – some people, especially those on the internet, thrive on controversy and fail to read more than 5 or 6 words before posting some half-witted response, or worse yet, someone skims to the gripes I have, finds one they want to take me to task about and says “SUBMIT A PATCH !!!11”.

I do contribute back, so you can avoid telling me to submit a patch, ok?

That all being said, here’s my list:

  1. Concurrency: This is actually a love/hate topic for me. Obviously, I’m the maintainer of the multiprocessing module, which sidesteps the GIL, but the GIL is still an irritant for me (given I do write a lot of threaded code). A lot of people are very familiar with the fact I am a proponent of threads and processes/IPC, as both serve different (yet overlapping) purposes. There is room for both. Hopefully unladen-swallow will be able to get rid of the GIL, and then we can all move on with our lives, so long as in killing it we don’t hose the ecosystem of C extensions.
    • Additionally, I would love to see a decent coroutine implementation included in the standard library once PEP 380 is done and in the bag. If you need justification, see David Beazley’s coroutine talk. Some people might disagree, saying that coroutines/processes/threads all “do the same thing” and that having all three would violate “TIOWTDI” (There’s Only One Way To Do It); I would strongly disagree with them. In the case of concurrency, different solutions fit different problems. We do not have a grand unified theory of concurrency within Python.
    • Also in the concurrency vein, I would like to see a cross language messaging/serialization system/format eventually come in. Right now, we have pickle; most recently, JSON – and JSON might be the final answer in this regard, but something akin to protocol buffers has also piqued my interest. Given we have JSON, I’m not terribly hot on this one.
    • Finally, I’d like to see more of the java.util.concurrent abstractions migrated in. I mean, using python threads isn’t hard, seriously, but more/better abstractions make things nicer for everyone.
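
A minimal sketch of the kind of higher-level abstraction meant in the last point, assuming a Python new enough to ship concurrent.futures (it arrived in 3.2, after this list was written). The word_count function and the sample strings are made up for illustration, and the pool still runs under the GIL, so this helps I/O-bound work more than number crunching:

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Hypothetical unit of work: count whitespace-separated words."""
    return len(text.split())


def count_all(texts, max_workers=4):
    """Run word_count over texts on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands back results in the same order as the inputs.
        return list(pool.map(word_count, texts))


if __name__ == "__main__":
    print(count_all(["flat is better than nested", "one ring", ""]))
```

The caller never touches a Thread or a Queue directly. Swapping in ProcessPoolExecutor is the usual route when the work is CPU-bound, though that brings pickling constraints with it.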

  2. The Standard Library: This, again, is a love/hate thing – I love the standard library, and I will gladly argue with anyone who suggests getting rid of it. That said, I would like to see the entire thing get a much better documentation treatment: the docs, while good, could be 1000x better, clearer, etc. I would also make every single module in there PEP 8 compliant. I know that sounds like a style-nazi thing, but if that’s the style we’re to use, I think the first thing to adhere to it should be the standard lib.
    • It’s also disorganized. While flat is better than nested, I’m sorry – but I think making it deeper and putting all the things like one another into the same namespaces does make sense.
    • I would also break out the stdlib from core. This idea was discussed at the Python Language Summit, and I think almost everyone there was in agreement. The idea would be to separate out the stdlib into its own path inside the repo, and other Python implementations (such as Jython/etc) could use that copy as their copy of the stdlib modules. Anything which was CPython specific (such as multiprocessing) would stay with core/be marked as CPython only.
    • Taking this concept of breaking out the standard library a little further: I would begin to evolve it a little more quickly. There’s a strong difference between changes to the language and changes to the standard library. In the case of the former, it should evolve slowly and carefully. In the case of the latter (the stdlib), I think it could, and possibly should, evolve more quickly. By evolve, I mean “get cleaned up, have things removed/added” more quickly. I do not, however, mean with less thought. There’s obviously a lot of “buts” and other concerns with this idea, but it’s just a thought. I think compartmentalizing this into python-core and python-stdlib meshes with how a lot of people think about things.

  3. The Docs: I touched on this in the stdlib one, but the standard library documentation, as well as rich examples for a lot of the core features are lacking. Many of them focus on syntax and not necessarily on use. For example, I would gladly integrate all of Doug Hellmann’s Python Module of the Week posts into the standard library documentation tomorrow, and wholesale if I could – his examples are much more rich than those we find in the current docs.
    Many people, including myself, have been working on making these better – in my case, I need to overhaul the multiprocessing docs when I have a chance.
    • Don’t get me wrong – I actually appreciate the docs we have, they keep me sane, but they can be better, clearer and in some cases, more practical. One or two usage examples just don’t cut it.

    • update see: http://tosh.pl/gminick/gsoc/sphinx/


  4. Packaging: Ahhhhh! I’m not going to go too deep into this rabbit hole, especially given I know Tarek is hacking away at making python packaging a much better animal, but the entire setuptools/eggs/distutils/etc pile is, well, frustrating. I just want a clean, standard way of packaging my packages, built into core, that doesn’t force me to install into the global site-packages directory. Also, uninstall, dammit. I know setuptools and easy_install and eggs were designed to scratch an itch, and I do use easy_install, but the entire pile of things needs to be made into a standard, implemented in core, and we need to move on.
    • However, as I pointed out during the language summit – I don’t think something like easy_install belongs in core, instead I think core should make what easy_install does (to a certain extent) easier and standard, so people can use whatever tools/scripts/etc they want. One ring to bind them!

  5. Linting: Ok, face it, if you’re on a big enough team, you need to have a pre commit hook for your VCS that lints the code, and yacks if it doesn’t conform. I would love for one to be built into the stdlib, but something like pylint is too big, pychecker is too simple, and I haven’t used pyflakes recently enough to comment. There was a thread on python-ideas about this recently – and maybe Jeremy Hylton is right, and it doesn’t belong in core, but if that’s the case, we need to pick one to “endorse” on the python doc website. Maybe in a “getting started with developing python” document, which is linked in size 30 font, and links to a linter, maybe the pep8.py and reindent.py scripts, etc. It should be painfully obvious where to get and how to use these tools. Yeah, I know “waaaaah why didn’t they link to mine” – well, because we liked this one over here more. QQ.
    • As it stands, I can not count the number of times I’ve been asked about linters and style checkers for python code. Maybe we make three packages: python-core, python-stdlib and python-tools.

  6. Optional Static Typing: This one doesn’t make me feel like I’ll make any friends, but I would love to have pre-runtime, static typing as an option to python – maybe as a --anal-types flag. Guido has discussed (part 2) the difficulties of this before, so I don’t think this will ever come (the closest we get to “type safety” is function annotations, which make me feel funny in sensitive places). The biggest reason I have for static typing of any flavor, is that I would prefer to have the ability to catch some errors prior to runtime. That’s all. On a big enough team, all hacking on the same (massive) python code base, I’ve found you do want the ability to turn something like this on – it helps you with a (small) class of very annoying bugs.
    • I do love me the dynamic/late typing system of Python, and I use it to my advantage as much as possible. So, I wouldn’t trade the dynamism of Python for static types, it’s just a nit I have. Of course, maybe something like interfaces (as Jacob points out in his list) might solve some of the issues I have (mainly bad people doing silly things). The rest of the stuff is why I write unit tests and actually run the damned code.
    • Yes, I know the drawbacks of something like this; I also don’t have some sort of magic solution to be able to wave a wand and do this. Nor do I have a concrete proposal, otherwise you’d see an email on python-dev. Other people much smarter than me have pointed out the sheer enormity and numerous drawbacks to something like this. No, I don’t expect magic fairy dust to suddenly appear and just “make this work”.
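
For reference, function annotations only attach metadata; nothing checks them before (or during) a run. The sketch below shows one hypothetical use: a check_annotations decorator, not part of the standard library, that enforces simple class annotations at call time. It assumes Python 3.3+ for inspect.signature, and it is a runtime check, so it is not the pre-runtime option wished for above:

```python
import inspect
from functools import wraps


def check_annotations(func):
    """Hypothetical helper: enforce plain-class annotations at call time."""
    signature = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = signature.parameters[name]
            expected = param.annotation
            # Skip unannotated parameters and *args/**kwargs collectors.
            if expected is inspect.Parameter.empty:
                continue
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            # Only plain classes are checked; anything fancier is ignored.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    "%s should be %s, got %s"
                    % (name, expected.__name__, type(value).__name__)
                )
        return func(*args, **kwargs)

    return wrapper


@check_annotations
def repeat(text: str, times: int) -> str:
    """Example function: the annotations are what the decorator reads."""
    return text * times
```

Calling repeat("ab", 2) goes through untouched, while repeat("ab", "2") raises TypeError at the call site rather than somewhere deeper in the code. That catches the mistake earlier, but still only on a path that actually runs.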

  7. Standard Library Part II: Yeah, you might notice a lot of my gripes are around the stdlib – but in particular, I want to point out the state of XML handling in the standard library is about as clear as wearing glasses made of meat. Additionally, the httplib/urllib/urllib2 thing? Yeah. No.
    • While I’m harping on this stuff, get rid of the commands module; anything it does that is not in subprocess should be put there. Since I mentioned subprocess: it needs more documentation too, and non-blocking, asynchronous handling of subprocess input/output should be easy and built in. There’s a GSoC project around this spinning up, so we’ll see.
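
As a rough illustration of the commands-versus-subprocess point, here is a sketch of getting an exit code plus combined output using subprocess alone, which is roughly what commands.getstatusoutput was used for. The run_and_capture name is made up for this example, and the call blocks until the child exits, so it does not show the non-blocking handling asked for above:

```python
import subprocess
import sys


def run_and_capture(argv):
    """Return (exit_code, output) for argv, run without a shell."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into stdout
        universal_newlines=True,    # decode output to text
    )
    output, _ = proc.communicate()  # waits for the process to exit
    return proc.returncode, output.rstrip("\n")


if __name__ == "__main__":
    code, text = run_and_capture([sys.executable, "-c", "print('hello')"])
    print(code, text)
```

Passing the command as a list avoids the shell entirely, which is one of the practical differences from the string-based commands API.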

That all being said, would I trade python for something else? Not right now. Most of my nits are exactly that: nits, and most of all, they’re not impossible to change or resolve (given enough time, and resources).

I can make a similar list for OS X, Linux and other things I use day in and day out – hell, I can make one for myself (ask my wife about me griping about me sometime). I can probably make a list like this for every single thing I’ve written, tools, scripts, apps, etc.

Like I said, being aware of, and trying to overcome your own shortcomings is how we all improve. In the case of a language, you can’t just keep adding things into a standard library and call it “better” – you have to take a constant look at what you’ve done to date with a critical eye, and ask yourself “what can we do better”.

  • Introspection is the word you're looking for, not self-awareness.
  • SUBMIT A PATCH !!! 1

    And besides, that's seven points, not five. You are breaking my heart.

    No, actually, I pretty much agree on 1, 4 and 7. For the others, I don't have a problem with that, because I'm a reasonable man. George Bernard Shaw wouldn't like me.
  • iestyn
    I have just started with Python. The threading issue concerns me. BUT, after reading about it a bit, it only really affects number-crunching stuff and not throughput code. I have been doing a bit of threading stuff with C# and, before that, Delphi. You know, threading ain't that great a thing. With both languages you have all that you describe in your perfect world of statically typed, threaded systems that can access all processors. However, the scalability of programs has led me to creating systems that interact and co-operate in a grid style. Also, static type systems are a nightmare for adding plug-in solutions. Have you ever dealt with a god damn class inheritance tree that you can't change after code goes to production? If you look at the new C# stuff, all that is Python is being introduced into it in versions 3.5 and 4.0: dynamic typing, Linq (list comprehensions!!!). The only gripe in my small and limited time in Python is that of the GIL being a reference-based thing. But I give it one thing: it's fast and it does kill objects in a timely fashion, which is more than I can say for C# and Java. By the way, keep up the good work on the processor modules, it's good stuff. I love the ability to create new processors, and not have the underlying VM decide if I should have one or not :)
  • What about an IDE? I come from a Java background, but I use Python for our system administration scripts and have been working on a Python + PyQt project as a hobby. I have yet to find something that will convince me to switch from Emacs.

    I would like to be able to rename a module or a function and have all the code that references it be updated. I like autocompletion. I'd like my IDE to statically analyze the code (perhaps via pyLint or PyChecker) and give me an overview of warnings and errors in the entire project. I'd like easier navigation throughout code - for example, being able to easily find all the places where a function is referenced and jump to those places.

    Of course, this is not the fault of Python. It just happens that Java is the "hot" language out there, and IBM decided to throw millions of dollars behind building an excellent IDE for it. Having said all of that, I do have to say that I've read good things about Eclipse + PyDev since the last time I tried it, so I will give it another shot this weekend.

    I would also like to see more "pure Python" modules. Selfishly, I would like to see a pure Python implementation of a crypto library. It certainly would be more convenient to distribute Python applications when you know your code and the libraries you use will simply work on any platform that runs Python.

    I think my nitpicks are not so much about things in Python that I'd like to improve, but a reflection of the fact that Python has a smaller community compared to the major players (Java, .Net). I guess it just means I need to do my part to support and advocate for Python if I want to see said improvements :)
  • carljm
    Emacs with Rope, etags, and flymake/pyflakes: Autocompletion? Check. Syntax errors and warnings reported inline as I type? Check. Automated refactoring? Check. Fast navigation? Check. Anything else you'd like?
  • I've never heard of Rope before. I'll have to add it to my todo list!
  • akuchling
    Regarding httplib: Python's libraries were written in the early & mid 1990s when the web was new, so they're written to allow subclassing and customization in order to allow supporting future HTTP versions, since HTTP had gone from 0.x versions to 1.0, and 1.1 was visible in the future. But now the HTTP 1.1 RFC will turn 10 next month and no one is talking seriously about an HTTP/1.2 or 2.0, so today all that flexibility is just adding complication. It's easier to use libcurl or something similar.
  • I know - and I actually use libcurl 99% of the time to be honest. There are historic reasons for all of this, but I don't think there's a good reason to maintain the status quo and not improve ;) except of course, time and resources.
  • raz
    About co-routines: I fully agree.

    Mature implementations exist. We have been using Concurrence (http://opensource.hyves.org/concurrence/) in production for a while here and couldn't be happier.
  • There are plenty of coroutine implementations out there; if someone feels strongly enough about this, they should really post it to python-ideas, and write a PEP. As it is, PEP 380 should make implementations easier.
  • Interesting post. I agree with you on the points about the Docs, Standard Library and Packaging.
  • tef
    The urllib and xml libs in the standard library ought to be taken out the back and shot.

    I've heard similar complaints about the email module too.

    Have you heard of success typings and the Dialyzer tool for Erlang? It might suggest a nicer way of integrating type annotations into a language.
  • Well, I wouldn't shoot them: I would give them a nip/tuck and put some sparkles on them. I haven't used the email module enough to comment, but another module which I think might benefit from a facelift is the logging module
  • Paul Boddie
    I think that a few different phenomena are occurring here, and xml and urllib are interesting examples. With xml, it's completely unfashionable to improve these modules: there are more people out there with an investment in making snide remarks about minidom than there are people willing to improve such modules, but despite the presence of alternatives outside the standard library, the improvements would probably need to start with what's already there.

    With urllib, on the other hand, there's a need to look around and see what else is available. I believe itools has modules of a similar nature, and there are third-party HTTP libraries, too. So in this case, the inspiration has to come from elsewhere, and perhaps urllib and urllib2 merely need to live on in some kind of compatibility library. The problem here is that there's a small industry in the Python scene in ignoring what people have already done: expect a PEP or two which ignore most of the work out there, despite there probably being a Wiki page on python.org with all the relevant projects listed.

    I'd really like to improve the standard library, but given that any previous advice I've offered has been more or less brushed aside, and given that we all have only a finite amount of time to do interesting stuff, I don't feel that it's rewarding work. And I guess that you know how demanding the work is, too. Maybe this discussion will change my attitude for a short period of time and I'll do something, but I can't help feeling that "Python's Neglect" is a product of a number of factors including but not limited to things like the way Python is "consumed" by its community and the way the permissive licensing encourages such consumption (as opposed to participation).
  • I'll respond per paragraph:

    1> I agree. I think it's easier for people to throw rocks at the XML handling than to find someone willing to step up (as Tarek did for distutils) to take something and improve it incrementally over time, and volunteer to be its shepherd. The stdlib *needs* something for XML, finding someone to do it is another thing entirely.

    2> Again; someone needs to step up. To replace something, or even be added to the stdlib, it now has to be "best of breed" (see brett's post on this here: http://sayspy.blogspot.com/2009/05/staleness-of...). As for people ignoring other things that have happened, that's par for the course, and sometimes (not all the time) even if something has happened elsewhere, it may not be the right solution. It happens. And I hate wikis, it's impossible to find anything in them, information is spread all over the place, etc.

    3> The barrier to change things is high: you not only have to have a good idea, you have to be willing to fight for that idea. And yeah, I know we're all on borrowed time, and it is demanding, so things that could change, simply don't.

    "Python's Neglect" is largely a side effect of this - the core developers are few, people have to be willing to convince many other people that their idea has not only merit, but that the person proposing the idea will be willing to help create, test, document and maintain the thing they are offering. And Python, because it is in widespread use, has to evolve slowly and relatively carefully.

    It is a product of a number of things. Low resources (people, time), high barrier (you have to be willing to fight if you believe in it) and resistance to change (we've always been at war with eurasia).

    I do not, however, agree that this is somehow a side-effect of the licensing. One of the reasons I contribute, and have been able to convince my employers to let me contribute to python-core is *because* of the permissive licensing. In my discussions with others, I have heard this anecdotal evidence repeated time, and time again.

    Python core does nothing to deny participation, except limit commit privileges to a select group of people who have gained an amount of trust. Everything is open; all discussions are a matter of public record, all checkins are public. Anyone can step up and write a PEP (even me) and anyone can submit a patch to the tracker which can find its way in.

    I wouldn't trade the openness, and permissive licensing of the code for something which "forced" (ala the GPL) people and companies to give back. While that has worked for other projects, and continues to do so, I and many other people using python get the luxury of using it, and sometimes giving back to it where we might not be able to do so with something with a forced-participation license.

    Part of the beauty of it, is that it's good for individuals, and companies. Companies with half a brain (ala Google, companies I have worked for, etc) give back what they're comfortable with, and when they can. They use python to Get Things Done - and that's what Python is about, getting things done, and not adding restrictions or forced-participation clauses when it simply isn't needed. Part of the reason I love it is this fundamental pragmatism.

    Would I like it if people/companies contributed more? Yes. Do I wish we had more people in core dev getting paid to give back? Sure. If I won the lotto tomorrow I'd probably start a small group of people whose sole job was to do nothing but improve the hell out of it. But I see no reason to force this.

    There's a lot of things at work here, and a long row to hoe to get "big" changes into python-core, but in the immortal words of Jay-Z: "I got 99 problems but licensing ain't one".
  • Paul Boddie
    Jesse, it's great to have this discussion, even though we obviously don't agree on everything!

    My remark about permissive licences didn't single them out as the only factor - if you want an example of a project with good momentum and a certain amount of community interaction with the core developers, I suppose you could also consider PostgreSQL - but the kinds of community that form around projects using different styles of licences are often quite different. I have it on good authority that when the licensing of a project becomes more permissive, you do get an influx of people who are more "consumers" of the code than collaborators around the code. Perhaps the people who feel more "comfortable" about a permissive licence are also the people who are less likely to interact with the development community around that code, who possibly don't see the value in contributing, and who don't feel a common sense of ownership of the project.

    I note that your feelings about Wiki solutions have appeared in more than one context during this debate: "And I hate wikis, it's impossible to find anything in them, information is spread all over the place, etc." Although the API documentation probably doesn't belong in a Wiki, there isn't anything better for getting people to contribute or for letting them go off and do something that the documentation maintainers didn't expect. Information about alternative solutions for accessing remote services, for example, is precisely the kind of thing Wikis work well at recording and presenting. Various alternative initiatives around the Python documentation seem to have had all the hallmarks of up-front planning and narrow, predetermined methods of contribution which probably create a queue of change suggestions with a bottleneck similar to what we already have, and leave people wanting to expand the documentation (as opposed to change small pieces of it) out in the cold.

    I'm one of the people who administers the Python Wiki, and although I'll admit that it could be better organised, the policy so far has been to tread lightly and not be too severe with the editing. If I had a greater say in the python.org "assets", I'd make a refined version of the Wiki front the whole site experience, dropping the dated and inconvenient-to-edit main site with its glacially paced changes and increasing saturation of links to the Wiki. And I'm not really a Wiki advocate - I just remain baffled at the obsession that exists in various quarters that Wiki solutions are "all very well" but there has to be a "proper site" fronting the whole affair with little concrete justification for that position.
  • I agree; it's good to discuss, and no - we don't need to agree! That's the beauty of things.

    1> Yup. I know you didn't single licensing out; it's a sensitive topic for me, as I obviously fall in the more-permissive-is-good-and-helps-me-put-bread-on-the-table group. Maybe you're right; people will be more consumers than collaborators in a more permissive model, but as I said before, I wouldn't trade that for anything. You have to pick one, and take the good with the bad. *shrug* Sure, there's been plenty of GPL-and-the-like projects which are *huge* successes, that's simply undeniable, but I don't think it's impossible to have great success with a more permissive license, see Free/Open/etc BSD for an example.

    2> Wikis are a tough thing. I firmly believe the only reason wikipedia is successful is because of stringent guidelines, and a massive staff of editors constantly patrolling and ensuring quality and factual information. Yes, wikis are good for a low-barrier-to-add, but that's *also* the drawback. A low barrier means a need for constant policing. I'm not saying I don't want people to collaborate, but how much of the information there is of good, high quality?

    3> I like your ideas around cleaning it up, and making it a better user experience. I'd also like to add that putting links into the Python documentation which reference wiki pages, or something which says "see wiki topics on xxxx", would make things a lot better too. For example, a link in the multiprocessing documentation that says "see additional information at wiki/python_version/multiprocessing/" or "wiki/multiprocessing", where the top-level multiprocessing page had information about all of the releases, and pointers to more information.

    I'm looking forward to Georg and the GSoC project around the sphinx docs - I am definitely one of those people who prefers structured, high quality documentation (ergo us discussing it), but I am definitely pro making it easier to contribute and find additional information.

    I don't know if you have the time, or the will - but maybe someone does need to step up and volunteer to be the BDFL (or a period shorter than life) for the python wiki experience, and be the one to unify/make the hard decisions. Sometimes making those decisions makes you unpopular, but having a good, singular view does make for a more unified experience.
  • I recall one of the topics of discussion at the PyCon Language Summit (and I'm recalling now, at dinner some days later) was the desire to get rid of the current python.org site architecture and replace it with something easier to administer. The barrier to contribution to python.org is extremely high, even for those of us with commit access. Even a wiki-like system with editing abilities locked down Google code-style would be a massive improvement over the status quo.

    What ever became of that discussion? I seem to recall Jacob Kaplan-Moss being interested in actually implementing it. I should ping him...
  • piramida
    Great post, thanks. Agree with your points 5 and 6 strongly. I have a side question related to multi-threading, especially the GIL: I've heard that getting rid of it may be a daunting task, since the effort has been going on for a decade now. Where can I read more about current efforts?

    The reason I'm asking is proliferation of multi (32+)-core setups in production systems, which I expect would have serious locking problems running any heavily mt-code.
  • For XML generation, you might be interested in a library I wrote:
    http://www.from.bz/2009/03/28/announcing-xmlega...
  • Paddy3118
    If I were your interviewer I would ask you to expand on the type of static typing you would put into Python.
    It has been said by some that are working now on speeding up dynamic languages, that it is early days, and that the existing adage, that static is faster than dynamic is just the status quo rather than some proven fact. You can fly to the moon with that.
    If, on static typing, the idea is to stick on a manifest type checker then I would think that silly. Maybe we haven't had the effort and smarts being put into the issue of what can be done in a dynamic language to track types throughout execution? (Or maybe someone on Pypy hasn't published yet).
    Maybe a dynamic language needs a dynamic type checker that can monitor and infer type issues throughout the life of a program. I would love an interpreter that, in checking mode, would tell me that I can no longer execute this path through this conditional, as taken X % of the times before, due to this modification to this instance.
    Or a tool where I could run up to a point, enter the debugger, and say "type check the impact of making a specific modification", and have the tool make assumptions on future typing based on type history in its calculations.

    I guess I too would like more type checking in dynamic languages, but I wouldn't want us to be fettered by what is standard for static languages.

    - Paddy.
  • I agree :)
  • Carl T.
    Maybe I've just drunk too much of the Python Kool-Aid, but I thought that if everything is done right, you don't need static typing (6) to catch errors. All that stuff about modularization, self-documenting code, testing, etc. - pylint is big on keeping functions short and classes manageable.

    I know there are a ton of corner cases where things aren't perfectly readable or a function call gets lost in the shuffle or a test just plain misses a situation you didn't anticipate. The reason why I'm sensitive to the static typing thing is that I'm working with a bunch of Java programmers who use the "static typing catches errors" argument all the time. I don't buy it, although it could be I just haven't worked on a big enough project yet to see where static typing in Python is needed.

    Thanks for the post; it made me think.
  • If you care about security, a good static type system is a powerful tool to have on your side, even if you already test well. Security properties are notoriously difficult to test for, in many cases intractably so, but many can be encoded into types, provided your type system is expressive enough. When those types are checked at compile time, the checks form a guarantee that your code does not contain any of the checked-for security holes and thus cannot be exploited at run time. In other words, you can make your type system prove that your code is secure, for any definition of "secure" that you can encode into types.
  • Your code isn't more secure. It's probably safer and that's about it.

    My issue with static typing, though, is the difficulty of evolving your code. Static languages like Java or C# have worked around that by providing interfaces, where you pretend to be something else, which is fine as long as you are telling the compiler: "I'm acting according to this contract". That doesn't actually make code more secure... more typing though.
  • > Your code isn't more secure. It's probably safer and that's about it.

    I don't think you understood my claims. I'm not saying that just using static types as usual in, say Java or C#, will secure your code as a side effect. Rather, I'm saying that you can encode security information into types and *then* the compile-time checker will prove that the related security properties hold (or do not); that is, the compiler will prove that your code is secure, for whatever definition of "secure" you have encoded.

    In short, your code *will* be more secure, and provably so. In fact, the compiler will discharge the proof obligation for you. (Yes, type systems can actually do meaningful work for you.)

    For example, you can use a type-based approach to secure your code against XSS vulnerabilities (and a host of related problems).

    > My issue though with static typing is the difficulty to evolve your code.

    What makes you think that this difficulty is a property of static typing and not of the popular languages that employ static typing -- and do so poorly?

    Cheers,
    Tom
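To make the type-based XSS idea above a little more concrete, here is a minimal sketch in Python. The names `SafeHtml`, `escape`, and `render_comment` are hypothetical, and Python itself does not enforce the annotations: this assumes an external static checker is run over the code, which is what would reject a raw `str` passed where `SafeHtml` is required.

```python
import html


class SafeHtml:
    """Marker type (hypothetical): text that has already been escaped for HTML."""

    def __init__(self, escaped_text: str) -> None:
        self.text = escaped_text


def escape(untrusted: str) -> SafeHtml:
    """The one intended way to turn raw input into SafeHtml."""
    return SafeHtml(html.escape(untrusted))


def render_comment(body: SafeHtml) -> str:
    """Accepts only SafeHtml, so a static checker can flag raw strings here."""
    return "<p>" + body.text + "</p>"


page = render_comment(escape("<script>alert(1)</script>"))
```

The guarantee is only as strong as the discipline around constructing `SafeHtml`: nothing in this sketch stops someone from wrapping an unescaped string by hand, which a language with a stricter type system could prevent.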
  • You know, I was thinking about making the security argument last night, but I skipped it - so thanks. Of course, you can make the argument that good unit tests would protect you the same way.
  • > Of course, you can make the argument that good unit tests would protect you the same way.

    How? Seriously, try to make that argument and see how far you get before it starts to crumble under your own scrutiny. It's an enlightening exercise.

    Cheers,
    Tom
  • Mentally, I tend to think about the concept of optional static typing for python from more of a "catch a class of stupid errors earlier" point of view. In python, these "stupid errors" are passing in an object without the proper interface, which doesn't get caught until runtime.

    The Java guys are right: It does catch a class of errors. Errors which are annoying, and stupid - and if you were to ask me, yes, unit tests would help exercise your code to find these stupid and annoying bugs. No matter how smart your annotations, static typing, etc - you will never catch 100% of the errors your application will have. It's also completely possible to write Java code in such a way that the compiler won't catch a class of bugs.

    Yes, Python is designed to be modularized, self documented, clear, have unit tests (everyone's code should have unit tests) - but that doesn't protect you from people around you who might not be as disciplined, or even yourself (when you're coding at 2am).

    As I said above: I think my gripe (a minor one) about wanting to catch a class of errors earlier, via optional static typing, is far outweighed by the benefits of dynamic typing. If having optional static typing means gimping the dynamic nature of the language, then I'll stick to really good linters and unit tests, and simply yell at people who check code into source control that they never tested/actually executed :)
  • I'm not going to register for this short comment.

    I don't foresee type inference anytime soon as long as python keeps (mutable) open types. With open types, type inference clearly runs into Rice's theorem, which makes it uncomputable, and many compromises would have to be made in the implementation.

    (As a rough theoretical sketch: type inference only terminates because types can't count and the types specify structural order on the computing machine rather than the language computed -- mutable types can count, and no structural order on the computing machine could possibly verify they are consistent, which leads to making assertions about the language in question, which is uncomputable.)
  • No, it won't come anytime soon. I think the only thing we'll have for the foreseeable future is function annotations, and ABCs. That's it.
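For readers who have not used them, here is a small sketch of what function annotations and ABCs give you today. The `Readable` and `StringSource` classes and the `shout` function are made up for illustration. The annotation is stored on the function but not enforced, so the explicit `isinstance` check is what actually catches the "object without the proper interface" error, and only at runtime:

```python
from abc import ABC, abstractmethod


class Readable(ABC):
    """Hypothetical interface: anything with a read() method returning text."""

    @abstractmethod
    def read(self) -> str:
        ...


class StringSource(Readable):
    def __init__(self, text):
        self.text = text

    def read(self) -> str:
        return self.text


def shout(source: Readable) -> str:
    # The annotation is documentation only; this check does the real work.
    if not isinstance(source, Readable):
        raise TypeError("expected a Readable, got " + type(source).__name__)
    return source.read().upper()


result = shout(StringSource("hello"))
```

Calling `shout("hello")` raises `TypeError` at the point of the call rather than deep inside the function, which is about as early as the error surfaces without an external checker.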
  • Coming from C++ to Python I hated that Python did not catch my typos before I actually tried to run the typos. However, since I discovered Pydev + Pylint combo and started writing decent unit tests I haven't missed the compilation step.

    But if Python had optional static typing I would probably test with that every now and then.
  • Which is why I suggest it be optional. I run just about every piece of code I write through pep8.py, reindent.py, and pylint. I have one hotkey that does it for me. This plus my unit tests means I'm typically OK.

    But in bigger teams, where not everyone is operating under the same mental rules, or skill levels, things slip through that cause pain for everyone. Even if I could have it only for holy-crap-critical code, I'd be fine.

    I still want the option to shut it off :)
  • ken farmer
    First off, obligatory context statement: I'm a big python fan that's been using it for about nine years now in production systems.

    One way to look at static typing vs unit testing is that it's very similar to database declarative constraints vs application procedural constraints. While you can theoretically code a lot of procedural code around testing & constraining data - in reality it is far less reliable and more work to implement than the very simple to write & maintain declarative constraints or types. There should be no debate here - declaring a variable to be an integer or a database column to be a foreign key of another table is simple and nearly foolproof! Reliance on run-time checks puts systems with high availability requirements at risk.

    Of course, the declarative approach fails in comparison to the procedural when it comes to flexibility and adaptability - so each approach has its merits and cases in which it's the best. But I would *love* to occasionally be able to statically type some python variables.
  • Yes; runtime errors that threaten high availability beget a lot of diaper programming - for example catching Exception when main() is called, so that nothing escapes.
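For anyone unfamiliar with the term, "diaper programming" looks roughly like the sketch below. The `main` and `run` functions are stand-ins, not code from any real project: everything is wrapped in one catch-all so that no exception can take the process down.

```python
import logging


def main():
    # Stand-in for real application logic that fails at runtime.
    raise ValueError("bad input")


def run():
    """Wrap main() in a catch-all so no exception escapes (the 'diaper')."""
    try:
        main()
        return 0
    except Exception:
        # Log the traceback, then carry on as if nothing too serious happened.
        logging.exception("unhandled error in main()")
        return 1


exit_code = run()
```

The process stays up, but the error is only discovered after it has already happened in production, which is the trade-off being discussed above.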
  • Adam
    The documentation would probably be better if it were more straightforward to fix/update. Sending in unified diffs via the not-so-easy-to-use Roundup system has probably discouraged many people who would otherwise be happy to fix up / add documentation to the std lib.
    A wiki of some kind, even with changes moderated, or the 'unified diff' automatically generated and submitted appropriately would make life substantially easier for the non-core developers out there who can provide some documentation.
  • IIUC, one of the drivers for moving from svn to mercurial was simplifying patch submission.
  • Yeah, the one drawback I see to that is that it's still source code, and that's possibly too high a bar to someone just wanting to submit a very simple text patch.
  • Yeah, that's true. For very simple stuff, a comment form would work well. Even a simplified interface to dumping comments into RoundUp might be good enough. If it was built in to each page of the docs (or linked from them), the ticket could include the source page info.
  • Bingo. I'm envisioning a link at the footer/header of each doc page that shows an "edit this page" thing, and then it pops open a web-based editor, and instead of a "publish" button, it has a link to "submit to roundup" - the unified diff is auto-attached, etc.
  • Maybe, but I'm one of those people (dinosaurs, I guess) who is firmly in the "man I hate wikis" camp. Sure, if you completely lock it down and make only core devs have commit, you can keep it sane - but then core developers have to modify docs outside of source control (in another app). The way we have things now does make the developer-writing-docs workflow easier.

    Something to think about though; I wonder if it would be possible to implement some magic javascript-editor-or-something to provide a "hack on this doc" link, at which point the source would be loaded into your browser/editor, and then you could click on a "submit this diff to bugs.python.org" button.
  • csantos
    The Numpy community created an online doc editor last year. From the project site: "Pydocweb is a tool for collaboratively editing docstrings and documentation in a Python module via the web, and merging changes made easily back to the sources." More info in:
    http://code.google.com/p/pydocweb/
    http://scipy.org/Developer_Zone/DocMarathon2008
  • akuchling
    Searching for 'Javascript diff' finds a few implementations; http://snowtide.com/jsdifflib is a partial translation of Python's difflib, for example. It might be possible to implement the 'submit diff' feature purely in the client's web browser. (The diffs wouldn't be against the original reST, but that's probably OK.)
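For reference, producing the unified diff itself is a few lines with Python's own `difflib`, which is the module the JavaScript port above translates. The file names and the documentation text here are invented for the example:

```python
import difflib

# Two versions of a (made-up) documentation snippet, as lists of lines.
original = [
    "Return the sum of a and b.\n",
    "Raises TypeError on bad input.\n",
]
edited = [
    "Return the sum of a and b.\n",
    "Raises TypeError if a or b is not a number.\n",
]

# unified_diff yields the header lines, hunk markers, and +/- lines in order.
diff_lines = difflib.unified_diff(
    original, edited, fromfile="a/doc.rst", tofile="b/doc.rst"
)
patch = "".join(diff_lines)
print(patch)
```

A server-side "submit this diff" button could presumably do the same thing against the reST source, sidestepping the caveat about client-side diffs not being against the original.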
  • I pinged Georg, and his comment was "something is coming soon". I almost danced a jig.
  • I agree that wikis can be useful for tossing up some basic documentation, but you really need to pay attention to the organization and keep it weeded and pruned to make a wiki work long-term. I wonder if we can combine the commenting features from djangobook.com with the sphinx output used to build docs.python.org to give some interactive feedback without resorting to a wiki.
  • I spoke to Georg about comments at pycon, I am going to follow up with him today, as well as mention the possibility of making patches easier to generate (only on the docs).

    I strongly dislike wikis for something like the python docs - I can't state this enough, you need an army of moderators, cleaners and organizers to keep a wiki in check.
  • That made me think of Mozilla Bespin[1] immediately. I believe it has a Python backend (and maybe some others as well).

    [1] http://labs.mozilla.com/projects/bespin/
  • ngregory
    Yes, the docs could be improved with more examples. As pretty much a python n00b who's still finding his way, I find Doug Hellmann's PyMOTW a very interesting read, and it would be great to have examples and explanations like that as part of the standard documentation. Sure, you can have that open in another tab if you want, but that's only once you know about it.
  • Having good, practical and useful (concrete) examples is a required part of programming languages.
  • docguy
    What I find funny is that Perl is supposed to be this crusty old arcane language with a sorta suboptimal markup format (POD), and yet the Perl docs are some of the best I've seen *anywhere*.

    How can it be that Python -- a much simpler language, and with a more complete doc system (previously LaTeX, now reST) -- still cannot compete with Perl when it comes to docs?
  • Let me guess: Could the reason perhaps be that Python's developed by a combination of people who want things to be as they've always been ("moving away from LaTeX for documentation? are you nuts? editing text in a web browser? will that work in lynx? using a web framework to build a web site? that's crazy talk. please go away.") and newcomers who prefer to reinvent everything because it's no fun to look for earlier work ("hey, I have this unique idea - let's write an article about each module in the standard library!") and think that they get some kind of karma by being the last one to think of something?

    Fact is that it usually takes the Python community a decade to implement any idea worth implementing. And they have to reinvent it a couple of times before it's accepted by the majority of the developers. This is probably a good thing when it comes to keeping the core language small and focused, but I'm not sure it's the best way to produce the best possible documentation, produce the best possible compilers and core libraries, or, for that matter, give the users the best possible product given the resources at hand.
  • I'd argue that it's partially age: Perl has been around forever, and additionally the complexity of Perl makes it a fundamental requirement that the docs be as vast as humanly possible. But I don't want to get in a language war; I'd rather focus on what the python community can improve.

    See, there's an argument which actually stands up fairly well - because python is so simple, and clean, there's a low barrier to simply cracking open the source and learning how things work. I know that once I exhausted the documentation, and the books I learned it from, I immediately started pulling apart the source code to learn.

    Even given that; the docs need to be improved: Trolling through source code is not something people who are learning are typically going to do until they hit a certain pain point or plateau, therefore we have to improve.
  • mark
    The answer is simple:

    Perl's syntax was designed so that people forget it after some time not writing in it.

    Thus, you must relearn from the beginning, and improve the documentation (because you must understand it again).

    That is why they have good documentation.

    PHP has good documentation too. Good documentation however is not a requirement to have a good LANGUAGE.
  • Ivan
    But having a good language with a poor documentation means you will lose a lot of potential users. I was first drawn to PHP because of the excellent docs which have gotten even better in the last couple of years.

    I'm just starting with python and it is a little annoying having to google for things because they are hard to find in the manual.
  • Rusco
    Hi Jesse,
    I am no python submitter nor guru like you, just a stupid user.
    I agree 100% with you, std. lib and xml handling are the most urgent topics to clean up. XMLSlurper and MarkupBuilder in Groovy are just so easy to use, why can't Python do this?

    Optional static typing (see Groovy again and ActionScript) also comes in handy, both for speed optimization as well as for library integration in large projects. I think it should be up to the programmer to make this decision.

    I just hope this goes forward !!!

    Cheers !
    Rusco
  • I would not call myself a guru; nor you a stupid user. As for XML, most people I talk to (and myself) lean towards lxml as a solution, but that adds some additional build requirements many people would find onerous.

    As for optional static typing: Read the posts from Guido I pointed to - I do not think this will happen, nor should it if it grossly hampers what we have within the language today.

    Note I did not make solid proposals; this is more of a "things I don't like that should maybe change" - if I had the time and resources to change these things; I'd be doing that right now. Without more people scratching the itches, and dedicating time to python core, I'm afraid cleanup projects will take an ever increasing amount of time.