Safe Strings for Security?

August 17th, 2007 § 7 comments

I just finished reading the article “A bright future: security and modern type systems”, which reminded me of the recent Django discussion around auto-escaping, specifically “Automatic Escaping is Not a ‘Newbie Feature’”.

Now, I’ve railed for and discussed “duck” typing and why I love it (much to the dismay of my wife; I’ll talk to anyone about this, infant included). I think the ability to define your own String/File/etc.-like “SAFE” objects is easier and more attractive in Python. Do you really want a compiler to enforce social and domain-specific rules for you?

The dynamic nature of typing in Python (remember: all objects are strongly typed at runtime) makes it quick and easy to define, on the fly, new types and objects that conform to a subjective set of rules (in this case, the definition of domain-specific “malicious” strings).

With no compiler in the way to check for even the most banal of errors, one could argue that although you may have defined a new Type-like object (a Bear in a Duck suit), nothing in the compiler or interpreter enforces that the type is actually used at runtime or in the rest of the application. This means you must rely on social convention or some other built-in rules system.

Perhaps you’re smart about it, and when you’re building your application (web or otherwise) you state that Object A accepts a “string-like” object of the SafeString type. When you defined SafeString, you added some hook or method for ensuring that yes, in fact, someone was passing you a string-like SafeString object.
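To make that concrete, here is a minimal sketch of what such a hook might look like. Everything in it is an assumption for illustration: the `from_untrusted` constructor, the `require_safe` hook and the `Greeting` class (standing in for “Object A”) are hypothetical names, and HTML escaping is just one possible definition of “safe”.

```python
from html import escape


class SafeString(str):
    """A str subclass marking its content as already escaped (hypothetical)."""

    @classmethod
    def from_untrusted(cls, raw):
        # Escape HTML metacharacters once, at the boundary where input arrives.
        return cls(escape(raw, quote=True))


def require_safe(value):
    """Hook a consumer can call to insist on a SafeString at runtime."""
    if not isinstance(value, SafeString):
        raise TypeError(f"expected SafeString, got {type(value).__name__}")
    return value


class Greeting:
    """Stands in for 'Object A', which only accepts SafeString input."""

    def __init__(self, name):
        self.name = require_safe(name)

    def render(self):
        return "<p>Hello, " + self.name + "</p>"
```

Passing a plain `str` to `Greeting` raises `TypeError` when the object is constructed, which is exactly the runtime-only enforcement described above. A more duck-typed variant could check for a marker attribute instead of using `isinstance`.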

I think that the con­cept of “$N-Like” is the key to under­stand­ing what dynamic typ­ing within a lan­guage is.

But I’ve wandered off the point. Malicious strings are just the tip of the iceberg when it comes to application design. For the last few years they have mainly been the concern of web application developers, who need to worry about cross-site scripting and similar attacks wherever a large amount of information flows in both directions.

It makes sense to add “Safe” types to frameworks (Django, RoR, Grails, etc.) for web application developers. Heck, it might even make sense to create an “EscapedChar” object in the language (Python) itself. But in a language that relies on N-like semantics, it makes much more sense to push that up into the implementation of the framework or web app itself.

The definition of “malicious” is murky at times: some input that a blanket rule would escape could be useful and intended in some applications, so it’s important that the implementation is flexible enough to take that into account (a String-like-Safe-like-MyString object, anyone?).

Now I’m ram­bling, dan­git. I’ve got to go back to my String java = new JavaPro­gram­mer(); code.

This (for the Django-interested) is an interesting read as well.

  • http://blog.moertel.com/ Tom Moer­tel

    In response to your ques­tion, “Do you really want a com­piler to enforce social and domain-specific rules for you?” the answer, as far as secu­rity assur­ances are con­cerned, is yes.

    While you can certainly implement a run-time system for detecting potentially unsafe string interactions, what sensible options do you have when you detect one? Remember, what you have just detected is not that an attacker tried to stuff some nasty code into a form; rather, you have detected a programming error: the programmer used a string in one place as if it were, say, untrusted user input and in another as if it were, say, part of a trusted, programmer-supplied template. It’s just that until some user (maybe a bad guy but very possibly a good guy) managed to walk a particular path through your application that exercised both uses of the string, the programming error went undetected.

    Now, how­ever, your run-time sys­tem has detected it. The prob­lem is, your sys­tem detected it in a live-running appli­ca­tion, at the site of a poten­tial unsafe string inter­ac­tion, not at the error’s ori­gin in your code, which could be a long ways dis­tant. That is, this par­tic­u­lar string inter­ac­tion may very well not be the error. The error may have been caused in the dis­tant past by code that has already exe­cuted. What’s worse, all the code in between the error’s ori­gin and the present site of detec­tion could have been caus­ing side effects: writ­ing to the data­base, updat­ing ses­sion vari­ables, launch­ing the mis­siles, etc. How do you safely undo all those effects?

    That’s the real problem.

    In a compile-time sys­tem, when a poten­tially unsafe string inter­ac­tion is detected, you have all the time in the world to hunt the prob­lem down to its true source, wher­ever that may be, and fix it.

    In sum, a compile-time sys­tem offers you the assur­ance that none of the checked-for prob­lems can occur at run-time. A run-time sys­tem can only guar­an­tee that, if a checked-for prob­lem occurs, it will be caught. By then, how­ever, it may already be too late to avoid the consequences.

    Cheers. –Tom
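The scenario Tom describes can be sketched in a few lines of Python. This is only an illustration under assumed names: `Escaped`, `UnsafeStringError`, `save_comment`, `render` and the in-memory `audit_log` (standing in for a database) are all hypothetical.

```python
class UnsafeStringError(Exception):
    """Raised when a plain str reaches a sink that wants escaped text."""


class Escaped(str):
    """Marker type for text that has already been escaped (hypothetical)."""


audit_log = []  # stands in for a database or session store


def save_comment(author, body):
    # The side effect happens first; nothing is checked here.
    audit_log.append((author, body))
    return body  # the bug: body is never escaped or wrapped in Escaped


def render(fragment):
    # The runtime check lives at the sink, far from where the bug was written.
    if not isinstance(fragment, Escaped):
        raise UnsafeStringError("unescaped text reached the template")
    return "<div>" + fragment + "</div>"


def handle_request(author, body):
    stored = save_comment(author, body)
    return render(stored)
```

Calling `handle_request` raises `UnsafeStringError` inside `render`, but by then `audit_log` already holds the unescaped comment. The check caught the problem at the sink, not at its origin in `save_comment`, and the write still has to be undone somehow.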

  • http://www.jessenoller.com jesse

    Good points, all of them.

    If you’re going to go as far as to shackle a developer into the SafeString type you’ve defined, you also have to block or deny any sort of object polymorphism. I can’t see how you could enforce the compile-time rules without blocking that as well.

    It’s food for thought, although in most applications it’s user input, or input generated by an untrusted third party, that most programmers concern themselves with. For example, it would be interesting to enforce these style rules within a DB ORM (both to and from) and in things such as networking and other communications layers. It’s this type of hardening that only really exposes itself at runtime in most instances.

    I’m wondering about the scope of safety you’re providing by baking a new safe type into the language: you’re only protecting a programmer from leveraging an unsafe string during code-compilation time. I think (opinion mind you) that your time would be better spent doing exactly what you pointed out as a flaw in my suggestion: designing ways to cope with these issues/attacks at runtime.

    If you treat all data (strings) as untrusted to begin with, user- and programmer-generated alike, and build into your system a clean, abstracted way of dealing with and capturing dangerous objects, your application would be infinitely more secure without the need to enforce it at compile time.

  • Ants Aasma

    I feel you didn’t get the point of the type-based approach. I think that cross-site scripting bugs are not a case of mixing up unsafe and safe strings; it’s a case of type mismatch, like adding together feet and meters. So if you have a user-supplied string and you output it to an HTML template, you need to encode it as an HTML string literal (i.e. escape characters that have special meaning in HTML). You could also have a user-supplied HTML fragment (like this commenting system allows), in which case you want a function that interprets the string as HTML, removes all the dangerous stuff and returns an HTML fragment.

    When you have detailed enough type information, you have the ability not only to detect type mismatches but to do type coercion. For the feet and meters example, if you add two quantity objects you can do automatic unit conversion when they’re of the same dimension and raise an exception when you’re adding a quantity of time to a quantity of distance. For the HTML example, when concatenating HTML and regular strings it is possible to automatically escape the strings.

    I threw together a proof of con­cept imple­men­ta­tion of that a few days ago. It’s cur­rently avail­able at http://dpaste.com/17329/
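The coercion-on-concatenation idea from this comment can be sketched as follows. This is not the proof of concept linked above, just a minimal illustration under assumptions: the `Html` class name is hypothetical, and only the `+` operator is covered.

```python
from html import escape


class Html(str):
    """Text that is already valid HTML markup (hypothetical type)."""

    def __add__(self, other):
        # Plain strings are coerced by escaping; Html is joined unchanged.
        if not isinstance(other, Html):
            other = escape(str(other))
        return Html(str.__add__(self, other))

    def __radd__(self, other):
        # Handles plain_str + Html, where the left operand is untrusted text.
        if not isinstance(other, Html):
            other = escape(str(other))
        return Html(str.__add__(other, self))


page = Html("<p>") + "<script>alert(1)</script>" + Html("</p>")
```

Here `page` is still an `Html` instance and the script tag arrives escaped. Other ways of combining strings (`str.join`, `%` formatting, f-strings) bypass `__add__` entirely, so a real implementation would need to cover those as well.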

  • http://www.jessenoller.com jesse

    Thanks. I did get that approach; I was questioning the usefulness of doing this in the compiler. Thank you for the example, though.

  • http://blog.moertel.com/ Tom Moer­tel

    jesse, thanks for your response.

    You are miss­ing some­thing fun­da­men­tal to the nature of (mod­ern) sta­tic type sys­tems. You wrote that:

    [With a compile-time safety system] you’re only protecting a programmer from leveraging an unsafe string during code-compilation time. I think (opinion mind you) that your time would be better spent […] designing ways to cope with these issues/attacks at runtime.

    What you are missing is that the compile-time system proves that these issues cannot occur at run time. Once you have that guarantee, you have no need to attempt to cope with the issue anymore, at all, let alone at run time. The issue is gone.

    And as far as being “shackled” into the system goes, the only thing you are shackled into is not doing things that would be potentially unsafe if allowed to occur at run time. In my Haskell-based implementation, for example, it’s no more difficult to write safe-string code than it is to write unsafe everything-is-just-a-string code, but it’s a whole lot safer.

    Cheers,
    Tom

  • http://www.jessenoller.com jesse

    Thank you, more to think about on my part. I still see you having to address unsafe strings at runtime for non-static strings, but at the end of the day, baking more safety into the compiler/interpreter/language is never a bad thing.

  • http://blog.moertel.com/ Tom Moer­tel

    You pose a good ques­tion: Does a sta­tic sys­tem need run-time help to address non-static strings? Hap­pily, the answer is no.

    That’s because even though the val­ues of non-static strings may be unknown until run time, the types of the strings will be known at com­pile time, and that’s enough for the sys­tem to do its work. If the sys­tem knows, for exam­ple, that some func­tion takes an HTML tem­plate and a string rep­re­sent­ing plain-old text, the sys­tem can ver­ify at com­pile time that the way the func­tion com­bines the text with the tem­plate is safe, even though the sys­tem may have no idea what val­ues the tem­plate or the text will take until run time.

    Cheers! –Tom
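Python has no compile step that enforces this, but the point about types being known before values can be approximated with type hints and an external static checker. The sketch below assumes a checker such as mypy is run over the code. The `Html` and `Text` types and the `escape_text` and `fill` functions are hypothetical names, and this is not Tom’s Haskell implementation.

```python
from html import escape
from typing import NewType

# Distinct static types over the same runtime str (names are hypothetical).
Html = NewType("Html", str)
Text = NewType("Text", str)


def escape_text(text: Text) -> Html:
    """The only sanctioned way to turn untrusted Text into Html."""
    return Html(escape(text))


def fill(template: Html, text: Text) -> Html:
    """Combine a trusted template with untrusted text."""
    return Html(template.replace("{}", escape_text(text), 1))


user_input = Text("<script>alert(1)</script>")  # value unknown until run time
page = fill(Html("<p>{}</p>"), user_input)

# A static checker such as mypy is expected to reject the next line, because
# a Text is passed where an Html template is required:
# fill(user_input, user_input)
```

The checker never needs to know what `user_input` contains; it only needs to know that it is `Text` and not `Html`. At runtime `NewType` adds no checks at all, so the guarantee is only as good as the discipline of running the checker.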


You are currently reading Safe Strings for Security? at jessenoller.com.
