
No, even "satisfactory" languages can suffer problems. For example, a couple of days ago I discovered a nice exploit in the qemu-img program, and no choice of language for parsing its output would help you:

http://www.mail-archive.com/qemu-devel@nongnu.org/msg128802....



No, that is not at all accurate.

Your link above shows an author who claims JSON output, yet the output is clearly not valid JSON (the top level is not a [] or {}, the quoting is improper, etc). It appears that instead of using JSON serialization, the author merely printed key/value pairs separated by the string ": ". The problem with this approach is that any value containing a newline can forge additional key/value lines.
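To make the failure concrete, here is a minimal sketch in Python. It is not qemu-img's code or any real consumer's parser: render_adhoc and first_value are hypothetical helpers that imitate a "key: value" printer and a typical first-match line scanner, using the field values quoted in this thread.

```python
def render_adhoc(fields):
    """Hypothetical printer: one 'key: value' pair per line, no escaping."""
    return "\n".join(f"{key}: {value}" for key, value in fields)


def first_value(text, key):
    """Hypothetical consumer: return the value of the first line for `key`."""
    prefix = key + ": "
    for line in text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


# The file name carries an embedded newline followed by a forged field.
fields = [
    ("image", "/tmp/foo\ncluster_size: bar"),
    ("file format", "qcow2"),
    ("cluster_size", 65536),
]

text = render_adhoc(fields)
parsed_image = first_value(text, "image")
parsed_cluster_size = first_value(text, "cluster_size")

print(text)
print("image seen by consumer:       ", repr(parsed_image))
print("cluster_size seen by consumer:", repr(parsed_cluster_size))
```

The consumer sees a truncated image name and takes the forged cluster_size line as genuine, because nothing in the format distinguishes a newline inside a value from a newline between fields.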

This is why using a proper serialization format is important.

If the author had done this correctly and used a proper JSON library to produce this output, the following, completely safe result would have occurred:

{ "cluster_size":65536, "disk size":"136K", "file format":"qcow2", "image":"/tmp/foo\ncluster_size: bar", "virtual size":"10M (10485760 bytes)" }
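As a rough illustration of that claim, the sketch below uses Python's standard json module (my choice for the example, not something qemu-img uses) to serialize the same fields. The dictionary contents are copied from the output above.

```python
import json

# Field values taken from the example above; the image name contains an
# embedded newline followed by a forged "cluster_size" line.
info = {
    "cluster_size": 65536,
    "disk size": "136K",
    "file format": "qcow2",
    "image": "/tmp/foo\ncluster_size: bar",
    "virtual size": "10M (10485760 bytes)",
}

# json.dumps escapes the newline as the two characters backslash and n,
# so the serialized text contains no raw line break.
encoded = json.dumps(info, sort_keys=True)

# Any conforming JSON parser recovers the original values unchanged.
decoded = json.loads(encoded)

print(encoded)
print(decoded["cluster_size"])
```

The newline survives the round trip as part of the image value, and cluster_size is still the integer the producer wrote.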

The author probably would have been best served by YAML, which is more easily readable and which, like JSON, provides mechanisms to properly represent arbitrary data.

In any event, the discussion is severely confused. Ad-hoc buggy formats cannot be compared with well-formed JSON or YAML. This has nothing at all to do with the language.


You should probably read the link more closely. I'm advocating using JSON so that programs are able to safely parse the output of 'qemu-img'. At the moment there are many programs that parse the (current text) output, and they almost all have security holes as a result.


In any event it does not support your assertion that other languages suffer similar problems.


Yes it does - qemu-img is written in C. The two programs we found exploitable were written in Python and C. They are written in "satisfactory" languages. Bash is not involved. Yet both suffer exploits because of \n (and other) characters in filenames.


The issue you refer to is in a poorly formed, ad-hoc serialization format. It has nothing to do with representation of variables at runtime. It has nothing to do with the language.

It is a programming error, not an inherent flaw in the language.


That's incorrect. As was already pointed out, this issue has nothing to do with reading data from the filesystem or manipulating variables internal to the program, and everything to do with poor choices made when using printf.

In other words, those files aren't causing the QEMU program internals to re-interpolate one variable as two values. They're merely messing up a poorly written data exchange format.

Other languages, such as those I listed above, simply do not have the same issue. The C program did not misinterpret a variable as two separate values because it contained spaces. That is the nature of the danger with shell: any unquoted reference to a variable in Bourne shell involves string interpolation and tokenization. This simply does not happen in C.
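A rough model of that difference, sketched in Python rather than shell or C. The function unquoted_expansion is a hypothetical stand-in that approximates Bourne field splitting with str.split() (whitespace only, no globbing, default IFS assumed), and argv_list stands in for how a C or Python program passes a value as a single argument.

```python
def unquoted_expansion(command, value):
    """Approximate `command $value` in Bourne shell: the expanded value is
    re-tokenized on whitespace (assumes default IFS, ignores globbing)."""
    return [command] + value.split()


def argv_list(command, value):
    """Approximate an exec-style call from C or Python: the value is passed
    through as exactly one argument, whatever characters it contains."""
    return [command, value]


filename = "my file.txt"

shell_args = unquoted_expansion("ls", filename)
direct_args = argv_list("ls", filename)

print(shell_args)
print(direct_args)
```

In the shell model one variable becomes two arguments; in the argv model it stays one. That re-tokenization step is what the comment above attributes to Bourne shell and not to C.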



