No, even "satisfactory" languages can suffer problems. For example a couple of d...

nnnnnnnn · on Sept 6, 2012

No, that is not at all accurate.

Your link above shows an author who claims JSON output, yet the output is clearly invalid JSON (the top level is not a [] or {}, the quoting is improper, etc.). It appears that instead of using JSON serialization, the author merely printed key/value pairs separated by the string ": ". The problems with this approach are obvious.
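To make the problem concrete, here is a minimal sketch in Python. It is not qemu-img's code: `render_adhoc` and `parse_adhoc` are hypothetical helpers, and the "first occurrence wins" parser is an assumption about how a naive consumer might behave.

```python
def render_adhoc(fields):
    # Hypothetical stand-in for the printing approach: one "key: value" per line,
    # with no escaping of the value.
    return "\n".join(f"{key}: {value}" for key, value in fields.items())


def parse_adhoc(text):
    # Naive consumer: split each line on the first ": " and keep the first
    # occurrence of each key.
    parsed = {}
    for line in text.splitlines():
        key, _, value = line.partition(": ")
        parsed.setdefault(key, value)
    return parsed


fields = {
    "image": "/tmp/foo\ncluster_size: bar",  # filename containing a newline
    "file format": "qcow2",
    "cluster_size": 65536,
}

adhoc_result = parse_adhoc(render_adhoc(fields))
print(adhoc_result["image"])         # the filename is truncated at the newline
print(adhoc_result["cluster_size"])  # the value injected by the filename wins
```

The filename's embedded newline is indistinguishable from a record separator, so the consumer sees a forged cluster_size line before the real one.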

This is why using a proper serialization format is important.

If the author had done this correctly and used a proper JSON library to produce this output, the following, completely safe result would have occurred:

{ "cluster_size":65536, "disk size":"136K", "file format":"qcow2", "image":"/tmp/foo\ncluster_size: bar", "virtual size":"10M (10485760 bytes)" }
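For illustration, a short sketch of producing that kind of output with Python's standard `json` module. The field values are copied from the example; the variable names are mine.

```python
import json

fields = {
    "cluster_size": 65536,
    "disk size": "136K",
    "file format": "qcow2",
    "image": "/tmp/foo\ncluster_size: bar",  # filename containing a newline
    "virtual size": "10M (10485760 bytes)",
}

# The serializer escapes the newline inside the string, so the encoded
# document contains no raw newline that a consumer could mistake for a
# record boundary.
encoded = json.dumps(fields)

# Any conforming JSON parser recovers the original values exactly.
decoded = json.loads(encoded)
print(encoded)
```

The round trip returns the filename intact, newline included, and the real cluster_size is untouched.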

The author probably would have been best served by YAML, which is more easily readable -- and which, like JSON, provides mechanisms to properly represent arbitrary data.

In any event, the discussion is severely confused. Ad-hoc buggy formats cannot be compared with well-formed JSON or YAML. This has nothing at all to do with the language.

rwmj · on Sept 6, 2012

You should probably read the link more closely. I'm advocating using JSON so that programs are able to safely parse the output of 'qemu-img'. At the moment there are many programs that parse the (current text) output, and they almost all have security holes as a result.

nnnnnnnn · on Sept 6, 2012

In any event it does not support your assertion that other languages suffer similar problems.

rwmj · on Sept 6, 2012

Yes it does - qemu-img is written in C. The two programs we found exploitable were written in Python and C. They are written in "satisfactory" languages. Bash is not involved. Yet both suffer exploits because of \n (and other) characters in filenames.

nnnnnnnn · on Sept 6, 2012

The issue you refer to is in a poorly formed, ad-hoc serialization format. It has nothing to do with representation of variables at runtime. It has nothing to do with the language.

It is a programming error, not an inherent flaw in the language.

fffggg · on Sept 7, 2012

That's incorrect. As was already pointed out, this issue has nothing to do with reading data from the filesystem or manipulating variables internal to the program, and everything to do with poor choices made when using printf.

In other words, those files aren't causing the QEMU program internals to re-interpolate one variable as two values. They're merely messing up a poorly written data exchange format.

Other languages, such as the ones I listed above, simply do not have the same issue. The C program did not misinterpret a variable as two separate values because it contained spaces. That is the nature of the danger with the shell -- any reference to a variable in Bourne involves string interpolation and tokenization. This simply does not happen in C.
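A rough sketch of the distinction, in Python rather than C. The filename is hypothetical, and `str.split()` is only an approximation of what an unquoted variable reference undergoes in a Bourne shell with the default IFS (it ignores pathname expansion, for instance).

```python
import subprocess
import sys

filename = "/tmp/my image\nfile.qcow2"  # hypothetical name with a space and a newline

# The value is passed as one element of an argument list. No shell is involved,
# so nothing re-tokenizes it on the way to the child process.
child = "import sys; print(len(sys.argv) - 1)"
result = subprocess.run(
    [sys.executable, "-c", child, filename],
    capture_output=True,
    text=True,
    check=True,
)
argc_direct = int(result.stdout)

# Approximation of unquoted $filename in a Bourne shell: the value is split
# on whitespace into separate words.
shell_like_words = filename.split()

print(argc_direct)            # arguments the child received
print(len(shell_like_words))  # words after shell-style splitting
```

The child receives the filename as a single argument, whereas shell-style splitting turns the same value into several words.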