Bingo. Both returning NULL from alloca(0) and assuming that alloca() never returns NULL are correct behavior by the compiler. It's simply taking advantage of the nasal demons.
It seems that LLVM and Clang have generally been much more aggressive about taking advantage of undefined behavior, which has ended up breaking code that worked under gcc. For example, with some versions of Clang, if you try to write something like *(int *)0 = 0, it'll completely skip that line when generating code! Completely valid, since dereferencing NULL is undefined behavior, but not what one might expect after using a compiler that's more obedient.
But where exactly is it specified that alloca(0) is undefined? None of the man pages I could find say that 'size' may not be 0.
EDIT: no idea why people vote me down for this, it is a legit question. Downvotes are for comments that don't contribute to discussion, not for comments that you disagree with.
Creating a file of 0 bytes is well-defined so why would it be such a strange idea to think that alloca(0) would also be well-defined, especially because none of the man pages mention that it's not allowed?
My man page says, "The alloca() macro allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed on return. alloca() returns a pointer to the beginning of the allocated space."
What does it mean to allocate zero bytes on the stack frame? What does it mean to return a pointer to the beginning of zero bytes of allocated space?
You can't safely do anything with such a pointer within the confines of defined behavior in C, therefore there are no requirements placed on its value.
On the contrary, you can do something safely with such a pointer - compare it with another pointer. If the value of 'alloca(0)' is _unspecified_, then this comparison is legal, and its result, while not specified, must be consistent. In other words, code like this should not fail the assertion:
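The snippet this comment pointed to did not survive in the thread. What follows is a minimal reconstruction, not the original code. It assumes a platform that provides alloca() through <alloca.h> (glibc and macOS do). Only the name is_x_null comes from the comment itself; the function name alloca_zero_is_consistent is made up for illustration.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 after checking that two NULL tests
 * on the same alloca(0) result agree with each other. */
int alloca_zero_is_consistent(void)
{
    void *x = alloca(0);          /* value unspecified: NULL or non-NULL */
    int is_x_null = (x == NULL);  /* first look at the pointer */

    /* A second look at the same variable must give the same answer. */
    assert(is_x_null == (x == NULL));
    return 1;
}
```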
Note that is_x_null can be either 0 or 1, and this value may even vary between runs. But it's not acceptable for the assert to fail here, and it sounds like it would with this bug.
If you argue that alloca(0) is 'undefined behavior', of course, all this goes out the window. Since alloca is not standardized, though, one can't really argue this - all of alloca's behavior is implementation-defined, and so if llvm-gcc really wants alloca(0) to be UB, it should document this fact as a porting concern.
Incidentally, all of this applies to malloc(0) as well. C99 defines malloc(0)'s behavior as follows:
> If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
This is actually even stricter than 'unspecified', in that it requires the implementation to document which choice it takes.
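A small sketch of what that wording permits, assuming a hosted C99 implementation. The helper name malloc_zero_returns_null is invented here. Both outcomes are allowed, so portable code should handle either one and must never dereference the result.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc(0) gave NULL, 0 otherwise.
 * Either way the result is handed back to free(), which must accept it. */
int malloc_zero_returns_null(void)
{
    void *p = malloc(0);
    int was_null = (p == NULL);

    free(p);   /* valid for both outcomes; p is never dereferenced */
    return was_null;
}
```

Which of the two values comes back is the part the implementation is required to document.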
Good point that you can compare the pointer with another pointer. However, your bit about consistency only applies to results from the same call. Nothing says that alloca(0) must return the same pointer on two different calls, and indeed one would naively expect the opposite. While one could certainly claim a compiler bug if the same value compared both equal to and not equal to NULL, that's not the case in the example given; instead, alloca(0) behaves as if it returns NULL in some cases, and non-NULL in other cases, which is completely allowed.
It's true that alloca is all implementation-defined, but do you know of an implementation where the value of alloca(0) is defined? On OS X, it is not defined.
Comparing alloca(0) to malloc(0) is off-base. The return value from malloc(0) has an important requirement that alloca(0) lacks: you must be able to pass the result to free(). Therefore, the compiler can't have it be an arbitrary value, whereas replacing all calls to alloca(0) with (void *)arc4random() would be (aside from the side effect of calling arc4random()) a valid transformation.
> instead, alloca(0) behaves as if it returns NULL in some cases, and non-NULL in other cases, which is completely allowed.
Not really. I can't reproduce the optimizer bug here (on the contrary, the optimizer in the version of llvm-gcc I have installed seems to assume it is null[1]), but if alloca() is returning null and the optimizer assumes it's non-null, the value will be treated as non-null in the function but null if, say, the value is passed into a non-inline function which compares it to null.
[1] i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)
The 'consistency' I speak of is the idea that, if I assign alloca(0) to a variable, it's either NULL or non-NULL - I can't look at that same variable two different ways and get different answers. From the sounds of the article, this requirement might be violated by this bug.
I don't think so. Only one comparison is being done in the example code. I'd bet that if you e.g. printed out the value of the pointer before the if statement, the optimization would no longer kick in as it does.
"On the contrary, you can do something safely with such a pointer - compare it with another pointer. If the value of 'alloca(0)' is _unspecified_, then this comparison is legal, and its result, while not specified, must be consistent."
While you would expect a comparison to another pointer to be legal, I don't believe it has to be consistent. That's the whole point of undefined behavior. In one sense, it's like the question of whether NaN == NaN. In many implementations, that expression is not necessarily true--even though based on your pointer example logic it should be.
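For reference, here is a sketch of the NaN comparison mentioned above. It assumes IEEE 754 floating point and no -ffast-math style flags; under those assumptions the comparison is consistently false rather than unpredictable. The name nan_equals_itself is invented for this example.

```c
#include <math.h>

/* Hypothetical helper: returns the result of comparing a NaN with itself. */
int nan_equals_itself(void)
{
    volatile double x = NAN;   /* volatile only keeps the comparison at run time */
    return x == x;             /* 0 under IEEE 754 semantics */
}
```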
Referencing the OP: the real problem here is not that the NULL check got optimized out and so seemed inconsistent. That's within the tolerance of C's undefined behavior (which is designed for optimizations like this and is what makes it so hard to write truly sane C code). The real problem is a bug in the code as written. One should never check against an undefined result. Instead there should have been an initial boolean guard against the case of alloca'ing 0 bytes. I __could__ see an argument for expecting that to be always null (in fact, that's the argument for higher level languages!). But expecting it to be a specific value in relation to the stack? That seems arbitrary at best.
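The guard suggested above could look roughly like this. It is a sketch under assumptions: <alloca.h> is available, and the function sum_via_scratch and its byte-summing task are made up purely to give the buffer something to do.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: sum the bytes of src via a stack scratch buffer. */
unsigned long sum_via_scratch(const unsigned char *src, size_t n)
{
    if (n == 0)
        return 0;                     /* guard: never reach alloca(0) */

    unsigned char *buf = alloca(n);   /* n > 0 here */
    memcpy(buf, src, n);

    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}
```

With the early return, the code never has to care what alloca(0) evaluates to on a given compiler.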
> While you would expect a comparison to another pointer to be legal, I don't believe it has to be consistent. That's the whole point of undefined behavior. In one sense, it's like the question of whether NaN == NaN. In many implementations, that expression is not necessarily true--even though based on your pointer example logic it should be.
Right, this is a question of whether alloca(0) is undefined behavior, an unspecified return value, or implementation-defined behavior. And I argue that alloca's behavior is implementation-defined in the first place, so if llvm-gcc wants alloca(0) to be UB it needs to define that :)
"the case of alloca'ing 0 bytes. I __could__ see an argument for expecting that to be always null"
??? I would expect that it always succeeds. Rationale: if, at some time and place, alloca(n) succeeds, I expect that any alloca asking for less space would succeed, too.
Not sure I agree with that, allocating 0 bytes seems pretty well-defined to me, just like allocating a file of 0 bytes is well-defined.
That said, http://pubs.opengroup.org/onlinepubs/009695399/functions/mal... explicitly mentions that malloc(0) is implementation-defined, not undefined. It would seem strange to me if alloca(0) is supposed to be undefined instead of implementation-defined.
Files have metadata, and exist in an environment where the OS ensures that appending to a file doesn't overwrite another file. Neither of those apply to pointers in C.
Not sure why you mention overwriting. If you allocate a memory block of 0 bytes then it would seem logical to me that you can't write anything to it. NULL would also satisfy that definition, but alloca(0) is supposed to return a region of 0 bytes at the end of the stack... meaning a pointer to the end of the stack.
Everything in C operates according to the "as if" rule. In other words, the compiled program merely needs to behave as if the source had been executed according to the spec. How execution actually happens is left entirely up to the compiler.
C also doesn't define "the stack" in any way. And of course alloca() isn't part of C at all, but on the systems where it exists, it's not very specific about just where it allocates stuff, just that it's on that nebulous "the stack".
Given the above, I believe you cannot write a conforming program that can detect the difference between alloca(0) returning a region of 0 bytes at the end of the stack and alloca(0) returning something else. Since no conforming program can detect it, the "as if" rule allows the return value to be considered to be anything at all.
The program in question is not a conforming program in the first place. It's the Ruby interpreter and it does all kinds of low-level stuff. alloca(0) is called from the garbage collector in order to detect the end of the stack so that the garbage collector can scan the stack for pointers. The code assumes that it's running on a system where there is a stack at all, which is pretty much all systems nowadays.
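For illustration only, here is one other way low-level code sometimes gets an approximate stack address without going through alloca(0). This is not what the Ruby interpreter does. It relies on __builtin_frame_address and the noinline attribute, which are GCC/Clang extensions rather than standard C, and the helper name approx_stack_end is invented.

```c
#include <stddef.h>

/* Hypothetical helper: approximate the current end of the stack.
 * __builtin_frame_address is a GCC/Clang extension, not standard C. */
__attribute__((noinline))
void *approx_stack_end(void)
{
    /* Frame address of this function; used only as a number,
     * never dereferenced after the function returns. */
    return __builtin_frame_address(0);
}
```

Like the alloca(0) trick, this depends on compiler behavior that no language standard mandates.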
Of course it's not a conforming program. That's rather the point: because the program is non-conforming, the compiler is allowed to apply optimizations that make it behave differently from what the programmer intended. That this code works on one compiler and fails on another doesn't make it a compiler bug, though. It merely means that this code relies on the compiler behaving in a certain way which isn't actually mandated.
It's not remotely a stupid question: malloc gets passed a zero-length argument when it's used to allocate variable-length data. That malloc(0) works and is handled by free() without crashing the program is a simplifying assumption, as is the assumption that free(NULL) won't corrupt the heap.
Any use of alloca() on the other hand seems risky. Similar arguments could be made about the semantics of jmp_bufs, which also get used to get a handle on the stack.
"the assumption that free(NULL) won't corrupt the heap."
That's not an assumption, that's how the free() function is defined to work by the language standard. It never ceases to astound me how many otherwise good C programmers think free(NULL) is an error.
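A sketch of why this matters in practice: cleanup code can call free() on optional fields without checking them first. The struct record type and record_clear function are hypothetical names used only for this example.

```c
#include <stdlib.h>

/* Hypothetical record type with optional, heap-allocated fields. */
struct record {
    char *name;   /* may be NULL */
    char *note;   /* may be NULL */
};

/* Release whatever was allocated; no NULL checks needed before free(). */
void record_clear(struct record *r)
{
    free(r->name);   /* free(NULL) is a no-op by definition */
    free(r->note);
    r->name = NULL;  /* makes a second record_clear() call harmless */
    r->note = NULL;
}
```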
> as is the assumption that free(NULL) won't corrupt the heap.
When you refer to simplifying assumptions, are you talking about assumptions made by the programmer, or by the compiler and libc? For example, the POSIX manual page[0] for free( void* ptr ) says, "If ptr is a null pointer, no action shall occur." The malloc manpage says, "If size is 0, either a null pointer or a unique pointer that can be successfully passed to free() shall be returned." That sounds more like a definition than an assumption to me. What am I missing?
[0] Obtained by installing manpages-posix-dev on Ubuntu and running man 3posix free.
malloc(0) has to return a pointer you can pass to free(), so it can't be arbitrary junk. The only thing you can do with the pointer returned from alloca() is dereference it. Technically, as pointed out elsewhere, you can compare it with other pointers, but the only real requirement there is that it compare not equal to any other non-NULL pointer.
Well, alloca is not specified in any formal standard, so you won't find an official ruling on whether it's undefined, unspecified or well-defined. By itself, that state of affairs should strongly discourage its use in portable code.
I interpret that statement as it being similar to __asm__(), which is also machine and compiler dependent and discouraged in portable code. Still, sometimes you need it when writing low-level code. __asm__ doesn't blow up the way alloca in the given example does.
Incidentally, I ran into a nasty x64 __asm__ codegen bug with exactly the same compiler version that this blog post covers. It was in lockless multithreaded code, so you can imagine how much fun that was to debug. Rather than work around it, I ended up replacing all our GCC inline assembly with modern intrinsics like __sync_fetch_and_add.
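As a rough idea of what such a replacement looks like, here is a sketch of an atomic counter bump using that builtin. It assumes GCC or Clang; the wrapper name counter_fetch_add is invented, and the original inline assembly is not shown because it was not posted.

```c
/* Hypothetical counter bump using the GCC/Clang __sync builtin
 * instead of hand-written inline assembly. */
long counter_fetch_add(long *counter, long delta)
{
    /* Atomically adds delta and returns the value held before the add. */
    return __sync_fetch_and_add(counter, delta);
}
```

The compiler picks the instruction sequence, so there are no operand constraints or clobber lists to get wrong.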
That's amazing! First time I've come across the term. For the lazy:
"When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose. Someone else followed up with a reference to “nasal demons”, which quickly became established."
There's a great post on undefined behavior in C, what it means for a compiler, and how LLVM and Clang deal with it here:
http://blog.llvm.org/2011/05/what-every-c-programmer-should-...