> why not compile-time error for reading memory that hasn't been written https:/...

TheCoelacanth · 2025-05-13T14:26:19 1747146379

A compiler doesn't have to accept all possible programs. If it can't prove that a variable is initialized before being read, then it can simply require that you explicitly initialize it.

dooglius · 2025-05-13T15:25:56 1747149956

Sure, but then not accepting many programs would be the answer to parent's question "why not"

jerf · 2025-05-13T17:22:40 1747156960

Not accepting many C programs, maybe. It's pretty easy to create a language where declaration is initialization of some sort, as evidenced by the large number of languages in common use where, one way or another, that's already the case.

This isn't some whacko far out idea. Most languages already today don't have any way (modulo "unsafe", or some super-carefully declared and defined method that is not the normal operation of the language) of reading uninitialized memory. It's only the residual C-likes bringing up the rear where this is even a question.

(I wouldn't count Odin's "explicitly label this as not getting initialized"; I'm talking about defaults being sharp and pointy. If a programmer explicitly asks for the sharp and pointy, then it's a valid choice to give it to them.)

dooglius · 2025-05-13T18:13:37 1747160017

I think we are in agreement? Odin works the way you describe, and GP in response expressed a preference that the compiler instead fail at compile time if it detected that memory had not been explicitly initialized; my response was to explain why this is not (in the general case) feasible.

reverius42 · 2025-05-13T19:37:14 1747165034

It may not be feasible in the general case by changing the compiler, but it's definitely feasible in the general case by changing the language. If you can't specify an uninitialized variable syntactically then you don't have to analyze whether it exists semantically.

trealira · 2025-05-13T18:10:43 1747159843

Somehow Rust is able to do it, though. Is it really that hard for compilers to do flow analysis to detect and forbid uses of uninitialized variables? Not even being sarcastic, I genuinely would like to know why more languages don't do this.

steveklabnik · 2025-05-13T20:26:32 1747167992

The flow analysis isn't particularly hard, but lots of languages simply don't do it because they don't allow uninitialized variables in the first place. Given null is a pretty common concept, you just say they're initialized but null, and you don't even need to do the analysis at all.

trealira · 2025-05-14T02:22:47 1747189367

Thanks for explaining. I guess that makes sense, but it surprises me: language implementors choosing to do the slightly worse option not due to technical limitations or historical baggage, but just because it's easier. I thought only C was like that. That's not to insult any of them; it's not like I'm the maintainer of a compiler for a major programming language.

steveklabnik · 2025-05-14T14:16:02 1747232162

I mean, not having uninitialized variables is simpler to understand, as a user. It’s not inherently about ease of implementation.

Furthermore, language designers are also human. Many of the languages used in industry weren’t even created by people who have a PLT background, or were created a very long time ago.

trealira · 2025-05-14T16:12:09 1747239129

I think it depends. Not having uninitialized variables by requiring the user to provide some value, as in expression-oriented languages, makes sense to me, because it wouldn't make sense in an expression for a variable not to have a value. An example of SML:

  1 + let val x = 1 in x * 2 end

That's both conceptually simpler and seems like one of the good solutions.

On the other hand, in languages where you don't have to explicitly initialize variables after declaring them, I think the best solution is to do flow analysis and forbid uses of uninitialized variables rather than implicitly set them all to null or zero. The latter solution just leads to confusing bugs if you ever don't initialize a variable in all branches (which is why I think that should be a compiler error). It's only a little better than C's solution of just letting the value be undefined, or in practice, a non-deterministic garbage value. An error message just seems more user-friendly, particularly for beginner programmers who are more likely to make mistakes like those and not immediately understand what the problem is.

I do get what you're saying about it being simpler, but I don't think it's conceptually simpler for the user to always initialize variables to zero or null.
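To make the difference concrete, here is a minimal C sketch. The function names and the shipping-cost tiers are made up for illustration. The first function writes out by hand what implicit zero initialization would do for you; the second declares the variable without an initializer and assigns it on every path, which is the shape a definite-assignment check would accept.

```c
#include <assert.h>

/* Hypothetical example: `cost = 0` stands in for a language that
   zero-initializes implicitly. The heaviest tier was forgotten, and
   nothing complains: the function quietly returns 0. */
int shipping_cost_zeroed(int weight_kg) {
    assert(weight_kg >= 0);  /* assumption: weights are non-negative */
    int cost = 0;
    if (weight_kg < 1) {
        cost = 5;
    } else if (weight_kg < 10) {
        cost = 12;
    }
    /* Missing: the weight_kg >= 10 case. */
    return cost;
}

/* No initializer, but every path assigns `cost` before it is read.
   Deleting the final `else` is exactly what a flow-analysis check
   would turn into a compile error. */
int shipping_cost_checked(int weight_kg) {
    assert(weight_kg >= 0);
    int cost;
    if (weight_kg < 1) {
        cost = 5;
    } else if (weight_kg < 10) {
        cost = 12;
    } else {
        cost = 30;
    }
    return cost;
}
```

If you delete the final `else` from the second function, C compilers can often warn about it (GCC has `-Wmaybe-uninitialized`, Clang has `-Wsometimes-uninitialized`), but it is only a warning and depends on the compiler and flags. The first function gets no diagnostic at all, which is the point.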

steveklabnik · 2025-05-14T17:03:09 1747242189

> On the other hand, in languages where you don't have to explicitly initialize variables after declaring them, I think the best solution is

I agree, for sure.

> I don't think it's conceptually simpler for the user to always initialize variables to zero or null.

I'm not saying zero, I'm just saying null. Because null is a valid value for that type, it's equivalent to

> Not having uninitialized variables by requiring the user to provide some value

trealira · 2025-05-14T17:13:00 1747242780

> I'm not saying zero, I'm just saying null.

What would happen to the primitive types of that language, then? When I said that, I was thinking of languages like Go and Java, both of which implicitly initialize integers to 0, booleans to false, floats to 0.0, and references to nil or null. That's what I'm calling less user-friendly than disallowing variables that aren't explicitly initialized before being used in all branches.

steveklabnik · 2025-05-14T17:42:22 1747244542

> What would happen to the primitive types of that language, then?

Not every language has them. A common example here is JavaScript, where `undefined` is a thing, but it's closer to null than to the C sense of undefined.

> When I said that, I was thinking of languages like Go and Java, both of which implicitly initialize integers to 0, booleans to false, floats to 0.0, and references to nil or null.

Right, that's zero initialization. I also agree that I think it's a mis-feature generally, but I think in the overall design space there's reasons you'd end up there. It's not the path I'd personally take, but I can see why people choose it, even if I disagree.

trealira · 2025-05-13T19:07:11 1747163231

This is a self-response, but I've thought of a case where it might be fairly difficult for a compiler to prove a variable is always initialized, because of the use of pointers. Take this function to copy a linked list in C:

  #include <stdlib.h>  // malloc, NULL

  struct node {
      struct node *next;
      int data;
  };

  struct node *copy_list(struct node *list_node) {
      struct node *new_list, **indirect;

      // indirect always points at the slot the next node goes into
      indirect = &new_list;
      while (list_node != NULL) {
          // Pretend malloc can't fail
          struct node *np = malloc(sizeof(*np));
          np->data = list_node->data;
          *indirect = np;
          indirect = &np->next;
          list_node = list_node->next;
      }
      // Terminates the list; this is also what sets new_list if the loop never ran
      *indirect = NULL;
      return new_list;
  }

The variable "new_list" is always initialized, no matter what, even though it's never explicitly on the left hand side of an assignment. If the while loop never ran, then indirect is pointing to the address of new_list after the loop, and the "*indirect = NULL;" statement sets it to NULL. If the loop did run, then "new_list" is set to the result of a call to malloc. In all cases, the variable is set.

But it feels like it would be hard for something that isn't a formal proof assistant to prove this. The equivalent Rust code (unidiomatic as it would be to roll your own linked list code) would require you to set "new_list" to be None before the loop starts.
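For comparison, here is a sketch of the same function in C written the way a definite-assignment checker would want it. The name `copy_list_init` is made up, and the `assert` on malloc is only a stand-in for real error handling. Every pointer gets a value at the point it comes into existence, so there's nothing left to prove through the `indirect` alias, and the trailing `*indirect = NULL;` is no longer needed.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int data;
};

/* Same algorithm as the original; the changes are the explicit
   initializers on new_list and on each new node's next pointer. */
struct node *copy_list_init(const struct node *list_node) {
    struct node *new_list = NULL;        /* explicit, covers the empty-list case */
    struct node **indirect = &new_list;  /* slot the next node goes into */

    while (list_node != NULL) {
        struct node *np = malloc(sizeof(*np));
        assert(np != NULL);              /* sketch only: not real error handling */
        np->data = list_node->data;
        np->next = NULL;                 /* list stays terminated after every step */
        *indirect = np;
        indirect = &np->next;
        list_node = list_node->next;
    }
    return new_list;
}
```

The cost is one store to `new_list` that gets overwritten whenever the list is non-empty, plus one store per node. That's the trade the flow-analysis approach asks for: a redundant write in exchange for not having to reason about aliasing.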