I've been arguing lately that C++ should do something like this: Add a full-on theorem-prover to the language so that I can write my own safety rules specific to how my program works.
In the Cloudflare Workers runtime we have a sort of analogous problem to what Blow has in his game engine. We have certain objects that live on the JavaScript heap (which is garbage collected), or are directly owned by objects on the JS heap, and then we have system objects involved in basic I/O that aren't necessarily tied to a particular JS isolate and are not garbage collected, but are often tied to specific OS threads. JS objects and system objects are allowed to own instances of each other, but the ownership must be handled in a special way. E.g., when a system object holds a reference to a JS object, the destructor must ensure the JS isolate lock is held before releasing that reference. Conversely when a JS object holds ownership of a system object, and the JS object is garbage collected, we need to queue the system object's destructor to run in the thread where it was created.
Conceptually, it ought to be easy to write rules where each type is declared as being a system type or a JS type, and then the compiler could trivially enforce that the correct sort of references are used between them. But, the problem is very specific to our codebase, so it would make no sense to bake into the language.
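For illustration, here's roughly the shape of that rule in Rust trait terms -- a minimal sketch with hypothetical names, nothing like our actual code:

```rust
// Each type is tagged as belonging to exactly one world, and
// cross-world ownership has to go through a wrapper whose destructor
// does the required dance. (All names hypothetical.)
trait JsType {}
trait SystemType {}

struct JsObject;
impl JsType for JsObject {}

struct Socket;
impl SystemType for Socket {}

// System-side handle to a JS object; Drop would take the isolate lock
// before releasing the reference (stubbed out here).
struct JsRef<T: JsType>(T);
impl<T: JsType> Drop for JsRef<T> {
    fn drop(&mut self) { /* lock isolate, then release */ }
}

// JS-side handle to a system object; Drop would queue destruction on
// the thread that created the object (stubbed out here).
struct SystemRef<T: SystemType>(T);
impl<T: SystemType> Drop for SystemRef<T> {
    fn drop(&mut self) { /* queue to owning thread */ }
}

fn main() {
    let _ok = JsRef(JsObject);
    let _also_ok = SystemRef(Socket);
    // let _bad = JsRef(Socket); // error: `Socket` doesn't implement `JsType`
}
```

Of course, nothing in that sketch stops someone from holding a plain reference that bypasses the wrappers entirely -- and that's exactly the codebase-specific rule no off-the-shelf compiler will check for you.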
I think the next big thing for C++ should be adding ways to define custom code-checking rules like this. It would allow C++ to become a safe language and would fit perfectly with C++'s culture of being ridiculously overcomplicated. ;)
VCC[1] was a substantial piece of research from MSR aimed at supporting the verification of C (not C++) via a theorem prover. However, a study[2] of its application to parts of the Hyper-V code-base reported some practical difficulties, particularly around prover performance in the developer workflow.
Still, fascinating to see work like this on real-world code-bases with established languages. You can also see very similar work in languages like Dafny[3].
Adding verification once you already have code is generally more difficult, though in the case of VCC it was helpful to have a large example code base to make sure the tool would cover the major patterns, especially those relating to concurrency primitives.
> I've been arguing lately that C++ should do something like this: Add a full-on theorem-prover to the language so that I can write my own safety rules specific to how my program works.
I really like this idea. I’ve been messing around with the z3 theorem prover, and while I don’t think there’s an easy way to integrate it to do compile-time checks, I made a little wrapper class that lets you add assertions which are checked for satisfiability at the end of a function’s scope. But it’s all at runtime, so it’s more like a more advanced assert() than anything like Jai or Rust has.
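Roughly the shape of that wrapper, sketched in Rust against the z3 crate (hypothetical code, not what I actually wrote):

```rust
use z3::ast::Int;
use z3::{Config, Context, SatResult, Solver};

// Collect constraints during a scope; on drop, check that they're
// still mutually satisfiable -- a beefed-up assert(), all at runtime.
struct ScopeChecker<'ctx> {
    solver: Solver<'ctx>,
}

impl<'ctx> ScopeChecker<'ctx> {
    fn new(ctx: &'ctx Context) -> Self {
        Self { solver: Solver::new(ctx) }
    }

    fn require(&self, cond: z3::ast::Bool<'ctx>) {
        self.solver.assert(&cond);
    }
}

impl<'ctx> Drop for ScopeChecker<'ctx> {
    fn drop(&mut self) {
        match self.solver.check() {
            SatResult::Sat => {}
            _ => eprintln!("scope assertions are unsatisfiable!"),
        }
    }
}

fn main() {
    let ctx = Context::new(&Config::new());
    let check = ScopeChecker::new(&ctx);
    let x = Int::new_const(&ctx, "x");
    check.require(x.gt(&Int::from_i64(&ctx, 0)));
    check.require(x.lt(&Int::from_i64(&ctx, 10))); // fine: satisfiable
    // check.require(x.gt(&Int::from_i64(&ctx, 10))); // would trip the Drop check
}
```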
I think SMT solvers are criminally underused and have a lot of potential to help solve some of these issues in a way that still allows rapid iteration at first, with incremental safety checks added later.
I like this idea too, but bear in mind that the world of SMT in general (and Z3 is no exception) seems to have a lot of cosmetically similar problem formulations, some of which are trivial and some of which will take forever to solve.
The honeymoon with SMT is often brief. You get through the whole 'hooray, now I can nuke Sudoku and those "Mr Brown lives on the same street as the baker, who has a white house" puzzles' phase, convinced that all the world's problems will fall to your awesome new abilities. Then you put some basic quantifier problems into Z3 or whatever and watch the prompt go away, apparently permanently.
I'd love it if there were more "intermediate" level documentation sets on SMT - aimed not at people who have just started using SMT, but also not aimed at people who are going to start doing SMT research.
> I'd love it if there were more "intermediate" level documentation sets on SMT - aimed not at people who have just started using SMT, but also not aimed at people who are going to start doing SMT research.
Agreed. I've been doing some research toward creating a SAT solver and found surprisingly little learning material. I'd love some recommendations from the HN crowd.
You might want to look at the Why3 meta-solver framework. It uses an intermediate language, Why3ML, that can be run against an array of different SMT solvers, including Z3 and CVC4. If you run into something that SMT can't figure out in a reasonable time-frame, you can use Coq to manually prove your theorems too.
Languages like SPARK and Frama-C generate Why3ML as an intermediate output that then gets transformed to run against whatever solvers are available during the verification phase of development. Since the verification/proof phase is separate from compilation, you can write your code with fewer checks initially, and then add conditions as you go to prove functional correctness.
I'd be surprised if C++ is ever touted seriously as a language suited for formal verification. I think the language just has too many corners and escape hatches to be able to nail down its semantics in a way that automated provers can handle.
Having said that though, I'm pretty sure the MSVC Driver Verifier tool uses an SMT solver to try and prove the absence of certain types of errors, so who knows?
Wouldn't Frama-C and some of its plugins do this job? I've seen (I think?) Airbus (working with Atos?) people write custom Frama-C plug-ins to implement some static analysis & proof on their code, with great success.
Would dependent types help in this case?
Encode the kind of reference into the type when creating it, and then you could have the compiler check that the contract holds.
It might be expressible with dependent types. I think what I'm suggesting ought to be able to express a superset of dependent types.
I think a lot of type theory tries too hard to have certain "desirable" properties, like being decidable (that is, type checking will always complete in finite time). IMO it's fine if there are cases that will go into an infinite loop, as long as the developer has tools for debugging and fixing them. After all, we rarely use programming languages where the actual programs aren't allowed to loop -- such languages exist, but they are far too restrictive to be useful for most purposes.
Maybe what I want is not even a theorem prover, but just a way to write little scripts that execute in an imperative fashion, which can iterate over the code and look for errors. Why does it have to be a fancy logic language that only academics understand?
Dependent types prevent infinite loops at compile time because if you have nonterminating terms, your types are unsound as a logic (you can derive false propositions just as easily as true ones).
That's because theorem proving with dependent types is based on the idea that you can prove something by providing a proof object that is a witness that what you want to prove is true. But a nonterminating function can fake it: you can write a function that promises a proof that something is true, while it never returns (so it never has to build a proof object). That way you can prove that false things are "true".
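Here's the idea in Lean, as a sketch -- the termination checker exists precisely to reject this:

```lean
-- If this definition were accepted, `bogus` would be a "proof" of False
-- that never actually constructs a proof object.
def bogus : False := bogus  -- error: fails the termination check
```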
Does this non-terminating function run at compile time or at runtime?
If it runs at compile time, and it doesn't terminate, then presumably no executable is generated? That's good enough for me.
Or are you describing a function that checks some property at runtime, which the type system then depends on afterwards? But if that function never returns, then the subsequent code that depends on the property it was checking won't ever execute. So a false thing is only proven true in code that never executes. That seems fine to me.
(Also note that as a practical matter I mostly don't care if it's possible for a developer to maliciously trick the type system. I only care about catching accidental errors. Relying on a complex type checker for sandboxing of malicious code seems too risky.)
I would recommend the entire episode; it was very good. Only a small part is about Rust.
I’ve listened to plenty of interviews and talks/rants by Jonathan Blow, and while I think he has a lot of interesting ideas, he also has a well earned reputation for being very blunt in how he expresses his opinions. Bryan Cantrill was an excellent interviewer, and managed to bring out the best side of Blow.
The only specific statement about Rust quoted in that thread appears to be:
> And so Rust has a good set of ingredients there. The problem that I have with it is when I'm working on really hard stuff, I don't exactly know what I'm doing for a long time. And so if the cost of experimentation is driven too high, it actually impairs my ability to get work done.
The rest seems to be a discussion about the problems of memory management in general. It's not clear how much Blow has actually used Rust.
I'm finding exactly the same problem with Rust -- I do exploratory programming in GAP (a Python-like language), then rewrite in Rust once I'm confident I have all the basic data structures and code down. The compile speed of Rust alone is painful.
Also, when converting C++ code to Rust, I find many places where I'm having to de-optimise code to get it past the borrow checker. Large cyclic data structures which are freed as one piece are annoyingly difficult in Rust.
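The standard workaround I've found (a sketch with hypothetical types, not my actual code) is to link nodes by index into one arena, which at least keeps the freed-as-one-piece property:

```rust
// Nodes reference each other by index into `nodes`, so cycles are fine
// and the entire structure is freed when the Vec drops.
struct Node {
    next: usize, // may point "backwards", forming a cycle
}

fn main() {
    let nodes = vec![Node { next: 1 }, Node { next: 2 }, Node { next: 0 }];
    let mut i = 0;
    for _ in 0..6 {
        print!("{} ", i); // prints "0 1 2 0 1 2"
        i = nodes[i].next;
    }
    println!();
} // `nodes` dropped here: the whole cycle goes in one piece
```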
There's a lot more in the podcast that's discussed in the comments as well, re: linked lists, multiply-owned data structures, and common goals of Rust and the Jai compiler team, but I got lazy with transcribing the audio by hand (it's harder than it sounds!) and left it at that. Besides which, the parts of the episode that weren't about Rust were excellent!
A hacky solution is to point your speakers at a device running Google Docs with speech-to-text turned on, but I haven't found anything that handles multiple speakers well - especially with overlapping voices, as happens often in this episode.
Yup, overlapping voices would be a problem. I was thinking more along the lines of you playing the podcast to yourself via headphones and repeating what Jonathan Blow says into your speech-to-text program. :)
About a year ago I really devoured Jonathan Blow's opinions. He is definitely correct in many, many aspects. I've been paying less and less attention lately because he explains the problems of virtually any language, then explains how his language solves it all perfectly. I am personally incapable of taking firesale arguments like that seriously.
I too have many great ideas, but you need to be honest about the limitations of your ideas. Jai seems like the answer for game development, but I would not trust a browser written in a language that aspires to be fast at any cost.
Rust has a place alongside Jai, they don't have to compete.
> I've been paying less and less attention lately because he explains the problems of virtually any language, then explains how his language solves it all perfectly.
I seriously doubt that he would say that his language "solves it all perfectly." He's still making changes to his language to fix things that don't work right or to change his language to implement features that either didn't exist before, or don't solve the problem as he now sees it (which is different than how he saw it when he initially implemented that feature).
> I am personally incapable of taking firesale arguments like that seriously.
I would challenge you to understand how Jonathan Blow talks. He says what he says, and nothing more. He does not give you things to read "between the lines" or anything like that. He says what he says, and if he did not say some thing, he did not mean that thing, barring mistaken speech, of course. He also continually learns and adapts what he says, which can appear as if he says one thing on one occasion and something different on another, which is of course what happens to anyone who talks about things as they are learning those things.
> Rust has a place alongside Jai, they don't have to compete.
This is, of course, correct. Jon mentions Rust as he does because it doesn't solve the problems that he (specifically) has. Rust is just not an option for him, so he's creating something that is an option for him, and for other people who value the things he values in a language. I don't think Jon will lose any sleep at all if you don't like his language once it is generally available.
Here's a link to the actual podcast he appears in. Not my most productive almost-3 hours while I listened to that. Lots of references in the show notes on that page too:
He also has some things to say about the (wrong in his opinion) direction we've taken in moving to managed and dynamic languages, in that they're not necessarily higher level languages, just more removed from the CPU.
It just seems really unconvincing to me. Just don’t do data modeling with entity pointers at all, and use patterns like RAII to guarantee that resources manage the lifetime of pointers they need. Or use pure functional programming if it matters that much in a given use case.
When the article says the compiler doesn’t know whether a given pointer is meant to be an entity pointer or not, but that you do as the programmer (and thus should use complex metaprogramming facilities to encode correctness of handling semantics of that pointer), it loses me completely. Those are code smells of bad program structure (especially creating the metaprogram to enforce a type of correctness chosen by the programmer, and highly susceptible to YAGNI premature abstraction).
In response to “boy it’s hard to guarantee keeping track of all these things” the bad response is to take on more code liability in the form of metaprogram “safety” enforcement. The good response is to ruthlessly remove entity-style pointer handling entirely and restructure the program to simply not need it.
> highly susceptible to YAGNI premature abstraction
I think in most cases this sentiment is correct, but Blow's argument is that you are going to need it. In other talks he has put forth use cases where this has worked for him and his game dev studio.
The most abstract version of this argument:
1. A compiler is a tool to help the programmer
2. Compilers for the Rust programming language help the programmer by validating that the program falls within a subset of all computable programs that operate within the language's rules (the borrow checker and type system)
3. There are likely many programs that are computable and correct but cannot be modeled within Rust's safety system (see most of the world's software to date)
4. For some use cases the most optimal solution may not fall within the subset of programs that Rust allows
5. Because of this, it may be impossible to implement the most optimal solutions for some programs in Rust.
Blow's goal of making compilation describable within the language itself brings forward a huge set of features that other languages need a runtime or complex compiler to do. Things like code generation, reflection, metaprogramming, linting, and correctness validation could simply be packages you install with a package manager. If the language is kept at a minimal feature set and complexity with an advanced enough metaprogramming phase entire language features and abstractions can be implemented in this step. For example, imagine being able to do something like `import grpc; import application.proto` and all of the gRPC magic happening without any complex build system at the full native speed of the compiled language with no runtime deps.
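Rust gets partway there today via build scripts rather than in-language compile-time execution -- e.g. a build.rs that, assuming the tonic-build crate and a hypothetical proto/application.proto, generates the gRPC glue before compilation:

```rust
// build.rs -- cargo runs this before compiling the crate; the generated
// gRPC client/server code is then pulled in by the crate proper.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    tonic_build::compile_protos("proto/application.proto")?;
    Ok(())
}
```

But that's still a separate build step with its own dependencies, which is exactly Blow's point about wanting it in the language itself.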
I'm not saying Blow is correct, I'm just saying it is a very interesting direction to take programming languages: weak guarantees but completely programmable.
> Blow's goal of making compilation describable within the language itself brings forward a huge set of features that other languages need a runtime or complex compiler to do. Things like code generation, reflection, metaprogramming, linting, and correctness validation could simply be packages you install with a package manager. If the language is kept at a minimal feature set and complexity with an advanced enough metaprogramming phase entire language features and abstractions can be implemented in this step.
I know you're not advocating for Lisp, but it kind of sounds like you are.
I love Lisp as much as the next neckbeard, but macros aren't unique to Lisp and they aren't the only kind of compile-time execution.
It's tough to come up with a better syntax than sexprs for writing macros in, that's clearly true. A number of recent languages have managed good-enough macros, Nim and Rust being two examples.
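For flavor, a trivial declarative macro in Rust -- hygienic and expanded at compile time, though nowhere near as seamless as sexprs (hypothetical example):

```rust
// A toy convenience macro; the compiler expands it before type checking.
macro_rules! max3 {
    ($a:expr, $b:expr, $c:expr) => {
        std::cmp::max($a, std::cmp::max($b, $c))
    };
}

fn main() {
    println!("{}", max3!(1, 7, 3)); // prints 7
}
```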
I have not listened to the podcast, but the example regarding entity pointers not outliving the frame would be something Rust is really good at.
The borrow checker can make sure that references to entities are strictly contained within the lifetime of a frame container and statically check that aspect.
> Because what’s an entity pointer vs a pointer to something else? The compiler doesn’t know that
Rust's compiler can know the difference, and it can check lifetimes around that. Lifetimes can be made part of the type of the pointer.
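For instance, here's a sketch with hypothetical types (not from the podcast) where entity references carry the frame's lifetime, so the borrow checker rejects any reference that outlives the frame:

```rust
struct Entity {
    hp: u32,
}

struct Frame {
    entities: Vec<Entity>,
}

// By lifetime elision, the returned reference is tied to `frame`.
fn first_entity(frame: &Frame) -> &Entity {
    &frame.entities[0]
}

fn main() {
    let frame = Frame { entities: vec![Entity { hp: 100 }] };
    let e = first_entity(&frame);
    println!("hp = {}", e.hp);
    drop(frame); // the "frame" ends here
    // println!("{}", e.hp); // error[E0505]: cannot move out of `frame`
    //                       // while `e` still borrows it
}
```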
Pointers living through a render pass aren't the main use case he is interested in. It's a problem space that many engineers are familiar with, so it's easy to talk about. Game developers have some very domain-specific problems that are less common outside of game engines. His opinions are laid out in more detail here, where he goes into his experiences porting The Witness to mobile (I think): https://www.youtube.com/watch?v=4t1K66dMhWk
I think most of the time you're right, but in game programming in particular you know the optimizations you're going to need, since the patterns have been pretty well established by now. And those patterns and optimizations are pretty low-level, which requires you to really understand how things are laid out in memory, and Rust might not give you great insight into how it lays things out (or it's not as well known or talked about...).
You can lay out your types exactly how you want in Rust, just by using the `#[repr(C)]` or `#[repr(packed)]` attributes to guarantee an ordering of the fields. So,
> Rust might not give you great insight into how it lays things out
is not true, and this is one of the first things you will learn when you try any sort of FFI.
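For example, a sketch with a hypothetical vertex type -- the layout is exactly what a C compiler would produce, and you can assert it:

```rust
// #[repr(C)] guarantees field order and C-compatible padding.
#[repr(C)]
struct Vertex {
    pos: [f32; 3], // offset 0
    uv: [f32; 2],  // offset 12
    color: u32,    // offset 20
}

fn main() {
    assert_eq!(std::mem::size_of::<Vertex>(), 24);
    assert_eq!(std::mem::align_of::<Vertex>(), 4);
}
```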