c4x86 - JIT compiler for x86 in 86 lines

userbinator · on Dec 14, 2014

This is amazingly elegant and concise. I've heard a lot of people, including the compiler academics, claim that even the simplest non-optimising compiler for x86 would be extremely complex; this is a great counterexample to that, alongside (O)TCC. It looks like it's using mainly two registers only for code generation, eax and ecx, and the stack.

Jack Crenshaw's "Let's build a compiler" tutorial series uses the same code generation method but with eax/edx (http://www.pp4s.co.uk/main/tu-trans-comp-jc-intro.html ); not sure if he was the first one to come up with it, however. Incidentally, I think his tutorial series is one of the best ones I've read for understanding how compilers work, even better than all the "traditional" literature like the Dragon Book.

The other article that'll probably be very helpful for those understanding how the rest of this compiler works is http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#climbing which describes the "precedence climbing" way of parsing expressions.

Compilers are often thought of as "magic" that only very few people can understand, but I don't think that has to be true in general. While "production-quality" compilers like gcc are certainly very complex, the essence of a compiler is really quite simple. However, this wouldn't be the impression one gets by reading a lot of the compiler books out there --- the heavy emphasis on theory in the majority of them has a somewhat obfuscating effect. They also tend to focus a lot more on compiler-compilers (Lex, Yacc) and more general parsing techniques like LR, while spending very little on the simple methods of syntax-directed translation and recursive descent. AFAIK precedence-climbing is not even described in the Dragon Book.

abecedarius · on Dec 14, 2014

I've never seen precedence climbing in any compiler textbook, unless you count the lcc book. (I learned of it instead from an unconventional algorithms class by Dave Gillespie, who wrote Gnu Calc.) You can see how it gets squeezed out of the curriculum since you can write a conventional parser just fine without it, but it's a loss -- when you see it how can you not go "Oh yes, that's sweet, that's just right"?

Crenshaw's series turned me off when the first parser he showed would incorrectly accept incorrect inputs, and it could've been done rightly just as easily -- or so it looked to me. I did not test it or read further, and since it's so often recommended I seem to've missed something.

P.S. precedence climbing in Python: https://github.com/darius/sketchbook/blob/master/parsing/pre... . Pratt's top-down operator precedence parsing is essentially an object-oriented way to express the same algorithm: https://github.com/darius/sketchbook/blob/master/parsing/pra...

statictype · on Dec 15, 2014

Incidentally, I think his tutorial series is one of the best ones I've read for understanding how compilers work, even better than all the "traditional" literature like the Dragon Book.

Totally agree. I was interested in how compilers work when in high school - long before I heard of the dragon book and had no idea how to even start on making a parser and code generator.

When he walked through a basic recursive descent parser, it was like a light bulb went off over my head - such a simple and elegant solution to a otherwise seemingly tricky problem.

afandian · on Dec 13, 2014

Previous discussion on C4: https://news.ycombinator.com/item?id=8558822

Looks like this is a fork of that.

tptacek · on Dec 13, 2014

That discussion makes me a little sad for HN. Although 'userbinator has a fantastic comment in there.

sitkack · on Dec 13, 2014

link to userbinator's comment https://news.ycombinator.com/item?id=8560974

dmytrish · on Dec 13, 2014

Yes, that's a fork of c4 with virtual machine replaced by x86 code generation and running the compiled code.

chubot · on Dec 13, 2014

What prevents c4x86.c from compiling with c4.c? Is it the <sys/mman.h> header? (Note the "not self-hosted" comment in the README.)

dmytrish · on Dec 13, 2014

It's `mmap` call and syntax of function pointer at the end of the file (needed to call the compiled code).

P.S. Otherwise, it is ready to run itself, work in progress.

userbinator · on Dec 14, 2014

Allowing it to call arbitrary functions using dlopen()/dlsym() would be the idea I have for fixing the first problem - it'd also increase its power substantially since it can then make use of any other library's functions instead of the currently hardcoded list. Supporting function pointers would be a bit harder and involve changing the parser/code generator.

hyc_symas · on Dec 14, 2014

How would you tell it which shared library to dlopen?

Was looking into changing the order of arguments to go the right way 'round. Also changing address refs to be PC-relative, to avoid the relocation phase. Seems pretty straightforward.

dmytrish · on Dec 14, 2014

I thought about `dlopen`, but did not know how to fetch it libc platform-independently (`dlopen(argv[0])` is fine on OS X, but not on Linux).

Now I see that the first argument can be NULL, that approach is feasible.

joebo · on Dec 14, 2014

This greatly helped my understanding of how compilers work.

How different would the implementation be for x64? Would it primarily be a matter of translating opcodes and calling conventions? For example, pop ecx is 59 [1], and on x64 it looks like it would be pop rcx, which is also 59 [2][3].

[1] - http://sparksandflames.com/file/x86InstructionChart.html

[2] - https://defuse.ca/online-x86-assembler.htm#disassembly

[3] - http://msdn.microsoft.com/en-us/library/windows/hardware/ff5...

userbinator · on Dec 14, 2014

For x64 under both *nix and Windows sizeof(int) remains at 4 while the sizes of pointers increase to 8 bytes (64 bits), so anything in the compiler that depends on those sizes would also have to change. As for the opcodes, with x86/x64 it's a little weird since 64-bit mode actually defaults to 32-bit data sizes and uses mostly the same opcodes as 32-bit mode, with a prefix byte needed to override this to 64 bits. Addresses default to 64 bits, however. The 32-bit environment is "nice" in that pointers and integers are the same size; not true for most 64-bit in general (http://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_mo... )

But still, trying to fit the x64 version in 64 lines is probably going to be the harder problem. ;-)

joebo · on Dec 14, 2014

This also works on win32 under mingw32. I forked and added support for it: https://github.com/joebo/c4/blob/master/c4x86.c

jrockway · on Dec 13, 2014

By 86 lines, do they mean 542 lines?

tptacek · on Dec 13, 2014

Aside from 'dmytrish (the author of this fork)'s point about the code generator: this is extraordinarily tight C code. There's no fat on it anywhere, unless you want to call support for the ternary operator "fat". It's not obfuscated or artificially shrunk down at all. It's just sort of perfect.

I took an hour to read through it and comment the hell out of it, and it was some of the most pleasant reading I've done all year. I really, really like this code.

dmytrish · on Dec 13, 2014

I meant the code generator size: it starts at line 448 and ends at 535, so its size is about 86 lines. The rest of the code is taken from original c4 and is not changed.

tomcam · on Dec 13, 2014

..and SNAP! Great work, dmytrish. This is a godsend to people trying to fight their way through traditional compiler texts. Thanks for a truly impressive contribution to the language dev world.

dmytrish · on Dec 13, 2014

Thank you. rswier's c4 was a great inspiration for me. I am glad If my humble two cents are helpful for somebody.

jrockway · on Dec 13, 2014

I kind of noticed that, though lines(c4x86.c) - lines(c4.c) is closer to 30.

petercooper · on Dec 13, 2014

86 in base 67.

haberman · on Dec 14, 2014

If you enjoy this sort of thing you might enjoy my Brainfuck micro-JIT: http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-s...

sitkack · on Dec 13, 2014

Here is another example of a microjit, this time in Python https://gist.github.com/leegao/1073233

cogburnd02 · on Dec 13, 2014

Is the upshot of this sort-of like 'tcc -run'?

dmytrish · on Dec 13, 2014

At a quick glance, yes, `tcc -run` seems to output machine code into memory for running it, so yes.

cogburnd02 · on Dec 13, 2014

Cool. Is that all that's required to use it instead of this?

  #!/usr/bin/tcc -run

Or does doing that require weird stdin-related Magic?

bithush · on Dec 14, 2014

That is some beautiful code.