
Tangential:

> Over the years, processors and C compilers have gotten much better using a couple of techniques. These include pipelining, inlining functions, and branch prediction. Unfortunately, the parsers generated by most parser generators make it difficult for any of these techniques to apply. Most generated parsers operate with a combination of jump tables and gotos, rendering some of the more advanced optimization techniques impotent. Because of this, generated parsers have a maximum performance cliff that is extremely difficult to overcome without significant effort.

Although table-driven generation of parsers, and of finite automata in general, is common, it has long been recognized that encoding the automaton as tables/data (as opposed to generating executable source code directly) is not a good idea, precisely because it inhibits compiler optimizations. I think the current situation is simply a consequence of the fact that parsing abruptly stopped being sexy multiple decades ago.
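
To make the table-versus-code distinction concrete, here is a minimal sketch in C of the same small automaton written both ways. It is my own illustration, not the output of any real parser generator: the language recognized (`[0-9]+(\.[0-9]+)?`), the state names, and the functions `match_table` and `match_direct` are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical DFA for [0-9]+(\.[0-9]+)? ; all names are illustrative. */
enum { S_START, S_INT, S_DOT, S_FRAC, S_DEAD, N_STATES };
enum { C_DIGIT, C_DOT, C_OTHER, N_CLASSES };

static int is_digit(char c) { return c >= '0' && c <= '9'; }

static int classify(char c) {
    if (is_digit(c)) return C_DIGIT;
    if (c == '.') return C_DOT;
    return C_OTHER;
}

/* Table-driven form: the automaton is data. Every input byte costs a
 * dependent load from this table, and the compiler sees only an opaque
 * array lookup rather than the control flow of the automaton. */
static const unsigned char next_state[N_STATES][N_CLASSES] = {
    [S_START] = { S_INT,  S_DEAD, S_DEAD },
    [S_INT]   = { S_INT,  S_DOT,  S_DEAD },
    [S_DOT]   = { S_FRAC, S_DEAD, S_DEAD },
    [S_FRAC]  = { S_FRAC, S_DEAD, S_DEAD },
    [S_DEAD]  = { S_DEAD, S_DEAD, S_DEAD },
};

int match_table(const char *s) {
    assert(s != NULL);
    int state = S_START;
    for (; *s != '\0'; ++s)
        state = next_state[state][classify(*s)];
    return state == S_INT || state == S_FRAC;
}

/* Direct-coded form: each state is a position in the code, so the
 * current state lives in the program counter. The compiler is free to
 * inline is_digit, keep everything in registers, and emit ordinary
 * conditional branches. */
int match_direct(const char *s) {
    assert(s != NULL);
    if (!is_digit(*s)) return 0;      /* S_START: need at least one digit */
    while (is_digit(*s)) ++s;         /* S_INT: consume the integer part  */
    if (*s == '\0') return 1;         /* accept a plain integer           */
    if (*s != '.') return 0;
    ++s;                              /* S_DOT: a digit must follow       */
    if (!is_digit(*s)) return 0;
    while (is_digit(*s)) ++s;         /* S_FRAC: consume the fraction     */
    return *s == '\0';
}
```

Both functions accept the same strings (`"12"` and `"12.5"` match, `"12."` and `".5"` do not). Whether the direct-coded version is actually faster depends on the compiler, the target, and the size of the automaton, so treat this as an illustration of the shape of the two approaches rather than a benchmark.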

Much better parser generators are possible, and LR, specifically, has much untapped potential left.


