The preprocessor's got nothing to do with it. As you've noted, the raw size of the source code is easily dealt with. (At least for parsing. We still haven't identified where it's being stored in the first place.) The problem is that when targeting an embedded system, you really want compiler optimizations turned on (especially size optimizations), and the analyses behind those optimizations need more RAM than a typical router has. Ideally, you would compile large parts of the OS with link-time optimization, which is one of the most memory-intensive options you can pass to a compiler, since the optimizer then works on much of the program at once instead of one source file at a time.
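If you want to see the memory cost for yourself, here is a minimal sketch that runs a command and reports the peak resident memory the kernel recorded for it. It assumes a Unix-like system and Python 3.9 or later; `peak_rss` is just a name I made up for this example. Note that the unit of `ru_maxrss` depends on the platform (kilobytes on Linux, bytes on macOS).

```python
import os


def peak_rss(argv):
    """Run argv to completion and return (exit_code, ru_maxrss).

    ru_maxrss is the peak resident set size the kernel recorded for the
    child; its unit is platform-dependent (KiB on Linux, bytes on macOS).
    """
    # Spawn the command, looking it up on PATH like a shell would.
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    # wait4 reaps exactly this child and hands back its resource usage.
    _, status, usage = os.wait4(pid, 0)
    return os.waitstatus_to_exitcode(status), usage.ru_maxrss
```

With GCC, for instance, you could compare `peak_rss(["gcc", "-Os", "foo.c", "bar.c", "-o", "app"])` against the same command with `-flto` added (the file names are placeholders). Measure a command that includes the link step, because with `-flto` most of the optimization work is deferred until link time.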