Funny - when looking at a codebase for the first time, I do almost exactly the same thing as described by Mr Seibel: I start rewriting it. I rename functions or methods that I think have poorly chosen names, I rename fields, variables or parameters for the same reason, I refactor, restructure, and reformat the code to look like I think it should look, and so on.
That sounds like it could be really beneficial to understanding a piece of code, but it seems like it would only be feasible if you were working alone, taking code from somewhere else and completely absorbing it into a new project like Toot and Whistle or into some other existing project. Most of the times I've needed to ramp up on some code have been either at a new job or before contributing to an existing project.
Would you do this after starting at a new job, and make this your first commit? Or before contributing to open source?
I could envision some awkward social problems arising there. If you kept that code to yourself, but continued working on the old code, that would probably be frustrating.
I'm just curious because I'm really attracted to the idea of this method but am not sure if it would really work where I'd want it to.
As I think I mentioned in the interview, I've found that if I do this, by the time I'm done with my rewrite, I actually understand the original code too. So if I had to, I could throw away my new (better?) code and still benefit from a better understanding of the original code.
Funny, it's something I've always stopped myself from doing, wondering whether static analysis (call graphs and such) wouldn't be better.
<sidenote>
There should be a site with a substantial piece of code to explore, where people report the main (3-5) steps they took to ~understand it and how long it took.
I do this whenever I read complicated pieces of code. It's enormously helpful, even if I don't keep my changes around.
Some of the time the changes are broadly beneficial (like taking a multi-thousand-line file and adding some organizational structure) and it makes sense to commit them upstream. Some of the time the changes are personal preference and aid only in your own understanding of the code.
As with most things, the best approach is to use your judgement, not get too attached to your own changes, and to understand that context matters.
Same here. Sometimes I come across code that's not easily improved that way, and not just because it's fragile -- this seems less rare now than it was, say, 15 years ago. Does it seem that way to the rest of you? That things are getting better?