Thanks for the tip. I was dubious, I tried GPT 5.2 for a start on a large plan and it was way better than reviewing it with Claude itself or Gemini. I then used it to help me with feature I was reviewing, it caught real discrepancies between the plan and the actual integration!
- initially it wasn't working, plenty of parent/child relationships problems like described in the post
- so I designed a thin a wrapper and used sealed classes for events instead of dynamic spans + some light documentation
It took me like a day to implement tracing on the existing codebase, and for new features it works out of the box using the documentation.
At the end of the day, leveraging typing + documentation dramatically constrains LLMs to do a better job
reply