I did a bit of digging into why you think agentic coding is “not there yet”, and I think you are bashing a tool you have very little experience with and are using somewhat incorrectly.
Nothing wrong with that, except that, unlike with any other tool out there, agentic coding gets approached by smart senior engineers who would normally spend time reading documentation and understanding a new package/tool/framework before drawing conclusions, yet here they jump straight to “I spun up Claude Code and it’s not working”. Dunno why the same level of diligence isn’t applied to agentic coding as well.
The first question I always ask such engineers is “what model have you tried?”, and it always turns out to be a non-SOTA model on tasks that are not simple. Have you tried Claude Opus?
Second question: have you tried plan mode?
And then I politely ask them to read some documentation on using these tools, because the simplicity of the chat interface is deceptive.
Definitely didn't want to imply that the author is a bad engineer; quite the contrary, he seems like a very good one. Apologies if it came across that way.
Just that many brilliant engineers such as the author test agentic tools without the same level of thorough understanding they bring to other software engineering tools they try out.
It doesn't look like you addressed the issues raised in the article. E.g., see the "my experiences interviewing candidates" section, which shows this isn't just the author's problem (and that's just one example from one section of an article that covers various things).
I always wonder what the purpose of posting these generic, superficial defenses of a certain form of LLM-based coding is.
That was a different matter altogether. I agree though that I didn't touch on that.
My experience is different in that case, but it certainly depends on the type of technical challenge, the programming language, etc.
Candidates who perform better or worse exist with and without agentic coding tools. I've had positive and negative experiences on both fronts, so I'd attribute the OP's experience to the N=1 problem, and perhaps to the model's jagged intelligence.
I work mostly in TypeScript, and it's well known that models are particularly well versed in it. I know that other programming languages are less well supported because there is less training data for them, in which case models could be worse with them across the board (or some SOTA models could be better than others).
Fanvue | AI Internal Developer Tooling Engineer | REMOTE (London based, GMT-5 to GMT+5 preferred) | Full-time
One of the coolest roles we've had: experiment with cutting-edge AI tools to make talented engineers faster and more effective. Hands-on, experimental, operating at the frontier.
Fanvue is a fast-growing creator monetisation platform — AI-powered, creator-first (Series A, $100M+ ARR, triple-digit YoY growth), supporting hundreds of thousands of creators and millions of fans.
What you'll do: Define and roll out our AI-assisted dev stack (Cursor, Claude Code, etc.) across IDEs, PR workflows, and CI. Build MCP servers and AI-ready infra for context-aware code generation. Enable parallelised workflows, agent skills, and AI-assisted reviews. Establish best practices for safe, consistent AI-assisted coding. Track and report measurable impact. Evaluate emerging tools with clear success criteria.
Who you are: Strong software engineering background with TypeScript in production. Hands-on with AI coding tools (Cursor, Claude Code, Aider, Copilot). Experience with MCP servers. Solid LLM fundamentals (RAG, embeddings, fine-tuning). Comfortable with GitHub workflows and CI/CD. Autonomous, experimental, outcome-focused.
You'll thrive if you want to make engineers dramatically more effective, enjoy rapid experimentation, and own outcomes. You'll struggle if you prefer fixed roadmaps or need heavy direction.
Why Fanvue: Define responsible AI at scale, direct impact on velocity and quality, access to top AI tools, competitive salary, unlimited holiday, remote-first, flexible hours.
I opened it in the morning while my partner was sleeping next to me.
I saw the big red “press me” button sitting in a corner, and when I pressed it, a rich farting sound blasted out of my phone’s speakers, which happened to be at full volume. It was LOUD enough to be heard throughout the house.
My partner’s only reaction was to say “great” in a low voice :)))
What I usually test with is getting them to build a full scalable SaaS application from scratch... With Antigravity, it seemed very impressive in how it handled the early code organization, but then at some point, all of a sudden, it started really getting stuck and constantly stopped producing, and I had to trigger continue or babysit it. I don't know if I could have been doing something better, but that was just my experience. It seemed impressive at first, but at least compared to Antigravity, Codex and Claude Code scale more reliably.
Just an early anecdote from trying to build that one SaaS application, though.
It sounds like an API issue more than anything. I was working with it through Cursor on a side project, and it did better than all previous models at following instructions and refactoring, and UI-wise it has some crazy skills.
What really impressed me was when I told it that I wanted a particular component’s UI cleaned up but didn’t know exactly how, and that I just wanted it to use its deep design expertise to figure it out. It came up with a UX that I would never have thought of, and that was amazing.
Another important point is that the error rate in my session yesterday was significantly lower than with any other model I’ve used.
Today I’ll see how it does when I use it at work, where we have a massive codebase with particular coding conventions. Curious how it performs there.