I just tried Grok 4 and it's insanely good. I was able to generate 1,000 lines o...

sudo-i · 2025-07-10T17:04:56 1752167096

The problem is that the code is excellent as a one-off, but as a maintainable piece of code that needs to live in source control, be shared across teams, follow a standard SDLC, be immutable, and track changes in some state - it's just not there.

If an intern handed me code like this to deploy an EC2 instance in production, I would need to have a long discussion about their decisions.

mellosouls · 2025-07-10T17:21:21 1752168081

How do you know without seeing the code?

How do you know the criteria you mention haven't been (or can't be) factored into any prompt and context tuning?

How do you know that all the criteria that were important in the pre-LLM world still have the same priority as LLM capabilities increase?

sudo-i · 2025-07-10T17:57:38 1752170258

Anyone using Java for IaC and Configuration Management in 2025 needs to reconsider their career decisions.

tptacek · 2025-07-10T18:46:20 1752173180

What does this have to do with anything? The Java constraint was supplied by a user, not the model.

underdeserver · 2025-07-10T21:46:11 1752183971

Why? Modern Java - certainly since Java 8 - is pretty decent.

sim7c00 · 2025-07-10T17:38:34 1752169114

[flagged]

zo1 · 2025-07-10T17:47:35 1752169655

I find this comment very ironic in the context of this thread. Let's agree to disagree.

handfuloflight · 2025-07-10T18:13:22 1752171202

There's a chunk of the programming population who label everything they themselves didn't write as junk.

nlarew · 2025-07-10T17:18:13 1752167893

How do you know? Have you seen the code GP generated?

JohnMakin · 2025-07-10T18:02:46 1752170566

No, have you? The code always seems to be missing from these types of posts. Personally I am skeptical, as AI has been abysmal at one-shot provisioning of actual quality cloud infrastructure. I wish it could, because it would make my life a lot less annoying. Unfortunately I have yet to really see it.

tptacek · 2025-07-10T18:47:32 1752173252

No, they're not. People talk about LLM-generated code the same way they talk about any code they're responsible for producing; it's not in fact the norm for any discussion about code here to include links to the code.

But if you're looking for success stories with code, they're easy to find.

https://alexgaynor.net/2025/jun/20/serialize-some-der/

albedoa · 2025-07-10T19:33:40 1752176020

> it's not in fact the norm for any discussion about code here to include links to the code.

I certainly didn't interpret "these types of posts" to mean "any discussion about code", and I highly doubt anyone else did.

The top-level comment is making a significant claim, not a casual remark about code they produced. We should expect it to be presented with substantiating artifacts.

tptacek · 2025-07-10T19:35:25 1752176125

I guess. I kind of side-eyed the original one-shotting claim, not because I don't believe it, but because I don't believe it matters. Serious LLM-driven code generation runs in an iterative process. I'm not sure why first-output quality matters that much; I care about the outcome, not the intermediate steps.

So if we're looking for stories about LLMs one-shotting high-quality code, accompanied by the generated code, I'm less sure of where those examples would be!

JohnMakin · 2025-07-10T19:53:34 1752177214

I could write a blog post exactly like this with my ChatGPT history handy. That wasn't the point I was making. I am extremely skeptical of any claim that someone can one-shot quality cloud infrastructure when I can't see what they produced. I'd even take away the one-shot requirement - unless the person behind the prompt knows what they're doing, pretty much every example I've seen has been terrible.

tptacek · 2025-07-10T19:58:22 1752177502

I mean, I agree with you that the person behind the prompt needs to know what they're doing! And I don't care about 1-shotting, as I said in a sibling comment, so if that's all this is about, I yield my time. :)

There are just other comments on this thread that take as axiomatic that LLM-generated code is bad. That's obviously not true as a rule.

sudo-i · 2025-07-10T17:53:39 1752170019

How do you know?

kvirani · 2025-07-10T17:19:01 1752167941

But isn't that just a few refactoring prompts away?

sudo-i · 2025-07-10T17:51:35 1752169895

nashadelic · 2025-07-11T09:48:53 1752227333

I'd love to hear how Grok works inside agentic coders like Cursor or Copilot for production code bases.

doctoboggan · 2025-07-10T18:09:50 1752170990

Please share your result if possible. So many lines in a single shot with no errors would indeed be impressive. Does Grok run tools for these sorts of queries? (linters/sandbox execution/web search)

makestuff · 2025-07-10T18:47:21 1752173241

Out of curiosity, why do you use Java instead of TypeScript for CDK? Just to keep everything in one language?

oblio · 2025-07-10T21:16:01 1752182161

Why not, I would say? What's the advantage of using TypeScript over modern Java?