Hacker News | cheema33's comments

As others have pointed out, humans train on existing codebases as well. And then use that knowledge to build clean room implementations.

That’s the opposite of clean-room. The whole point of clean-room design is that you have your software written by people who have not looked into the competing, existing implementation, to prevent any claim of plagiarism.

“Typically, a clean-room design is done by having someone examine the system to be reimplemented and having this person write a specification. This specification is then reviewed by a lawyer to ensure that no copyrighted material is included. The specification is then implemented by a team with no connection to the original examiners.”


No they don't. One team meticulously documents and specs out what the original code does, and then a completely independent team, who has never seen the original source code, implements it.

Otherwise it's not clean-room, it's plagiarism.


What they don't do is read the product they're clean-rooming. That's kinda disqualifying. Impossible to know if the GCC source is in 4.6's training set but it would be kinda weird if it wasn't.

True, but the human isn't allowed to bring 1TB of compressed data pertaining to what they are "redesigning from scratch/memory" into the clean room.

In fact the idea of a "clean room" implementation is that all you have to go on is the interface spec of what you are trying to build a clean (non-copyright violating) version of - e.g. IBM PC BIOS API interface.

You can't have previously read the IBM PC BIOS source code, then claim to have created a "clean room" clone!


Not the same.

I have read nowhere near as much code (or anything) as what Claude has to read to get to where it is.

And I can write an optimizing compiler that isn't slower than GCC -O0


If that's what clean room means to you, then I know AI can definitely replace you, as even ChatGPT is better than that.

(prompt: what does a clean room implementation mean?)

From ChatGPT without login BTW!

> A clean room implementation is a way of building something (usually software) without copying or being influenced by the original implementation, so you avoid copyright or IP issues.

> The core idea is separation.

> Here’s how it usually works:

> The basic setup: two teams (or two roles).
>
> Specification team (the "dirty room"):
>
> - Looks at the original product, code, or behavior
> - Documents what it does, not how it does it
> - Produces specs, interfaces, test cases, and behavior descriptions
>
> Implementation team (the "clean room"):
>
> - Never sees the original code
> - Only reads the specs
> - Writes a brand-new implementation from scratch
>
> Because the clean team never touches the original code, their work is considered independently created, even if the behavior matches.
>
> Why people do this:
>
> - Reverse-engineering legally
> - Avoiding copyright infringement
> - Reimplementing proprietary systems
> - Creating open-source replacements
> - Building compatible software (file formats, APIs, protocols)

I really am starting to think we have achieved AGI. > Average (G)Human Intelligence

LMAO


> `curl | bash` is at least a one-time thing coming from a single source.

Is it? Are you sure?


Yes? I assume this is a rhetorical question but I don't know what rhetoric it's intended to convey.

I'm not the commenter, but you could get different results from the same curl command depending on what the server wants to give you at the time. The bash script can also make additional curl calls or set up jobs that run at other times.

I'm sure both of you understand this. I'm guessing it's just semantics.


Right. My point is that you only run it once, so there's only that one chance for a compromise. If you got lucky and talked to the right server and it gave you a good script, which is overwhelmingly probable most of the time, you're in the clear. That doesn't mean it's wise, but the danger is limited. Whereas with these agents, every piece of data they're exposed to is potentially interpreted as instructions.
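One way to keep that single fetch honest is to split the download from the execution, so the bytes you read are exactly the bytes you run. A rough sketch (the URL you pass is whatever installer you were about to pipe; harden to taste):

```shell
# Lower-risk variant of `curl | bash`: download once, inspect the exact
# bytes, then execute those same bytes.
fetch_inspect_run() {
    url="$1"
    tmp="$(mktemp)" || return 1
    # -f fails on HTTP errors so a half-served script is never executed
    curl -fsSL "$url" -o "$tmp" || { echo "download failed" >&2; return 1; }
    sha256sum "$tmp"          # compare against a published checksum, if any
    ${PAGER:-cat} "$tmp"      # actually read it before running it
    bash "$tmp"
}
```

You still only talk to the server once, but the copy you reviewed can no longer differ from the copy you executed.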

> I was an on-prem maxi (if that's a thing) for a long time. I've run clusters that cost more than $5M, but these days I am a changed man.

I have had a similar transformation. I still host non-critical services on-prem. They are exceptionally cheap to run. Everything else, I host it on Hetzner.


> Managing the PostgreSQL databases is a medium to low complexity task as I see it.

Same here. But I assume you have managed PostgreSQL in the past. I have. There are a large number of software devs who have not. For them, it is not a low-complexity task. And I can understand that.

I am a software dev for our small org and I run the servers and services we need. I use Ansible and Terraform to automate as much as I can. Recently I have added LLMs to the mix. If something goes wrong, I ask Claude to use the Ansible and Terraform skills that I created for it to find out what is going on. It is surprisingly good at this. Similarly, I use LLMs to create new services or change configuration on existing ones. I review the changes before they are applied, but this process greatly simplifies service management.


> Same here. But I assume you have managed PostgreSQL in the past. I have. There are a large number of software devs who have not. For them, it is not a low-complexity task. And I can understand that.

I'd say needing to read the documentation for the first time is what bumps it up from low complexity to medium. And then at medium you should still do it if there's a significant cost difference.


You can ask an AI or StackOverflow or whatever for the correct way to start a standby replica, though I think the PostgreSQL documentation is very good.
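For reference, the modern (v12+) streaming-standby setup described in those docs is small enough to quote; `pg_basebackup -R` generates the equivalent of this for you (hostname and user below are placeholders):

```
# postgresql.auto.conf on the standby (plus an empty standby.signal file
# in the data directory, which makes the server start in standby mode)
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
```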

But if you were in my team I'd expect you to have read at least some of the documentation for any service you provision (self-hosted or cloud) and be able to explain how it is configured, and to document any caveats, surprises or special concerns and where our setup differs / will differ from the documented default. That could be comments in a provisioning file, or in the internal wiki.

That probably increases our baseline complexity since "I pressed a button on AWS YOLO" isn't accepted. I think it increases our reliability and reduces our overall complexity by avoiding a proliferation of services.


But is there a significant cost difference? I'm skeptical.

I hope this is the correct service, "Amazon RDS for PostgreSQL"? [1]

The main pair of PostgreSQL servers we have at work each have two 32-core (64-vthread) CPUs, so I think that's 128 vCPU each in AWS terms. They also have 768GiB RAM. This is more than we need, and you'll see why at the end, but I'll be generous and leave this as the default the calculator suggests, which is db.m5.12xlarge with 48 vCPU and 192GiB RAM.

That would cost $6559/month, or less if reserved which I assume makes sense in this case — $106400 for 3 years.

Each server has 2TB of RAID disk, of which currently 1TB is used for database data.

That is an additional $245/month.

"CloudWatch Database Insights" looks to be more detailed than the monitoring tool we have, so I will exclude that ($438/month) and exclude the auto-failover proxy as ours is a manual failover.

With the 3-year upfront cost this is $115000, or $192000 for 5 years.

Alternatively, buying two of yesterday's [2] list-price [3] Dell servers which I think are close enough is $40k with five years warranty (including next-business-day replacement parts as necessary).

That leaves $150000 for hosting, which as you can see from [4] won't come anywhere close.
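The totals above are easy to sanity-check, assuming the 3-year reserved rate simply extends pro rata to 5 years (which matches the figures quoted):

```shell
# Back-of-envelope check of the RDS-vs-buy numbers, using the quoted figures
reserved_3yr=106400      # db.m5.12xlarge, 3-year all-upfront reserved
storage_monthly=245      # 2 TB of storage, per month
servers=40000            # two Dell servers with 5-year warranty, list price

three_year=$(( reserved_3yr + storage_monthly * 36 ))
five_year=$(( reserved_3yr * 60 / 36 + storage_monthly * 60 ))

echo "3-year RDS total: \$${three_year}"              # rounds to the \$115000 above
echo "5-year RDS total: \$${five_year}"               # rounds to the \$192000 above
echo "left for hosting: \$$(( five_year - servers ))" # the ~\$150000 above
```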

We overprovision the DB server so it has the same CPU and RAM as our processing cluster nodes — that means we can swap things around in some emergency as we can easily handle one fewer cluster node, though this has never been necessary.

When the servers are out of warranty, depending on your business, you may be able to continue using them for a non-prod environment. Significant failures are still very unusual, but minor failures (HDDs) are more common and something we need to know how to handle anyway.

[1] https://calculator.aws/#/createCalculator/RDSPostgreSQL

[2] https://news.ycombinator.com/item?id=46899042

[3] There are significant discounts if you order regularly, buy multiple servers, or can time purchases when e.g. RAM is cheaper.

[4] https://www.voxility.com/colocation/prices


For what it's worth, I have also managed my own databases, but that's exactly why I don't think it's a good use of my time. Because it does take time! And managed database options are abundant, inexpensive, and perform well. So I just don't really see the appeal of putting time into this.

If you have a database, you still have work to do - optimizing, understanding indexes, etc. Managed services don't solve these problems for you magically and once you do them, just running the db itself isn't such a big deal and it's probably easier to tune for what you want to do.

Absolutely yes. But you have to do this either way. So it's just purely additive work to run the infrastructure as well.

I think if it were true that the tuning is easier if you run the infrastructure yourself, then this would be a good point. But in my experience, this isn't the case for a couple reasons. First of all, the majority of tuning wins (indexes, etc.) are not on the infrastructure side, so it's not a big win to run it yourself. But then also, the professionals working at a managed DB vendor are better at doing the kind of tuning that is useful on the infra side.


Maybe, but you're paying through the nose continually for something you could learn to do once, or that someone on your team could easily learn with a little time and practice. If this is a tradeoff you want to make, that's fine, but at some point that extra 10% of learning can halve your hosting costs, so it's well worth it.

It's not the learning, it's the ongoing commitment of time and energy into something that is not a differentiator for the business (unless it is actually a database hosting business).

I can see how the cost savings could justify that, but I think it makes sense to bias toward avoiding investing in things that are not related to the core competency of the business.


How do you manage availability zones in your fully self managed setup?

I have a similar system from Hetzner. I pay around $100 for it. No bandwidth cap.

I have set up encrypted backups to go to my backup server in the office. We have gigabit service at the office. Critical data changes are backed up every hour, and a full backup runs once a day.


Yeah -- I know I could probably get a better deal. I pay more for premium support ($200), as well as a North American location. Plus probably an additional premium for not wanting to go through the effort of switching servers.

> AWS has people working 24x7 so that if something fails someone is there to take action..

The number of things that these 24x7 people from AWS will cover for you is small. If your application craps out for any number of reasons that doesn't have anything to do with AWS, that is on you. If your app needs to run 24x7 and it is critical, then you need your own 24x7 person anyway.


All the hardware and network issues are on them. I agree that you still need your own people to support your applications, but that is only part of the problem.

I've got thousands of devices over hundreds of sites in dozens of countries. The number of hardware failures is tiny, and certainly doesn't need a 24/7 response.

Meanwhile AWS breaks once or twice a year.


I have been using Netbird for my small company of 10 people for about 2 years. Users on slow connections complained that they could not stay connected to services reliably. I could not reproduce the problem, as I mostly connected from very fast connections. I thought that maybe the users or their ISPs were to blame. Then one time I was using the wifi on a plane. It was a slow connection and I was connected to an RDP server. I could not stay connected. I also had Cloudflare VPN connected to the same server. It worked really well over the same connection. I went back and forth many times, as I had trouble believing how bad the Netbird connection was. Long story short, we are now completely switching over to Cloudflare VPN. It is free for the first 50 users and is very, very reliable, in our experience.

This is how you do things if you are new to this game.

Get two other, different, LLMs to thoroughly review the code. If you don’t have an automated way to do all of this, you will struggle and eventually put yourself out of a job.

If you do use this approach, you will get code that is better than what most software devs put out. And that gives you a good base to work with if you need to add polish to it.


I actually have used other LLMs to review the code in the past (not today, but in the past). It's fine, but it doesn't tend to catch things like "this technically works, but it's loading a footgun." For example, in the Redux test I mentioned in my original post, the tests were reusing a single global store variable. It technically worked, the tests ran, and since these were the first tests I introduced in the code base there weren't any issues, even though this made the tests non-deterministic... but it was a pattern that was easily going to break down the line.

To me, the solution isn't "more AI", it's "how do I use AI in a way that doesn't screw me over a few weeks/months down the line", and for me that's by making sure I understand the code it generated and trim out the things that are bad/excessive. If it's generating things I don't understand, then I need to understand them, because I have to debug it at some point.

Also, in this case it was just some unit tests, so who cares, but if this was a service that was publicly exposed on the web? I would definitely want to make sure I had a human in the loop for anything security related, and I would ABSOLUTELY want to make sure I understood it if it were handling user data.


How long ago was this? A review with the latest models should absolutely catch the issue you describe, in my experience.

Ah, the "it works on my machine" edition of LLMs.

December. My previous job had Cursor and Copilot automatically reviewing PRs.

The quality of generated code does not matter. The problem is when it breaks at 2 AM and you're burning thousands of dollars every minute. You don't own the code that you don't understand, but unfortunately that doesn't mean you don't own the responsibility. Good luck writing the postmortem; your boss will have lots of questions for you.

Frequently the boss is encouraging use of AI for efficiency without understanding the implications.

And we'll just have the AI write the postmortem, so no big deal there. ;)


AI can help you understand code faster than without AI. It allows me to investigate problems that I have little context in and be able to write fixes effectively.

> you will struggle and eventually put yourself out of a job.

We can have a discussion without the stakes being so high.


> If you do use this approach, you will get code that is better than what most software devs put out. And that gives you a good base to work with if you need to add polish to it.

If you do use this approach, you'll find that it will descend into a recursive madness. Due to the way these models are trained, they are never going to look at the output of two other models and go "Yeah, this is fine as it is; don't change a thing".

Before you know it you're going to have change amplification, where a tiny change by one model triggers other models (or even itself) to make other changes, which trigger further changes, and so on ad nauseam.

The easy part is getting the models to spit out working code. The hard part is getting it to stop.


I'm sick and tired of these empty posts.

SHOW AN EXAMPLE OF YOU ACTUALLY DOING WHAT YOU SAY!


There's no example because OP has never done this, and never will. People lie on the internet.

I've never done this because I haven't felt compelled to; I want to review my own code. But I imagine this works okay and isn't hard to set up by asking Claude to set it up for you...

What? People do this all the time. Sometimes manually, by invoking another agent with a different model and asking it to review the changes against the original spec. I just set up some reviewer/verifier subagents in Cursor that I can invoke with a slash command. I use Opus 4.5 as my daily driver, but I have reviewer subagents running Gemini 3 Pro and GPT-5.2-codex, and they each review the plan as well, and then the final implementation against the plan. Both sometimes identify issues, and Opus then integrates that feedback.

It’s not perfect so I still review the code myself, but it helps decrease the number of defects I have to then have the AI correct.


The setup is much simpler than you might think. I have 4 CLI tools I use for this setup. Claude Code, Codex, Copilot and Cursor CLI. I asked Claude Code to create a code reviewer "skill" that uses the other 3 CLI tools to review changes in detail and provide feedback. I then ask Claude Code to use this skill to review any changes in code or even review plan documents. It is very very effective. Is it perfect? No. Nothing is. But, as I stated before, this produces results that are better than what an average developer sends in for PR review. Far far better in my own experience.
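The glue for a setup like this can be tiny. A hypothetical sketch: the reviewer commands below are placeholders for whatever non-interactive invocation your CLI tools actually support (e.g. a print/exec mode), each reading the same diff:

```shell
# Fan one diff out to several reviewer commands and collect their feedback.
# The reviewer commands are placeholders; substitute real CLI invocations.
review_with() {
    diff_file="$1"; shift
    for reviewer in "$@"; do
        echo "=== review from: $reviewer ==="
        "$reviewer" < "$diff_file"      # each reviewer sees the same change
    done
}

# e.g. (hypothetical reviewer wrappers):
#   git diff main > change.diff
#   review_with change.diff my-codex-review my-copilot-review my-cursor-review
```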

In addition to that, we do use CodeRabbit plugin on GitHub to perform a 4th code review. And we tell all of our agents to not get into gold-plating mode.

You can choose not to use modern tools like these to write software. You can also choose to write software in binary.


these two posts (the parent and then the OP) seem equally empty?

by level of compute spend, it might look like:

- ask an LLM in the same query/thread to write code AND tests (not good)

- ask the LLM in different threads (meh)

- ask the LLM in a separate thread to critique said tests (too brittle, against testing guidelines, testing implementation rather than behavior, etc.). Fix those. (decent)

- ask the LLM to spawn multiple agents to review the code and tests. Fix those. Spawn agents to critique again. Fix again.

- Do the same as above, but spawn agents from different families (so Claude calls Gemini and Codex).

---

these are usually set up as /slash commands like /tests or /review so you aren’t doing this manually. since this can take some time, people might work on multiple features at once.


I don't get it. People complain when they have to go to the office. And then some are given the option to work from home. Then they complain their boss can find out where they are during work hours. What on Earth are you complaining about?

Just go to the damn office already!!!


It's about trust and empowerment.

It's about hiring adults, respecting and trusting them to do the job and support the team, and be responsible for their methods. The details are not important to that goal.

If an employer instead treats people like toddlers needing supervision, spoon feeding, and metrics around methods, not work, they will get only that.


It's pretty amazing to see the bubble many people here seem to work in. A guess, but probably 90% of employees have to go to work. Either they physically cannot do their job remotely or the employer demands that they be present.

A lot of people are coming across as whiny children here, "Oh no I might have to go to the office for my 6-figure paycheck." Grow up and go to work, as George Carlin might say.


"Oh you're doing work? That's so cute... we're gonna close whatever apps you had open, because we're updating now. We own your computer. You had unsaved work? Too bad, it's gone, get bent."

This, a thousand times. I hate, hate, hate this "feature". My Macs don't do that. My Linux systems don't do that. The whole "screw you, we don't care" attitude of Microsoft is quite appalling.

Microsoft now makes it very difficult to disable this feature. After a few registry edits, I thought I had put a stop to the madness. But then it went back to rebooting on its own.


I keep telling Windows 10 to delay these updates by 1 week each time...

Curiously, Office apps have auto-save, and so do IntelliJ, VS Code, and even Notepad nowadays. "Restore my work environment after a reboot" almost works, but enough things disappear (e.g., unsaved web form input) that it's aggravating. I wonder if they'll make it mandatory for apps to persist more across restarts.

OK, but that requires them to be competent, and if they were competent we wouldn't even have what we have now.


"Requiring apps to do X" means blocking every app written before you instituted the requirement, bricking everyone's workflows, so that would be the least competent thing imaginable.
