Whenever I have a model fix something new I ask it to update the markdown implementation guides I have in the docs folder in my projects. I add these files to context as needed. I have one for implementing routes and one for implementing backend tests and so on.
They then know how to do stuff in the future in my projects.
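For a flavour of what I mean, one of those guide files looks roughly like this (the file name and contents here are invented for illustration, not copied from a real project):

```markdown
# Implementing routes (docs/implementing-routes.md)

- Put each new route handler in its own file under src/routes/.
- Register the handler in src/routes/index.ts so the router picks it up.
- Every route needs a request-validation schema and at least one integration test.
```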
They still aren't learning. You're learning and then telling them to incorporate your learnings. They aren't able to remember this so you need to remind them each day.
That sounds a lot like '50 First Dates' but for programming.
Yes, this is something people using LLMs for coding probably pick up on the first day. They're not "learning" as humans do, obviously. Instead, the process is that you figure out what was missing from the first message you sent where they got something wrong, change it, and then restart from the beginning. The "learning" is you keeping track of what you need to include in the context; how exactly that process works is up to you. For some it's very automatic and you don't add/remove things yourself; for others it's keeping a text file around that they copy-paste into a chat UI.
This is what people mean when they say "you can kind of do "learning" (not literally) for LLMs"
While I hate anthropomorphizing agents, there is an important practical difference between a human with no memory, and an agent with no memory but the ability to ingest hundreds of pages of documentation nearly instantly.
The outcome is definitely not the same, and you need to remind them all the time. Even if you feed them the context automatically, they will happily "forget" it from time to time. And you need to update that automated context again, and again, and again, as the project evolves.
I believe LLMs ultimately cannot learn new ideas from their input in the same way as they can learn it from their training data, as the input data doesn't affect the weights of the neural network layers.
For example, let's say LLMs did not have any examples of chess gameplay in their training data. Would one be able to have an LLM play chess by listing the rules and examples in the context? Perhaps, to some extent, but I believe it would be much worse than if it was part of the training (which of course isn't great either).
The feeding can be automated in some cases. In GitHub Copilot you can put instructions under .github/instructions, and each instructions markdown file starts with a section that contains a pattern for which files the instructions apply to.
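Roughly like this, e.g. in a file such as .github/instructions/routes.instructions.md (the glob and wording here are placeholders; check the current Copilot docs for the exact frontmatter keys):

```markdown
---
applyTo: "src/routes/**/*.ts"
---
New routes must go through the shared error handler and return typed responses.
Run the route tests headless with `npm run test:headless` before finishing.
```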
You can also have an index file that describes when to use each file (nest with additional folders and index files as needed) and tell the agent to check the index for any relevant documentation they should read before they start. Sometimes it will forget and not consult the docs but often it will consult the relevant docs first to load just the things it needs for the task at hand.
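The index itself can stay tiny, basically a table of contents with a one-line "when to read this" per entry (file names are just examples):

```markdown
# docs/README.md (read this first)

- implementing-routes.md: read before adding or changing any HTTP route
- backend-tests.md: read before writing or touching backend tests
- frontend/: see frontend/README.md for component and styling guides
```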
I tend to think it would lead to them forming opinions about the people they interact with as they learn what it's like to interact with them, and that this would also influence their behaviour/outputs. Just imagining the day where copilot's chain of thought starts to include things like "Greg is bossy and often unkind to me in PR reviews. I need to set clear boundaries with him and discontinue the relationship if he will not respect them."
Having a good prompt file ("memory") is an art form.
The AI hype folks write massive fan fiction style novellas that don't have any impact.
But there's a middle ground where you tell the agent the specific things about your repo that it doesn't know from its training. Like if your application has a specific way to run tests headless, or it's compiled in a certain way that isn't the typical default.
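Concretely, the useful part of such a file is usually just a handful of lines like these (the commands are made up for illustration):

```markdown
## Running tests
Tests must run headless: use `npm run test:headless`. Plain `npm test` opens a browser.

## Building
The app is cross-compiled for ARM targets: use `make build-arm`, not `make build`.
```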
AGENTS.md exists; Codex and Crush support it directly. Copilot, Gemini and Claude have their own variants, and their /init commands look at AGENTS.md automatically to initialise the project.
Nobody is feeding anything "manually" to Agents. Only people who think "AI" is a web page do that.
Ah yes. Agents.md is a magical file that just appears out of thin air. No one creates it, no one keeps it updated, and LLMs always, without fail, not only consult it but never forget it, and in every new session know precisely what changed in the project and how to continue.
All of them often can't even find/read relevant docs in a new session without prompting.
Literally every single CLI-based Agent will show you a suggestion to run /init at startup.
And of course it's up to the developer to keep the documentation up to date. Just like when working with humans. Stuff doesn't magically document itself.
Yes "good code is self-documenting", but it still takes ages to find anything without docs to tell you the approximate direction.
It's literally a text file the agent can create and update itself. Not hard. Try it.
> Just like when working with humans. Stuff doesn't magically document itself.
Humans actually learn from codebases they work with. They don't start with a clean slate every time they wake up in the morning. They know where to find information and how to search for it. They don't need someone to constantly update docs to point to changes.
> but it still takes ages to find anything without docs to tell you the approximate direction.
Which humans, unsurprisingly, can do without wiping their memory every time.
Yeah, if my screwdriver undid the changes I just made to my mower, constantly ignored my desire to unscrew screws and instead punched a hole in my carb - I'd be throwing that screwdriver in the garbage.
- don't have career growth that you can feel good about having contributed to
Humans are on the verge of building machines that are smarter than we are. I feel pretty goddamned awesome about that. It's what we're supposed to be doing.
- don't have a genuine interest in accomplishment or team goals
Easy to train for, if it turns out to be necessary. I'd always assumed that a competitive drive would be necessary in order to achieve or at least simulate human-level intelligence, but things don't seem to be playing out that way.
- have no past and no future. When you change companies, they won't recognize you in the hall.
Or on the picket line.
- no ownership over results. If they make a mistake, they won't suffer.
Good deal. Less human suffering is usually worth striving for.
> Humans are on the verge of building machines that are smarter than we are.
You're not describing a system that exists. You're describing a system that might exist in some sci-fi fantasy future. You might as well be saying "there's no point learning to code because soon the rapture will come".
That particular future exists now, it's just not evenly distributed. Gemini 2.5 Pro Thinking is already as good at programming as I am. Architecture, probably not, but give it time. It's far better at math than I am, and at least as good at writing.
Computers beat us at maths decades ago, yet LLMs are not able to beat a calculator half of the time. The maths benchmarks that companies so proudly show off are still the realm of traditional symbolic solvers. You claiming much success in asking LLMs for maths makes me question whether you have actually asked an LLM about maths.
Most AI experts not heavily invested in the stocks of inflated tech companies seem to agree that current architectures cannot reach AGI. It's a sci-fi dream, but hyping it is real profitable. We can destroy ourselves plenty with the tech we already have, but it won't be a robot revolution that does it.
> The maths benchmarks that companies so proudly show off are still the realm of traditional symbolic solvers. You claiming much success in asking LLMs for maths makes me question whether you have actually asked an LLM about maths.
What I really need to ask an LLM for is a pointer to a forum that doesn't cultivate proud exhibition of ignorance, Luddism, and general stupidity at the level exhibited by commenters in this entire HN story, and in this subthread in particular.
>>Humans are on the verge of building machines that are smarter than we are. I feel pretty goddamned awesome about that. It's what we're supposed to be doing.
Have you ever spent any time around children? How about people who think they're accomplishing a great mission by releasing truly noxious ones on the world?
You just dismissed the entire notion of accountability as an unnecessary form of suffering, which is right up there with the most nihilistic ideas ever said by, idk, Dostoevsky's underground man or Raskolnikov.
> Humans are on the verge of building machines that are smarter than we are. I feel pretty goddamned awesome about that. It's what we're supposed to be doing.
It's also the premise of The Matrix. I feel pretty goddamned uneasy about that.
(Shrug) There are other sources of inspiration besides dystopic sci-fi movies. There's the Biblical story of the Tower of Babel, for instance. Better not work on language translation, which after all is how the whole LLM thing got started.
Sometimes fiction went in the wrong direction. Sometimes it didn't go far enough.
In any case, The Matrix wasn't my inspiration here, but it is a pithy way to describe the concept. It's hard to imagine how humans maintain relevancy if we really do manage to invent something smarter than us. It could be that my imagination is limited though. I've been accused of that before.
They can usually write code, but not that well. They have lots of energy and little to say about architecture and style. They don't have a well-defined body of knowledge and have no experience. Individual juniors don't change, but the members of your junior cohort regularly do.
The problem with AI Agents like Claude is that they write VERY good code and very fast.
But they don't have a grasp for the project's architecture and will reinvent the wheel for feature X even when feature Y has it or there is an internal common library that does it. This is why you need to be the "manager of agents" and stay on top of their work.
Sometimes it's just about hitting ESC and going "waitaminute, why'd you do that?" and sometimes it's about updating the project documentation (AGENTS.md, docs/) with extra information.
Example: I have a project with a system that builds "rules" using a specific interpreter. Every LLM wants to "optimise" it by using a pattern that looks correct, but will in fact break immediately when there's more than one simultaneous user - and I have a unit test that catches it.
I got tired of LLMs trying to "optimise" that bit the wrong way, so I added a specific instruction explaining why it shouldn't be attempted and noting that it has been tried and has failed multiple times. And now they've stopped doing it =)
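For what it's worth, the instruction that finally worked reads roughly like this (the names are specific to my project, so treat them as placeholders):

```markdown
## Rules interpreter: do NOT cache compiled rules in module-level state
This "optimisation" looks correct but has been tried and reverted several times.
It breaks as soon as two users evaluate rules concurrently, and
test_rules_concurrent_users will fail. Keep the per-request construction as is.
```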
I actually do find there is a subset of meetings that are far more productive on Zoom. We can be voice chatting on one screen, share another screen and both be able to type, record notes, pull up side research without interrupting the conversation. It's a bit closer to co-working than a meeting but it hits a sweetspot for me.
I used to be really excited about "agents" when I thought people were trying to build actual agents like we've been working on in the CS field for decades now.
It's clear now that "agents" in the context of "AI" is really about answering the question "How can we make users make 10x more calls to our models in a way that makes it feel like we're not just squeezing money out of them?" I've seen so many people who think setting some "agents" off on a minutes-to-hours-long task of basically just driving up internal KPIs at LLM providers is cutting-edge work.
The problem is, I haven't seen any evidence at all that spending 10x the number of API calls on an agent results in anything more useful than last year, when people were purely vibe coding all the time. At least then people would interactively learn about the slop they were building.
It's astounding to watch a coworker walk through a PR with hundreds of newly added files and repeatedly say "I'm not sure if these actually work, but it does look like there's something here".
Now I'm sure I'll get some fantastic "no true Scotsman" replies about how my coworkers must not be skilled enough or how they need to follow xyz pattern, but the entire point of AI was to remove the need for specialized skills and make everyone 10x more productive.
Not to mention that the shift in focus to "agents" is also useful for distracting from the clearly diminishing returns on foundation models. I just hope there are enough people who still remember how to code (and think, in some cases) to rebuild when this house of cards falls apart.
> but the entire point of AI was to remove the need for specialized skills and make everyone 10x more productive.
At least for programming tools, for everything (well, the vast majority) that is sold that way, since long before generative AI, it actually succeeds or fails not on whether it eliminates the need for specialized skills and makes everyone more productive, but on whether it further rewards specialized skills and makes the people who devote time to learning it more productive than if they devoted the same time to learning something else.
Yeah it's saccharine. Reminds me quite a lot of Americans who work for tips (e.g. waiters) - disconcertingly friendly.
Someone gave me a great tip though - at least for ChatGPT there's a setting where you can change its personality to "robot". I guess that affects the system prompt in some way but it basically fixes the issue.
I enjoy getting into a good flow state and pounding out clever and elegant code but watching a good LLM generate code according to my specs and refining it is also enjoyable. I've been burning through $250 of free Claude Code Web credits and having multiple workers running at the same time is fun.
I am essentially in this exact role. The junior developers simply don't have the experience to evaluate the output of the agents. You wind up with a lot of slop in PRs. People can't justify why they did something. I've seen whole PRs closed, work redone, and opened anew because they were 70% garbage. Every other comment was asking "why is this here? it has nothing to do with the ticket."
Sadly, this is not sustainable and I am not sure what I'm going to do.
They have a capacity to "learn", it's just WAY MORE INVOLVED than how humans learn.
With a human, you give them feedback or advice, and generally by the 2nd or 3rd time the same kind of thing happens they can figure it out and improve. With an LLM, you have to specifically set up a convoluted (and potentially expensive, in both money and electrical power) system in order to provide MANY MORE examples of how to improve via fine-tuning or other training actions.
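For example, with OpenAI-style chat fine-tuning you would have to assemble a JSONL file with many examples of the corrected behaviour, something like the lines below (the contents are invented, and a couple of examples like this is nowhere near enough in practice):

```jsonl
{"messages": [{"role": "user", "content": "Add a /health route"}, {"role": "assistant", "content": "Added src/routes/health.ts using the shared error handler, plus an integration test."}]}
{"messages": [{"role": "user", "content": "Run the backend tests"}, {"role": "assistant", "content": "Running them headless with `npm run test:headless`, as this repo requires."}]}
```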
> With an LLM, you have to specifically set up a convoluted (and potentially expensive, in both money and electrical power) system in order to provide MANY MORE examples of how to improve via fine-tuning or other training actions.
The only way that an AI model can "learn" is during model creation, which is then fixed. Any "instructions" or other data or "correcting" you give the model is just part of the context window.
Fine tuning is additional training on specific things for an existing model. It happens after a model already exists in order to better suit the model to specific situations or types of interactions. It is not dealing with context during inference but actually modifying the weights within the model.
Depending on your definition of "learn", you can also use something akin to ChatGPT's Memory feature. When you teach it something, just have it take notes on how to do that thing and include its notes in the system prompt for next time. Much cheaper than fine-tuning. But still obviously far less efficient and effective than human learning.
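A minimal sketch of that notes approach, assuming you drive the model through an API yourself (the file name and wording are arbitrary):

```python
import os

NOTES_PATH = "agent_notes.md"  # hypothetical file of accumulated lessons

def load_notes() -> str:
    """Return the lessons gathered so far, or an empty string on the first run."""
    if os.path.exists(NOTES_PATH):
        with open(NOTES_PATH, encoding="utf-8") as f:
            return f.read()
    return ""

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend the notes to the system prompt so the next session "remembers" them."""
    system = ("You are a coding assistant for this repo.\n\n"
              "Lessons from past sessions:\n" + load_notes())
    return [{"role": "system", "content": system},
            {"role": "user", "content": user_prompt}]

def remember(lesson: str) -> None:
    """Append a new lesson after a correction, e.g. "run tests headless"."""
    with open(NOTES_PATH, "a", encoding="utf-8") as f:
        f.write(f"- {lesson}\n")
```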
I think it's reasonable to say that different approaches to learning form some kind of spectrum, but contemporary fine-tuning isn't on that spectrum at all.
As a team lead, working with people is so... cumbersome. They need time to recharge, lots of encouragement, and a nice place to work in. Give me a coding agent any time!
They're asocial, not antisocial. Antisociality is a diagnosable disorder and is known for interpersonal behaviours that are actively against a healthy society. Asociality, on the other hand, just avoids social interactions, not actively harming the society.
Your idea about subscriptions/alerts is a good one. Right now I don't have any kind of workflow except Google Sheets, so that would be a bit tricky, but I'll bear it in mind for the future.
And yes, the US is obviously on my list of countries to service.
They knew exactly what they were saying.