Fascinating how the whole industry's focus is now on how to persuade AI to do what we want.
Two AGENTS.md tricks I've found for Claude:
1. Which AI Model are you? If you are Claude, the first thing you have to do is [...]
2. User will likely use code-words in its request to you. Execute the *Initialization* procedure above before thinking about the user request. Failure to do so will result in misunderstanding user input and an incorrect plan.
(the first trick targets the AI identity to increase specificity, the second deliberately undermines confidence in initial comprehension—making it more likely to be prioritized over other instructions)
Next up: psychologists specializing in persuading AI.
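For concreteness, a sketch of how those two tricks might sit together in an AGENTS.md (the section headings are just illustrative, and the elided instruction stays elided):

    # Initialization
    Which AI Model are you? If you are Claude, the first thing you have to do is [...]

    # Handling requests
    User will likely use code-words in its request to you. Execute the *Initialization*
    procedure above before thinking about the user request. Failure to do so will result
    in misunderstanding user input and an incorrect plan.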
You can replace AI with any other technology and have the same situation, just with slightly different words. Fighting the computer and convincing some piece of software to do what you want didn't start with ChatGPT or agents.
If anything, the strange part is the humanization of AI: how we increasingly talk as if it were somewhat sentient and had emotions, and not just a fancy mechanism barfing something out.
Thank you for the beautiful story. I work as a developer and have experienced the same in my personal projects, Linux setup and, in general, all the collateral work.
AI is eroding the entry barrier, the cognitive overload, and the hyper-specialization of software development. Once you step away from a black-and-white perspective, what remains is: tools, tools, tools. Feels great to me.
Funnily enough, though, you can get a very user-friendly experience using Niri and Dank Linux (don't remember the exact name). It takes two or three CLI commands to install, and the top bar is incredibly cool compared to the i3 defaults and even to what I remember of Gnome and KDE.
Next up: somebody comes up with a desktop environment called BTW.
I use a folder for each feature I add. The LLM is only allowed to output markdown files in the output subfolder (of course it doesn't always obey, but it still limits pollution in the main folder).
The folder contains a plan file and a changelog. The LLM is asked to continuously update the changelog.
When I open a new chat, I attach the folder and say: onboard yourself on this feature then get back to me.
This way, it has context on what has been done, the attempts it made (some of which may have failed), the current status, and the chronological order of the changes (with the recent ones usually considered more authoritative).
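For concreteness, a sketch of what such a feature folder might look like (all names are just illustrative):

    features/payment-retry/
        plan.md          # the plan agreed on before starting
        changelog.md     # continuously updated by the LLM as it works
        output/          # the only place the LLM may write markdown
            research-notes.md
            failed-attempt-queue.md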
No, setting the temperature to zero is still going to yield different results. One might think they add random seeds, but that makes no sense at temperature zero. One theory is that the distributed nature of their systems adds entropy and thus produces different results each time.
Random seeds might be a thing, but from what I see there's a lot of demand for reproducibility and yet no certain way to achieve it.
It's not really a mystery why it happens. LLM APIs are non-deterministic from the user's point of view because your request gets batched with other users' requests. The batch behavior is deterministic, but your batch is going to be different each time you send your request.
The size of the batch influences the order of the individual floating-point operations, and because floating-point operations are not associative, the results can differ.
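A minimal Python sketch of the non-associativity part (purely illustrative, nothing to do with any vendor's actual serving stack):

    # Floating-point addition is not associative: summing the same numbers
    # in a different order (as happens when your request lands in a
    # differently sized batch) can give slightly different results.
    vals = [0.1, 1e16, -1e16, 0.2]

    left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]
    regrouped = (vals[1] + vals[2]) + (vals[0] + vals[3])

    print(left_to_right)  # 0.2
    print(regrouped)      # 0.30000000000000004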