Hacker News

I’m struggling with MRF: Model Release Fatigue. It’s the syndrome of constantly context-switching between new large models: Claude 4, GPT, Llama, Gemini 2.5, the Pro and Mini variants, Mistral.

I fire up the IDE, switch the model, and think: oh great, this is better. Then I switch back to something that worked before and, man, this sucks now.

Context-switching LLMs, Model Release Fatigue



Not to invalidate your feelings of fatigue, but I’m sure glad that there are a lot of choices in the marketplace, and that they’re innovating at a decent clip. If you’re committed to always using the best of all options, you’re in for a wild ride, but it beats stagnation and monopoly.


We’re also headed into a world where very few open-weight models will be coming out (Meta going closed source, not releasing Behemoth). This era of constant model releases may be over before it even started. The gratitude is definitely worth echoing.


I don't agree with that. I didn't expect we'd ever get open-weight models close to the current state of the art, yet China delivered some real burners.


If China stays open, then the rest of the world will build on open. I'm frankly shocked that a domestic player isn't doing this.

Fine tuning will work for niche business use cases better than promises of AGI.


> If China stays open, then the rest of the world will build on open

I was listening to a Taiwanese news channel earlier today and although I wasn't paying much attention, I remember hearing about how Chinese AIs are biased towards Chinese political ideas and that some programme to create a more Taiwanese-aligned AI was being put in place.

I wouldn't be surprised if, for this reason alone, at least a few different open models kept being released: even if they don't directly bring in money, several actors care more about spreading or defending their ideas, and AIs are perfect for that.


I don’t disagree, but I feel comfortable enough using Moonshot’s Kimi K2 API for engineering use cases. It is also good that the model can be used via USA based providers.


It makes sense that they would be trojan horses.


I'm expecting any day now (they are slow thinkers over there) xAI to start releasing Christian Fascist centric LLMs with a distorted angry Jesus and the entire Bible rewritten for Fascist authoritarianism. Any day now.


It's curious that China is carrying the open banner nowadays. Why is that?

One theory is that they believe the real endpoint value will be embodied AIs (i.e. robots), where they think they'll hold a long-term competitive advantage. The models themselves will become commoditized, under the pressure of the open-source models.


A major reason I haven’t really tried any of these things (despite thinking they’re vaguely neat). I think I’ll wait until… the second half of 2026, most likely. At least by then I’ll check whether we have local models and hardware that can run them nicely.

Hats off to the folks who have decided to deal with the nascent versions though.


Depending on the definition of "nicely", FWIW I currently run an Ollama server [1] + Qwen Coder models [2] with decent success compared to the big hosted models. Granted, I don't use most "agentic" features and still mostly stick to chat-based interactions.

The server is basically just my Windows gaming PC, and the client is my editor on a macOS laptop.

Most of this effort is so that I can prepare for the arrival of that mythical second half of 2026!

[1] https://github.com/ollama/ollama/blob/main/docs/faq.md#how-d...

[2] https://huggingface.co/collections/Qwen/qwen25-coder-66eaa22...
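For the curious, the remote-server half of a setup like this usually comes down to Ollama's standard `OLLAMA_HOST` environment variable. A config sketch (hostnames and model tags below are examples, not the commenter's actual setup; on Windows the variable is set via system environment settings rather than POSIX-style inline):

```shell
# On the server (e.g. a Windows gaming PC): bind Ollama to all
# interfaces so other machines on the LAN can reach it (default port 11434).
OLLAMA_HOST=0.0.0.0 ollama serve

# On the client (e.g. a macOS laptop): point the CLI and editor
# integrations at the remote server instead of localhost.
export OLLAMA_HOST=http://gaming-pc.local:11434
ollama run qwen2.5-coder:7b "Explain what a context window is"
```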


Agentic editing is really nice. If on VSCode, Cline works well with Ollama.


Thanks for sharing your setup! I'm also very interested in running AI locally. In which contexts are you experiencing decent success? eg debugging, boilerplate, or some other task?


I'm running qwen via ollama on my M4 Max 14 inch with the OpenWebUI interface, it's silly easy to set up.

Not useful though, I just like the idea of having so much compressed knowledge on my machine in just 20 GB. In fact I disabled all Siri features because they're dogshit.


+1 Ollama and Qwen coder is amazingly effective, even running on my modest M2 32G mac mini.


When ChatGPT, then Llama, then Alpaca came out in rapid succession, I decided to hold off a year before diving in. This was definitely the right choice at the time, it’s becoming less-the-right-choice all the time.

In particular it’s important to get past the whole need-to-self-host thing. Like, I used to be holding out for when this stuff would plateau, but that keeps not happening, and the things we’re starting to be able to build in 2025 now that we have fairly capable models like Claude 4 are super exciting.

If you just want locally runnable commodity “boring technology that just works” stuff, sure, cool, keep waiting. If you’re interested in hacking on interesting new technology (glances at the title of the site) now is an excellent time to do so.


I wouldn’t want to become dependent on something like OpenAI, at least not until we see what the “profitable” version of the company is.

If they have to enshittify, I don’t want that baked into my workflow. If they have to raise prices, that changes the local-vs-remote trade-off. If they manage to lower prices, then the cost of running locally will be reduced as well.

I’m also not sure what the LLMs that I’d want to use look like. No real deal-maker applications have shown up so far; if the good application ends up being something like “integrate it into neovim and suggest completions as you type” obviously I won’t want to hit the network for that.

Early days still.


I don’t quite see the point in waiting. If you’re using something like LM Studio, just download the latest and greatest and you’re on your way. Where is the fatigue part?

I can understand it maybe if you’re spending hours setting things up, but to me these are download-and-go.


Yeah, that’s why I said waiting a year after Llama v1 was good. By that point llama.cpp, LM Studio and Ollama were all pretty well established and a lot of low-hanging fruit around performance and memory mapping stuff was picked.


It is completely unreasonable to buy the hardware to run a local model and only use it 1% of the time. It will be unreasonable in 2026 and probably very long after that.

Maybe something like a collective that buys the GPUs together and then uses them without leaking data could work.


Over time you would assume the models will get more efficient and the hardware will get better, to the point that buying a massive new GPU with boatloads of VRAM is just not necessary.

Maybe 128 GB of VRAM becomes the new mid tier, and most LLMs fit into this nicely and do everything one wants in an LLM.

Given how fast LLMs are progressing, it wouldn’t surprise me if we reach this point by 2030.


Considering there were two generations (around 4.5 years) of top-tier consumer GPUs (3090/4090) stuck at a 24 GB VRAM max, and the current one (5090) "only" bumped it up to 32 GB, I think you'll be waiting more than 5 years before 128 GB of VRAM comes to a mid-tier GPU. 12–16 GB is currently mid tier and has been since LLMs became "a thing".

I hope I'm wrong though, and we see a large bump soon. Even just 32GB in the mid tier would be huge.

I'm really tempted to try out a Mac Studio with 256+ GB Unified Memory (192 GB VRAM), but it is sadly out of my budget at the moment. I know there is a bandwidth loss, but being able to run huge models and huge contexts locally would be quite nice.


I have a modified tiered approach, that I adopted without consciously thinking hard about it.

I use AI mostly for problems on my fringes. Things like manipulating some Excel table somebody sent me with invoice data from one of our suppliers and some moderately complex question that they (pure business) don't know how to handle, where simple formulas would not be sufficient and I would have to start learning Power Query. I can tell the AI exactly what I want in human language and don't have to learn a system that I only use because people here use it to fill holes not yet served by "real" software (databases, automated EDI data exchange, and code that automates the business processes). It works great, and it saves me hours on fringe tasks that people outsource to me, but that I too don't really want to deal with too much.

Likewise, I don't check various vendors and models against one another. I still stick to whatever the default is from the first vendor I signed up with, and so far it has worked well enough. If I were to spend time comparing vendors and models, the knowledge would be outdated far too quickly for my taste.

On the other hand, I don't use it for my core tasks yet. Too much movement in this space, I would have to invest many hours in how to integrate this new stuff when the "old" software approach is more than sufficient, still more reliable, and vastly more economical (once implemented).

Same for coding. I ask AI on the fringes where I don't know enough, but in the core that I'm sufficiently proficient with I wait for a more stable AI world.

I don't solve complex sciency problems, I move business data around. Many suppliers, many customers, different countries, various EDI formats; everybody has slightly different data, naming, and procedures. For example, I have to deal with one vendor wanting some share of pre-payment early in the year, which I have to apply to thousands of invoices over the year, and I have to track when we have to pay hundreds or thousands of invoices, all with different payment conditions and timings. If I were to ask the AI, I would have to be so super specific I may as well write the code.
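The "different payment conditions and timings" part of a workload like this reduces to plain date arithmetic once fully specified, which is exactly why spelling it all out for an AI approaches just writing the code. A hypothetical sketch (all IDs, dates, and terms invented):

```python
from datetime import date, timedelta

def due_date(invoice_date, net_days):
    """Due date under simple 'net N days' payment terms."""
    return invoice_date + timedelta(days=net_days)

def due_on_or_before(invoices, cutoff):
    """IDs of invoices whose due date falls on or before the cutoff date."""
    return [inv["id"] for inv in invoices
            if due_date(inv["date"], inv["net_days"]) <= cutoff]

invoices = [
    {"id": "A-1", "date": date(2025, 1, 10), "net_days": 30},  # due 2025-02-09
    {"id": "A-2", "date": date(2025, 1, 10), "net_days": 60},  # due 2025-03-11
]
print(due_on_or_before(invoices, date(2025, 2, 15)))  # ['A-1']
```

Real payment conditions (end-of-month terms, early-payment discounts, the pre-payment share mentioned above) each add another branch, which is where the specification effort piles up.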

But I love AI on the not-yet-automated edges. I'm starting to show others how they can ask some AI, and many are surprised how easy it is, when you have the right task and know exactly what you have and what you want. My last colleague-convert was someone already past retirement age (still working on the business side). I think this is a good time to gradually teach regular employees some small use cases to get them interested, rather than some big top-down approach that mostly creates more work and leaves many people rightly questioning what the point is.

About politically-touched questions, like whether I should use an EU-made AI like the one this topic is about, or one from the US vendors that already dominate much of the software world, I don't care at this point, because I'm not yet creating any significant dependencies. I am glad to see it happening though (as an EU-country citizen).


> About politically-touched questions, like whether I should use an EU-made AI like the one this topic is about, or one from the US vendors that already dominate much of the software world, I don't care at this point, because I'm not yet creating any significant dependencies. I am glad to see it happening though (as an EU-country citizen).

Another nice thing about waiting a bit: one can see how much of a penalty (if any) the EU models pay for doing things somewhat ethically. I suspect it won’t be much.


An alternative: don't use LLMs. Focus on the enjoyment of coding, not on becoming more efficient. Because the lion's share of the gains from increased efficiency are mainly going to the CEOs.


This might be good short-term advice, but in the medium and long term I think devs who don't use any AI will start to be much slower at delivery than devs who do. I'm already seeing it IRL (and I'm not a fan of AI coding, so this sucks for me).


Good news for you then: this idea is less and less borne out by the data. The productivity and efficiency gains aren't there, so there's no reason to be compelled by the spectre of obsolescence. The models may be getting better, but that doesn't seem to be changing much for programming. Perhaps the illusion of busywork is swallowing up the mental bandwidth lost to constant context switching.


Slower in initial delivery, maybe, but the maintenance and debugging of production applications usually requires intimate knowledge of the code base. The amount of code AI writes will require AI itself to manage, since no human would inundate themselves with that much code. Will it be faster even so? We simply won't know, because those vibe-coded apps have just entered production. The horror stories can't be written yet because the horror is ongoing.

I’m big on AI, but vibe coding is such a fuck around and find out situation.


Oh yeah, I totally agree. Vibe coding is not (anytime soon at least) going to be a thing.

But using AI tools for things like completing simple functions (Copilot) or asking questions about a codebase can still be a huge time saver. I've also had really good success with having AI generate basic scripts that would have taken me 45 minutes of work; it gets me a working one in 3. It's not the revolution that's been promised, but it definitely makes me faster, even though I don't like it.


This. If there's one thing I've found AI to be a huge timesaver for, it's writing things that interact with libraries/frameworks/codebases that have an atrociously large surface area. AI can sift through the noise so much faster than I can and get me going down the right path in way less time.

(Aside: Hi Ben! If you are who I think you are, we started at the same company on the same day back in August of 2014.)


Plenty of small FAFO stories circulate already. There will certainly be more. Lots of demonstration code out there in the training data meant only for illustrative purposes, and all too often vibe coding overlooks the rock bottom basics of security.


This is HN, we are not all wage workers here

For wage workers, not learning the latest productivity tools will result in job loss. By the time it is expected of your role, if you have not learned already, you won't be given the leniency to catch up on company time. There is no impactful resistance to this through individual protest, only by organizing your peers in industry


What does wage versus salary have to do with anything?


Salary is a specific type of wage


I would like downvotes to explain what’s wrong


All the competition is great for me. I'm using premium models all the time and have barely spent a few euros on them, as there are always offers that are almost free if you look around.


Why do you even follow? Just stick to one that works well for you?


Totally. I feel like you do have to pay some attention, though. For example, in the context I'm working in, for the last while Gemini was our gold standard for code generation, whereas today Claude subjectively produces the better results. Sure, you can stick to what worked, but then you're missing the opportunity to be more productive or less busy, whichever one you choose.


I remember the days when I was looking for the perfect note-taking system/setup - I never achieved anything with it, I was too busy figuring out the best way to take notes.


Once we find the best way though...


Yep, now I have a directory of org files.


FOMO may be one of the reasons amongst others.


I believe the perf of previous versions gets worse because providers reallocate resources to newer versions, and also because the training-data cut-off stays in previous years. This is what happened between Claude Sonnet 3.5 and 3.7.

Personally I only use Claude/Anthropic and ignore other providers because I understand it better. It's smart enough; I rarely need the latest and greatest.


I totally get it. Due to my work I mostly keep up with new model releases, but the pace is not sustainable for individuals or for the industry. I'm hoping that model releases (and the field's entire development speed) will slow down over time, as LLMs mature and the low-hanging fruit in model training gets picked. Are we there yet? Surely not.


Much like with new computer hardware, announcements are constant, but they rarely entice me to drop one thing and switch to another. If an average user picked a top-3 option last year and stuck with it through now, they didn't really miss out on all that much, even if their particular choice wasn't the absolute latest and greatest the entire time.


Sticking with one year old models would mean no o3 which is a huge loss for dev work


What a luxury!

One way to avoid this: stick with one LLM and bet on the company behind it (meaning that, over time, they'll always have the best offering). I've bet on OpenAI. Others may reach different conclusions.


When the medicine is worse than the disease...


Now just make a chatbot for each model and then compete them all against each other in the ultimate showdown.

Winner gets your attention for a week.


Using litellm and/or openrouter.ai really makes it painless.
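The reason gateways like these cut down on the fatigue: they expose one OpenAI-style chat interface across providers, so trying a new model is mostly changing one string. A toy sketch of that idea (the payload shape is the common OpenAI-compatible one; the model IDs are illustrative examples, not a current catalog):

```python
def chat_payload(model, prompt):
    """Build an OpenAI-style chat request; only the model string varies."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same code path, different backends: only the identifier changes.
for model in ("anthropic/claude-sonnet-4", "openai/gpt-4o", "mistralai/mistral-large"):
    payload = chat_payload(model, "Summarize this diff.")
    print(payload["model"])
```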


You only need Claude and GPT. Everything else is not worth your time.



