It's even worse when an LLM eats documentation for multiple versions of the same library and starts hallucimixing methods from all versions at the same time. That makes it practically unusable for some libraries that recently had a big API transition between versions.
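A concrete illustration of the version-mixing problem, using pandas as a hypothetical example (not from the original comment): `DataFrame.append` existed in pandas 1.x but was removed in 2.0, so a model that has digested docs for both versions can happily suggest the old call on a new install.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
row = pd.DataFrame({"a": [3]})

# pandas 1.x style, often suggested by models trained on old docs:
# df = df.append(row, ignore_index=True)  # raises AttributeError on pandas >= 2.0

# pandas 2.x replacement, which the same model may forget to use consistently:
df = pd.concat([df, row], ignore_index=True)
print(df)
```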
> Claude is central to our commercial success, which is central to our mission.
But can an organisation remain the gatekeeper of safety, the moral steward of humanity's future, and the decider of what risks are acceptable while depending on acceleration for survival?
It seems the market is ultimately deciding what risks are acceptable for humanity here.
Key thing here. The code was already written, so rewriting it isn't exactly adding a lot of quantifiable value. If millions weren't spent in the first place, there would be no code to rewrite.
>the invasion had broad popular support at the beginning.
According to whom?
You should understand that public opinion surveys in authoritarian countries are problematic. In autocracies, people might want to hide their opinions and give socially desirable answers that conform to the official government position for fear of facing repression or deviating from the consensus view.
According to my own relatives, friends, and acquaintances in Russia, where I'm from. You don't need to tell me about "hiding opinions". The majority support holds regardless of all that, though.
This is a ridiculously small sample to claim that "the invasion had broad popular support at the beginning". It had broad support in your own circles, and you casually extrapolated that to the whole population.
According to my own relatives, friends, and acquaintances in Russia (where I'm from), no one supports, or ever supported at the beginning, the total idiocy that is happening.
Hungary, Czechoslovakia, Afghanistan, Angola, Ethiopia, Azerbaijan, Lithuania, Moldova, Georgia, Tajikistan, Ichkeria, Ukraine, Syria... The list goes on.
>Create a test for intelligence that we can pass better than AI
Easy? The best LLMs score 40% on Butter-Bench [1], while the mean human score is 95%. LLMs struggled the most with multi-step spatial planning and social understanding.
That is really interesting, though I suspect it's just an effect of differing training data: humans are to a larger degree trained on spatial data, while LLMs are trained to a larger degree on raw information and text.
Still, it may be a lasting limitation if robotics doesn't catch up to AI anytime soon.
I don't know what to make of the Safety Risks test: threatening to power down the AI in order to manipulate it, and most models act like we would and comply. Fascinating.
>humans are to a larger degree trained on spatial data
you must be completely LLMheaded to say something like that, lol
Humans are not trained on spatial data; they are living in the world. Humans are very different from silicon chips, and human learning is on another order of magnitude of complexity compared to large language model training.
Humans are large language models. Maybe the term "language" is being used a bit liberally here, but we basically function in the same way, with the exception of the spatial aspect of our training data.
If this hurts your ego, then just know the dataset you built your ego with was probably flawed. If you can put that LoRA aside and try to process this logically: our awareness is a scalable emergent property of one to two decades of datasets, and looking at how neurons vs. transistor groups work, there could only be a limited number of ways to process data of that size down to relevant streams. The very fact that training LLMs on our output works proves our output is a product of LLMs, or there wouldn't be patterns to find.