
I disagree that legislation can't help. Fundamentally there's an education disconnect and unnecessary friction in setting up parental controls. Governments can better educate parents about the risks and give them better tools to filter/monitor the content their children watch (e.g. at the device level). Being a parent is hard, and it's possible to make this part easier imo.

E.g. consider child-proof packaging and labeling laws for medication, which dramatically reduced child mortality from accidental ingestion.


Well, the law could be simple: “every computer sold must have a prominently displayed ‘parental choice’ screen on first boot that lets the owner specify whether this device will be used by a child and gives parents an option to block adult content”


The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

In practice, continual learning has not been an important component of improvement in deep learning history thus far. Instead, large diverse datasets and scale have proven to work the best. I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve. I think generally, anthropomorphisms lack predictive power.

I think the real crux may be the amount of acceleration you can achieve once you get very competent programming AIs spinning the RL flywheel. The author mentioned uncertainty about this, which is fair, and I share that uncertainty. But it leaves the rest of the piece feeling too overconfident.


Continuous learning might not have been important in the history of deep learning so far, but that might just be because the deep learning folks are measuring the wrong thing. If you want to build the most intelligent AI ever, based on whatever synthetic benchmark is hot this month, then you'd do exactly what the labs are doing. But if you want to build the most productive and helpful AI, intelligence isn't always the best goal. It's usually helpful, but an idiot who learns from his mistakes is often more valuable than an egotistical genius.


The LLM does learn from its mistakes - while it's training. Each epoch it learns from its mistakes.


>The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

>I think generally, anthropomorphisms lack predictive power.

I didn't expect someone to get this part so wrong the way you did. Continuous learning has almost nothing to do with humans and anthropomorphism. If anything, continuous learning is the bitter lesson cranked up to the next level. Rather than carefully curating datasets using human labor, the system learns on its own even when presented with an unfiltered garbage data stream.

>I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve.

The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear. You can build a static model of the system's properties, but the static model will fail, because the real system has changed and you now have a permanent modelling error. Correcting the modelling error requires changing the model, hence continual learning has become mandatory. I think it is pretty telling that you failed to take the existence of reality (a separate entity from the model) into account. The paradigm didn't stop working, it never worked in the first place.

It might be difficult to understand the bitter lesson, but let me rephrase it once more: Generalist compute scaling approaches will beat approaches based around human expert knowledge. Continual learning reduces the need for human expert knowledge in curating datasets, making it the next step in the generalist compute scaling paradigm.


> The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear.

To be clear, this isn’t what Dwarkesh was pointing at, and I think you are using the term “continual learning” differently to him. And he is primarily interested in it because humans do it.

The article introduces a story about how humans learn, and calls it continual learning:

> How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student … This just wouldn’t work … Yes, there’s RL fine tuning. But it’s just not a deliberate, adaptive process the way human learning is.

The point I’m making is just that this is bad form: “AIs can’t do X, but humans can. Humans do task X because they have Y, but AIs don’t have Y, so AIs will find X hard.” Suppose I replace X with “common sense reasoning” and Y with “embodied experience”. That argument would have seemed reasonable in 2020, but ultimately would have been a bad bet.

I don’t disagree with anything else in your response. I also buy into the bitter lesson (and generally: easier to measure => easier to optimize). I think it’s just different uses of the same terms. And I don’t necessarily think that what you’re referring to as continual learning won’t work.


Well, AlphaProof used test-time training methods to generate similar problems (AlphaZero-style) for each question it encountered.


I prefer a more nuanced take. If I can’t reliably delegate away a task, then it’s usually not worth delegating. The time to review the code needs to be less than the time it takes to write it myself. This is true for people and AI.

And there are now many tasks which I can confidently delegate away to AI, and that set of tasks is growing.

So I agree with the author for most of the programming tasks I can think of. But disagree for some.


Because of muP (maximal update parametrization) [0] and scaling laws, you can test ideas empirically on smaller models, with some confidence that the results will transfer to the larger model (rough sketch below).

[0] https://arxiv.org/abs/2203.03466
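
To make that concrete, here is a rough PyTorch sketch of the muP-for-Adam idea: tune a base learning rate on a narrow model, then reuse it at a larger width while shrinking the learning rate of width-dependent layers by base_width/width. Everything below (the model, the widths, the crude fan-in check) is made up for illustration; the actual recipe in the paper, and the authors' mup library, also rescales initializations and output multipliers.

  import torch
  import torch.nn as nn

  def make_mlp(width: int) -> nn.Sequential:
      return nn.Sequential(
          nn.Linear(256, width), nn.ReLU(),
          nn.Linear(width, width), nn.ReLU(),
          nn.Linear(width, 10),
      )

  def adam_for(model: nn.Module, width: int, base_width: int = 256, base_lr: float = 1e-3):
      # Crude muP-for-Adam heuristic: parameters whose fan-in grows with width
      # get their LR scaled by base_width / width, so a base_lr tuned on the
      # small model approximately transfers to the large one.
      scaled, fixed = [], []
      for p in model.parameters():
          (scaled if p.ndim == 2 and p.shape[1] == width else fixed).append(p)
      return torch.optim.Adam([
          {"params": scaled, "lr": base_lr * base_width / width},
          {"params": fixed, "lr": base_lr},
      ])

  small = make_mlp(width=256)    # sweep hyperparameters here, cheaply
  large = make_mlp(width=4096)   # then reuse the winning base_lr at scale
  opt = adam_for(large, width=4096)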


This reminds me of LLM pretraining: there are so many points at which the program can fail that you need clever solutions to keep uptime high. And it's not possible to just fix the bugs; GPUs will often just crash (in graphics, if a pixel flips to the wrong color for one frame it's fine, whereas in deep learning such errors can cause numerical instability, so ECC catches them). You also usually have a fixed-size cluster whose utilization you want to maximize.

So improving uptime involves holding out a set of spare GPUs to swap in while failed ones reboot. But the whole run can also just randomly deadlock, so you might handle that by watching the logs and restarting after a certain amount of inactivity. And you have to be clever about how you save/load checkpoints, since that can become a huge bottleneck.
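
A toy version of that log watchdog in Python, for flavor (the file names, flags, and thresholds are invented; real setups layer this with node health checks, hot spares, and checkpoint rotation):

  import os
  import subprocess
  import time

  LOG_PATH = "train.log"      # hypothetical log file the trainer appends to
  STALL_SECS = 30 * 60        # no log activity for 30 min => assume deadlock

  def launch():
      log = open(LOG_PATH, "a")
      # Hypothetical trainer that resumes from the newest checkpoint on start.
      return subprocess.Popen(["python", "train.py", "--resume"],
                              stdout=log, stderr=subprocess.STDOUT)

  proc = launch()
  while True:
      time.sleep(60)
      if proc.poll() is not None:
          proc = launch()                    # crashed: just relaunch
      elif time.time() - os.path.getmtime(LOG_PATH) > STALL_SECS:
          proc.kill()                        # likely deadlocked: kill and resume
          proc.wait()
          proc = launch()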

After many layers of self healing, we managed to take a vacation for a few days without any calls :)


You can buy a commercial license for OpenPose for $25K/year https://cmu.flintbox.com/technologies/b820c21d-8443-4aa2-a49...


Even if the out-of-place hold had been used, would you then conclude it was causal? I still wouldn't rule out coincidence. Many discoveries happen as a result of investigating spurious patterns.

Also the author rules out psychology, but I wouldn't, especially since there were multiple confirmed errors in the route preparation, which I expect could reduce one's trust in the fairness of the competition. In the moment, I might start to wonder, "If one hold was out of place, why not more? Is anyone even checking this?" even if untrue / unlikely.


I’ve never been a competitive speed climber, but I do understand that part of achieving that kind of precision is having cues for e.g. body position. So the fact that the hold is never actually touched is not necessarily the red herring it seems to be. Racecar drivers cue off of trackside landmarks to get their brake timing right, for instance.

Certainly, the rope feel is a much more significant factor, since the feel of the rope tugging on your harness is a non-visual part of your body-position feedback (maybe “I know that I’m going fast enough/pulling hard enough if I’m outracing the rope retraction rate”).


Another option that works for me: https://cordlessdog.com/stay/


The fallacy being made in this argument is that computers need to perform tasks the same way as humans to achieve equal or better performance on them. While having better "system 2" abilities may improve performance, it's plausible that scaled-up next-token prediction, along with a bit of scaffolding and finetuning, could match human performance on the same diversity of tasks while doing them in a completely different way.

If I had to critique Hinton's claims, I would say his usage of the word "understand" can be vague and smuggle in assumptions, because it comes from an ontology built for reasoning about human cognition, not this new, alien form of reasoning that language models embody.


I believe it was Feynman who said something to the effect of "airplanes do not fly like birds do, but they fly much faster and can carry much more". So yes, we do not need to exactly replicate how humans do things in order to do human-like things in a useful manner. Planes do not flap their wings, but the jet engine (which is completely unnatural) does a great job of making things fly when paired with fixed wings of a certain shape.


Tbf, planes have access to much more energy than birds, just as LLMs have access to much more energy than brains. Maybe that will be the next challenge.


> The fallacy being made in this argument is that computers need to perform tasks the same way as humans to achieve equal or better performance

Especially since I don't think we know that much about how human intelligence actually works.


In addition to that, the "system 2" abilities might already be there via "epi" strategies like chain-of-thought prompting. Talking/writing to yourself might not be the most efficient way to think, but at least I do it often enough when pondering a problem.
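
A toy illustration of what that looks like in practice (the prompt text is invented for the example):

  # Chain-of-thought prompting: show the model a worked example that spells
  # out intermediate steps, so it 'talks to itself' before answering.
  question = "A train leaves at 3:40pm and the trip takes 85 minutes. When does it arrive?"

  direct_prompt = f"Q: {question}\nA:"

  cot_prompt = (
      "Q: A meeting starts at 9:15am and runs 50 minutes. When does it end?\n"
      "A: Let's think step by step. 9:15am plus 45 minutes is 10:00am, "
      "plus 5 more minutes is 10:05am. The answer is 10:05am.\n\n"
      f"Q: {question}\nA: Let's think step by step."
  )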

