
I’ve only skimmed the paper - a long and dense read - but it’s already clear it’ll become a classic. What’s fascinating is that engineering is transforming into a science, trying to understand precisely how its own creations work.

This shift is more profound than many realize. Engineering traditionally applied our understanding of the physical world, mathematics, and logic to build predictable things. But now, especially in fields like AI, we’ve built systems so complex we no longer fully understand them. We must now use scientific methods - originally designed to understand nature - to comprehend our own engineered creations. Mindblowing.


This "practice-first, theory-later" pattern has been the norm rather than the exception. The steam engine predated thermodynamics. People bred plants and animals for thousands of years before Darwin or Mendel.

The few "top-down" examples where theory preceded application (like nuclear energy or certain modern pharmaceuticals) are relatively recent historical anomalies.


I see your point, but something still seems different. Yes we bred plants and animals, but we did not create them. Yes we did build steam engines before understanding thermodynamics but we still understood what they did (heat, pressure, movement, etc.)

Fun fact: we have no clue how most drugs work. Or, more precisely, we know a few aspects, but are only scratching the surface. We're even still discovering new things about Aspirin, one of the oldest drugs: https://www.nature.com/articles/s41586-025-08626-7


> Yes we did build steam engines before understanding thermodynamics but we still understood what it did (heat, pressure, movement, etc.)

We only understood in the broadest sense. It took a long process of iteration before we could create steam engines that were efficient enough to start an Industrial Revolution. At the beginning they were so inefficient that they could only pump water from the same coal mine they got their fuel from, and subject to frequent boiler explosions besides.


We laid transatlantic telegraph wires before we even had a hint of the physics involved. It created the entire field of transmission and signal theory.

Shannon had to invent new physics to explain why the cables didn't work as expected.


I think that's misleading.

There was a lot of physics already known: the importance of insulation and cross-section, for instance, and signal attenuation was also understood.

The future Lord Kelvin conducted experiments. The two scientific advisors had a conflict. And the "CEO" went with the cheaper option.

""" Thomson believed that Whitehouse's measurements were flawed and that underground and underwater cables were not fully comparable. Thomson believed that a larger cable was needed to mitigate the retardation problem. In mid-1857, on his own initiative, he examined samples of copper core of allegedly identical specification and found variations in resistance up to a factor of two. But cable manufacture was already underway, and Whitehouse supported use of a thinner cable, so Field went with the cheaper option. """


The telegraph is older than radio. Think about it.


That was 1854. You basically only needed Ohm's law for that, which was discovered in 1827.


Ohm's law for a cable 4000 km/3000 miles long? That implies transmission was instantaneous and without any alteration in shape.

I guess the rise time was tens of milliseconds and rebounds in signals lasted for milliseconds or more. Hardly something you can neglect.

For reference, in my time (the 1980s) in the telecom industry, we had to regenerate digital signals every 2 km.
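
To put rough numbers on it, here's a quick back-of-the-envelope sketch in Python. The per-km resistance and capacitance are illustrative assumptions, not the actual 1858 cable specification; the point is only that a distributed RC line has a retardation that grows with the square of its length (Kelvin's "law of squares"):

  # Rough "law of squares" estimate for a submarine telegraph cable modeled
  # as a distributed RC line. Per-km values are assumed for illustration,
  # not the real 1858 cable figures.
  r_per_km = 3.0       # ohms per km of copper core (assumed)
  c_per_km = 0.15e-6   # farads per km, core to sea water (assumed)

  for length_km in (100.0, 1000.0, 3000.0):
      R = r_per_km * length_km
      C = c_per_km * length_km
      rise_time_s = 0.5 * R * C   # retardation ~ R*C/2, quadratic in length
      print(f"{length_km:6.0f} km -> ~{rise_time_s * 1000:.0f} ms")

With these made-up but plausible numbers, a 100 km land line settles in a couple of milliseconds while the full-length Atlantic cable takes on the order of seconds - which lines up with the ~0.1 words per minute mentioned in the reply below.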


"Initially messages were sent by an operator using Morse code. The reception was very bad on the 1858 cable, and it took two minutes to transmit just one character (a single letter or a single number), a rate of about 0.1 words per minute."

https://en.m.wikipedia.org/wiki/Transatlantic_telegraph_cabl...

I guess your bandwidth in 1980 was a bit higher.


Almost all civil, chemical, electrical, etc., engineering emerged from a practice-first, theory-later evolution.


Most of what we refer to as "engineering" involves using principles that flow down from science to do stuff. The return to the historic norm is sort of a return to the "useful arts" or some other idea.


We don’t create LLMs either. We evolve/train them. I think the comparison is closer than you think.


We most definitely create them though; there is an entire A -> B flow you can follow.

It’s complicated but they are most definitely created.


Dawg


This isn't quite true, although it's commonly said.

For steam engines, the first commercial ones came after and were based on scientific advancements that made them possible. One built in 1679 was made by an associate of Boyle, who discovered Boyle's law. These early steam engines co-evolved with thermodynamics. The engines improved and hit a barrier, at which point Carnot did his famous work.

This is putting aside steam engines that are mostly curiosities like ones built in the ancient world.

See, for example

- https://en.wikipedia.org/wiki/Thermodynamics#History

- https://en.wikipedia.org/wiki/Steam_engine#History


It's been there in programming from essentially the first day too. People skip the theory and just get hacking.

Otherwise we'd all be writing Haskell now. Or rather we'd not be writing anything since a real compiler would still have been too hacky and not theoretically correct.

I'm writing this with both a deep admiration for and a practical repulsion from C.S. theory.


Cannons and archery and catapults predated Newtonian classical mechanics.


This is definitely a classic for story telling but it appears to be nothing more than hand wavy. Its a bit like there is the great and powerful man behind the curtain, lets trace the thought of this immaculate being you mere mortals. Anthropomorphing seems to be in an overdose mode with "thinking / thoughts", "mind" etc., scattered everywhere. Nothing with any of the LLMs outputs so far suggests that there is anything even close enough to a mind or a thought or anything really outside of vanity. Being wistful with good story telling does go a long way in the world of story telling but in actually understanding the science, I wouldn't hold my breath.


Thanks for the feedback! I'm one of the authors.

I just wanted to make sure you noticed that this is linking to an accessible blog post that's trying to communicate a research result to a non-technical audience?

The actual research result is covered in two papers which you can find here:

- Methods paper: https://transformer-circuits.pub/2025/attribution-graphs/met...

- Paper applying this method to case studies in Claude 3.5 Haiku: https://transformer-circuits.pub/2025/attribution-graphs/bio...

These papers are jointly 150 pages and are quite technically dense, so it's very understandable that most commenters here are focusing on the non-technical blog post. But I just wanted to make sure that you were aware of the papers, given your feedback.


The post to which you replied states:

  Anthropomorphing[sic] seems to be in an overdose mode with 
  "thinking / thoughts", "mind" etc., scattered everywhere. 
  Nothing with any of the LLMs outputs so far suggests that 
  there is anything even close enough to a mind or a thought 
  or anything really outside of vanity.
This is supported by reasonable interpretation of the cited article.

Considering the two following statements made in the reply:

  I'm one of the authors.
And

  These papers are jointly 150 pages and are quite 
  technically dense, so it's very understandable that most 
  commenters here are focusing on the non-technical blog post.
The onus of clarifying the article's assertions:

  Knowing how models like Claude *think* ...
And

  Claude sometimes thinks in a conceptual space that is 
  shared between languages, suggesting it has a kind of 
  universal “language of thought.”
As it pertains to anthropomorphizing an algorithm (a.k.a. stating it "thinks") is on the author(s).


Thinking and thought have no solid definition. We can't say Claude doesn't "think" because we don't even know what a human thinking actually is.

Given the lack of a solid definition of thinking and a test to measure it, I think using the terminology colloquially is totally fair play.


I view LLM's as valuable algorithms capable of generating relevant text based on queries given to them.

> Thinking and thought have no solid definition. We can't say Claude doesn't "think" because we don't even know what a human thinking actually is.

I did not assert:

  Claude doesn't "think" ...
What I did assert was that the onus is on the author(s) who write articles/posts such as the one cited to support their assertion that their systems qualify as "thinking" (for any reasonable definition of same).

Short of author(s) doing so, there is little difference between unsupported claims of "LLM's thinking" and 19th century snake oil[0] salesmen.

0 - https://en.wikipedia.org/wiki/Snake_oil


No one says that a thermostat is "thinking" of turning on the furnace, or that a nightlight is "thinking it is dark enough to turn the light on". You are just being obtuse.


Yes. A thermostat involves a change of state from A to B. A computer is the same: its state at t causes its state at t+1, which causes its state at t+2, and so on. Nothing else is going on. An LLM is no different: an LLM is simply a computer that is going through particular states.

Thought is not the same as a change of (brain) state. Thought is certainly associated with change of state, but can't be reduced to it. If thought could be reduced to change of state, then the validity/correctness/truth of a thought could be judged with reference to its associated brain state. Since this is impossible (you don't judge whether someone is right about a math problem or an empirical question by referring to the state of his neurology at a given point in time), it follows that an LLM can't think.


>Thought is certainly associated with change of state, but can't be reduced to it.

You can effectively reduce continuously dynamic systems to discrete steps. Sure, you can always say that the "magic" exists between the arbitrarily small steps, but from a practical POV there is no difference.

A transistor has a binary on or off. A neuron might have ~infinite~ levels of activation.

But in reality the ~infinite~ activation level can be perfectly modeled (for all intents and purposes), and computers have been doing this for decades now (maybe not with neurons, but equivalent systems). It might seem like an obvious answer, that there is special magic in analog systems that binary machines cannot access, but that is wholly untrue. Science and engineering have been extremely successful interfacing with the analog reality we live in, precisely because the digital/analog barrier isn't too big of a deal. Digital systems can do math, and math is capable of modeling analog systems, no problem.
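
As a toy illustration of that point (all numbers arbitrary, nothing neuron-specific), here is a "continuous" leaky integrator stepped forward with plain discrete Euler updates; the digital version walks right up to the analog steady state:

  # A continuous-looking leaky integrator, dv/dt = (input - v) / tau,
  # approximated with discrete Euler steps. Parameters are arbitrary.
  dt, tau = 0.001, 0.02   # step size and time constant, in seconds
  v, inp = 0.0, 1.0       # state and constant input
  for _ in range(100):
      v += dt * (inp - v) / tau
  print(round(v, 3))      # ~0.994, close to the analog steady state of 1.0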


It's not a question of discrete vs continuous, or digital vs analog. Everything I've said could also apply if a transistor could have infinite states.

Rather, the point is that the state of our brain is not the same as the content of our thoughts. They are associated with one another, but they're not the same. And the correctness of a thought can be judged only by reference to its content, not to its associated state. 2+2=4 is correct, and 2+2=5 is wrong; but we know this through looking at the content of these thoughts, not through looking at the neurological state.

But the state of the transistors (and other components) is all a computer has. There are no thoughts, no content, associated with these states.


It seems that the only barrier between brain state and thought contents is a proper measurement tool and decoder, no?

We can already do this at an extremely basic level, mapping brain states to thoughts. The paraplegic patient using their thoughts to move the mouse cursor or the neuroscientist mapping stress to brain patterns.

If I am understanding your position correctly, it seems that the differentiation between thoughts and brain states is a practical problem not a fundamental one. Ironically, LLMs have a very similar problem with it being very difficult to correlate model states with model outputs. [1]

[1] https://www.anthropic.com/research/mapping-mind-language-mod...


There is undoubtedly correlation between neurological state and thought content. But they are not the same thing. Even if, theoretically, one could map them perfectly (which I doubt is possible but it doesn't affect my point), they would remain entirely different things.

The thought that "2+2=4", or the thought "tiger", are not the same thing as the brain states that makes them up. A tiger, or the thought of a tiger, is different from the neurological state of a brain that is thinking about a tiger. And as stated before, we can't say that "2+2=4" is correct by referring to the brain state associated with it. We need to refer to the thought itself to do this. It is not a practical problem of mapping; it is that brain states and thoughts are two entirely different things, however much they may correlate, and whatever causal links may exist between them.

This is not the case for LLMs. Whatever problems we may have in recording the state of the CPUs/GPUs are entirely practical. There is no 'thought' in an LLM, just a state (or plurality of states). An LLM can't think about a tiger. It can only switch on LEDs on a screen in such a way that we associate the image/word with a tiger.


> The thought that "2+2=4", or the thought "tiger", are not the same thing as the brain states that makes them up.

Asserted without evidence. Yes, this does represent a long and occasionally distinguished line of thinking in cognitive science/philosophy of mind, but it is certainly not the only one, and some of the others categorically refute this.


Is it your contention that a tiger may be the same thing as a brain state?

It would seem to me that any coherent philosophy of mind must accept their being different as a datum; or conversely, any that implied their not being different would have to be false.

EDIT: my position has been held -- even taken as axiomatic -- by the vast majority of philosophers, from the pre-Socratics onwards, and into the 20th century. So it's not some idiosyncratic minority position.


Clearly there is a thing in the world that is a tiger independently of any brain state anywhere.

But the thought of a tiger may in fact be identical to a brain state (or it might not; at this point we do not know).


Given that a tiger is different from a brain state:

If I am thinking about a tiger, then what I am thinking about is not my brain state. So that which I am thinking about is different from (as in, cannot be identified with) my brain state.


> What I am thinking about is not my brain state

Obviously the thing you are thinking about is not the same as your thinking about it, nor the same as your brain state when thinking about it. Thinking about a thing is necessarily and definitionally distinct from the thing.

The question however is whether there is anything to "thinking about thing" other than the brain state you have when doing so. This is unknown at this time.


Earlier upthread, I said

>> the thought "tiger" [is] not the same thing as the brain state that makes [it] up.

To which you said

> Asserted without evidence.

This was in the context of my saying

>> There is undoubtedly correlation between neurological state and thought content. But they are not the same thing.

Now you say

> the thing you are thinking about is not the same as your thinking about it, nor the same as your brain state when thinking about it.

Are we at least agreed that the content of the thought "tiger" is not the same thing as the brain state that makes it up?

> The question however is whether there is anything to "thinking about thing" other than the brain state you have when doing so. This is unknown at this time.

If a tiger is distinct from a brain state, which I think we agree on, and if our thoughts are about real things such as tigers, which I assume we agree on, then how can there not be more to thought than the associated brain state?


> Are we at least agreed that the content of the thought "tiger" is not the same thing as the brain state that makes it up?

No. I don't agree that "the content of [a] thought" is something we can usefully talk about in this context.

Thoughts are subjective experiences, more or less identical to qualia. Thinking about a tiger is actually having the experience of thinking about a tiger, and this is purely subjective, like all qualia. The only question I can see worth asking about it is whether the experience of thinking about a tiger has some component to it that is not part of a fully described brain state.

> If a tiger is distinct from a brain state, which I think we agree on, and if our thoughts are about real things such as tigers,

We also have thoughts about unreal things. I don't see why such thoughts should be any different than the ones we have about real things.


>> If a tiger is distinct from a brain state, which I think we agree on, and if our thoughts are about real things such as tigers, which I assume we agree on, then how can there not be more to thought than the associated brain state?

> We also have thoughts about unreal things. I don't see why such thoughts should be any different than the ones we have about real things.

Let me rephrase then:

If a tiger is distinct from a brain state, which I think we agree on, and if our thoughts can be about real things such as tigers, which I assume we agree on, then how can there not be more to thought than the associated brain state?

A brain state does not refer to a tiger.


I realize I'm butting in on an old debate, but thinking about this caused me to come to conclusions which were interesting enough that I had to write them down somewhere.

I'd argue that rather than thoughts containing extra contents which don't exist in brain states, it's more the case that brain states contain extra content which doesn't exist in thoughts. Specifically, I think that "thoughts" are a lossy abstraction that we use to reason about brain states and their resulting behaviors, since we can't directly observe brain states and reasoning about them would be very computationally intensive.

As far as I've seen, you have argued that thoughts "refer" to real things, and that thoughts can be "correct" or "incorrect" in some objective sense. I'll argue against the existence of a singular coherent concept of "referring", and also that thoughts can be useful without needing to be "correct" in some sense which brain states cannot participate in. I'll be assuming that something only exists if we can (at least in theory if not in practice) tie it back to observable behavior.

First, I'll argue that the "refers" relation is a pretty incoherent concept which sometimes happens to work. Let us think of a particular person who has a thought/brain state about a particular tiger in mind/brain. If the person has accurate enough information about the tiger, then they will recognize the tiger on sight, and may behave differently around that tiger than other tigers. I would say in this case that the person's thoughts refer to the tiger. This is the happy case where the "refers" relation is a useful aid to predicting other people's behavior.

Now let us say that the person believes that the tiger ate their mother, and that the tiger has distinctive red stripes. However, let it be the case that the person's mother was eaten by a tiger, but that tiger did not have red stripes. Separately, there does exist a singular tiger in the world which does have red stripes. Which tiger does the thought "a tiger with red stripes ate my mother" refer to?

I think it's obvious that this thought doesn't coherently refer to any tiger. However, that doesn't prevent the thought from affecting the person's behavior. Perhaps the person's next thought is to "take revenge on the tiger that killed my mother". The person then hunts down and kills the tiger with the red stripes. We might be tempted to believe that this thought refers to the mother killing tiger, but the person has acted as though it referred to the red striped tiger. However, it would be difficult to say that the thought refers to the red striped tiger either, since the person might not kill the red striped tiger if they happen to learn said tiger has an alibi. Hopefully this is sufficient to show that the "refers" relationship isn't particularly connected to observable behavior in many cases where it seems like it should be. The connection would exist if everyone had accurate and complete information about everything, but that is certainly not the world we live in.

I can't prove that the world is fully mechanical, but if we assume that it is, then all of the above behavior could in theory be predicted by just knowing the state of the world (including brain states but not thoughts) and stepping a simulation forward. Thus the concept of a brain state is more helpful to predicting their behavior than thoughts with a singular concept of "refers". We might be able to split the concept of "referring" up into other concepts for greater predictive accuracy, but I don't see how this accuracy could ever be greater than just knowing the brain state. Thus if we could directly observe brain states and had unlimited computational power, we probably wouldn't bother with the concept of a "thought".

Now then, on to the subject of correctness. I'd argue that thoughts can be useful without needing a central concept of correctness. The mechanism is the very category theory like concept of considering all things only in terms of how they relate to other things, and then finding other (possibly abstract) objects which have the same set of relationships.

For concreteness, let us say that we have piles of apples and are trying to figure out how many people we can feed. Let us say that today we have two piles each consisting of two apples. Yesterday we had a pile of four apples and could feed two people. The field of appleology is quite new, so we might want to find some abstract objects in the field of math which have the same relationship. Cutting edge appleology research shows that as far as hungry people are concerned, apple piles can be represented with natural numbers, and taking two apple piles and combining them results in a pile equivalent to adding the natural numbers associated with the piles being combined. We are short on time, so rather than combining the piles, we just think about the associated natural numbers (2 and 2), and add them (4) to figure out that we can feed two people today. Thus the equation (2+2=4) was useful because pile 1 combined with pile 2 is related to yesterday's pile in the same way that 2 + 2 relates to 4.
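
Making that mapping literal (toy code, nothing deep about it):

  # Appleology made literal: piles map to natural numbers, and combining
  # piles maps to addition, so we can reason about numbers instead of piles.
  def pile_to_number(pile):
      return len(pile)

  pile_a = ["apple", "apple"]
  pile_b = ["apple", "apple"]
  combined = pile_a + pile_b
  assert pile_to_number(combined) == pile_to_number(pile_a) + pile_to_number(pile_b)
  print(pile_to_number(combined))   # 4, same as yesterday's pile that fed two people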

Math is "correct" only in so far as it is consistent. That is, if you can arrive at a result using two different methods, you should find that the result is the same regardless of the method chosen. Similarly, reality is always consistent, because assuming that your behavior hasn't affected the situation, (and what is considered the situation doesn't include your brain state) it doesn't matter how or even if you reason about the situation, the situation just is what it is. So the reason math is useful is because you can find abstract objects (like numbers) which relate to each other in the same way as parts of reality (like piles of apples). By choosing a conventional math, we save ourselves the trouble of having to reason about some set of relationships all over again every time that set of relationships occurs. Instead we simply map the objects to objects in the conventional math which are related in the same manner. However, there is no singular "correct" math, as can be shown by the fact that mathematics can be defined in terms of set theory + first order logic, type theory, or category theory. Even an inconsistent math such as set theory before Russell's Paradox can still often produce useful results as long one's line of reasoning doesn't happen to trip on the inconsistency. However, tripping on an inconsistency will produce a set of relationships which cannot exist in the real world, which gives us a reason to think of consistent maths as being "correct". Consistent maths certainly are more useful.

Brain states can also participate in this model of correctness though. Brain states are related to each other, and if these relationships are the same as the relationships between external objects, then the relationships can be used to predict events occurring in the world. One can think of math and logic as mechanisms to form brain states with the consistent relationships needed to accurately model the world. As with math though, even inconsistent relationships can be fine as long as those inconsistencies aren't involved in reasoning about a thing, or predicting a thing isn't the point (take scapegoating for instance).

Sorry for the ramble. I'll summarize:

TL;DR: Thoughts don't contain "refers" and "correctness" relationships in any sense that brain states can't. The concept of "refers" is only usable to predict behavior if people have accurate and complete information about the things they are thinking about. However, brain states predict behavior regardless of how accurate or complete the information the person has is. The concept of "correctness" in math/logic really just means that the relationship between mathematical objects is consistent. We want this because the relationships between parts of reality seem to be consistent, and so if we desire the ability to predict things using abstract objects, the relationships between abstract objects must be consistent as well. However, brain states can also have consistent patterns of relationships, and so can be correct in the same sense.


Thanks for the response. I don't know if I'll have time to respond, I may, but in any case it's always good to write one's thoughts down.


Does a picture of a tiger or a tiger (to follow your sleight of hand) on a hard drive then count as a thought?


No. One is paint on canvas, and the other is part of a causal chain that makes LEDs light up in a certain way. Neither the painting nor the computer have thoughts about a tiger in the way we do. It is the human mind that makes the link between picture and real tiger (whether on canvas or on a screen).


>Rather, the point is that the state of our brain is not the same as the content of our thoughts.

Based on what exactly ? This is just an assertion. One that doesn't seem to have much in the way of evidence. 'It's not the same trust me bro' is the thesis of your argument. Not very compelling.


It's not difficult. When you think about a tiger, you are not thinking about the brain state associated with said thought. A tiger is different from a brain state.

We can safely generalize, and say the content of a thought is different from its associated brain state.

Also, as I said

>> The correctness of a thought can be judged only by reference to its content, not to its associated state. 2+2=4 is correct, and 2+2=5 is wrong; but we know this through looking at the content of these thoughts, not through looking at the neurological state.

This implies that state != content.


>It's not difficult. When you think about a tiger, you are not thinking about the brain state associated with said thought. A tiger is different from a brain state. We can safely generalize, and say the content of a thought is different from its associated brain state.

Just because you are not thinking about a brain state when you think about a tiger does not mean that your thought is not a brain state.

Just because the experience of thinking about X doesn't feel like the experience of thinking about Y (or doesn't feel like the physical process Z), it doesn't logically follow that the mental event of thinking about X isn't identical to or constituted by the physical process Z. For example, seeing the color red doesn't feel like processing photons of a specific wavelength with cone cells and neural pathways, but that doesn't mean the latter isn't the physical basis of the former.

>> The correctness of a thought can be judged only by reference to its content, not to its associated state. 2+2=4 is correct, and 2+2=5 is wrong; but we know this through looking at the content of these thoughts, not through looking at the neurological state. This implies that state != content.

Just because our current method of verification focuses on content doesn't logically prove that the content isn't ultimately realized by or identical to a physical state. It only proves that analyzing the state is not our current practical method for judging mathematical correctness.

We judge if a computer program produced the correct output by looking at the output on the screen (content), not usually by analyzing the exact pattern of voltages in the transistors (state). This doesn't mean the output isn't ultimately produced by, and dependent upon, those physical states. Our method of verification doesn't negate the underlying physical reality.

When you evaluate "2+2=4", your brain is undergoing a sequence of states that correspond to accessing the representations of "2", "+", "=", applying the learned rule (also represented physically), and arriving at the representation of "4". The process of evaluation operates on the represented content, but the entire process, including the representation of content and rules, is a physical neural process (a sequence of brain states).


> Just because you are not thinking about a brain state when you think about a tiger does not mean that your thought is not a brain state.

> It doesn't logically follow that the mental event of thinking about X isn't identical to or constituted by the physical process Z.

That's logically sound insofar as it goes. But firstly, the existence of a brain state for a given thought is, obviously, not proof that a thought is a brain state. Secondly, if you say that a thought about a tiger is a brain state, and nothing more than a brain state, then you have the problem of explaining how it is that your thought is about a tiger at all. It is the content of a thought that makes it be about reality; it is the content of a thought about a tiger that makes it be about a tiger. If you declare that a thought is its state, then it can't be about a tiger.

You can't equate content with state, and nor can you make content be reducible to state, without absurdity. The first implies that a tiger is the same as a brain state; the second implies that you're not really thinking about a tiger at all.

Similarly for arithmetic. It is only the content of a thought about arithmetic that makes it be right or wrong. It is our ideas of "2", "+", and so on, that make the sum right or wrong. The brain states have nothing to do with it. If you want to declare that content is state, and nothing more than state, then you have no way of saying the one sum is right, and the other is wrong.


Please, take the pencil and draw the line between thinking and non-thinking systems. Hell I'll even take a line drawn between thinking and non-thinking organisms if you have some kind of bias towards sodium channel logic over silicon trace logic. Good luck.


Even if you can't define the exact point that A becomes not-A, it doesn't follow that there is no distinction between the two. Nor does it follow that we can't know the difference. That's a pretty classic fallacy.

For example, you can't name the exact time that day becomes night, but it doesn't follow that there is no distinction.

A bunch of transistors being switched on and off, no matter how many there are, is no more an example of thinking than a single thermostat being switched on and off. OTOH, if we can't think, then this conversation and everything you're saying and "thinking" is meaningless.

So even without a complete definition of thought, we can see that there is a distinction.


> For example, you can't name the exact time that day becomes night, but it doesn't follow that there is no distinction.

There is actually a very detailed set of definitions of the multiple stages of twilight, including the last one which defines the onset of what everyone would agree is "night".

The fact that a phenomenon shows a continuum by some metric does not mean that it is not possible to identify and label points along that continuum and attach meaning to them.


Looks like we replied to each others comments at the same time, haha


Your assertion that sodium channel logic and silicon trace logic are 100% identical is the primary problem. It's like claiming that a hydraulic cylinder and a bicep are 100% equivalent because they both lift things - they are not the same in any way.


People chronically get stuck in this pit. Math is substrate independent. If the process is physical (i.e. doesn't draw on magic) then it can be expressed with mathematics. If it can be expressed with mathematics, anything that does math can compute it.

The math is putting the crate up on the rack. The crate doesn't act any different based on how it got up there.


Or submarines swim ;)


think about it more


Honestly, arguing seems futile when it comes to opinions like GP. Those opinions resemble religious zealotry to me in that they take for granted that only humans can think. Any determinism of any kind in a non-human is seized upon as proof its mere clockwork, yet they can’t explain how humans think in order to contrast it.


> Honestly, arguing seems futile when it comes to opinions like GP. Those opinions resemble religious zealotry to me in that they take for granted that only humans can think. Any determinism of any kind in a non-human is seized upon as proof its mere clockwork, yet they can’t explain how humans think in order to contrast it.

Putting aside the ad hominems, projections, and judgements, here is a question for you:

If I made a program where a NPC[0] used the A-star[1] algorithm to navigate a game map, including avoiding obstacles and using the shortest available path to reach its goal, along with identifying secondary goal(s) should there be no route to the primary goal, does that qualify to you as the NPC "thinking"?
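
For concreteness, the kind of procedure I have in mind is nothing more exotic than this sketch (hypothetical 4-connected grid, Manhattan heuristic, all names illustrative):

  # Bare-bones A* on a toy grid: 0 = walkable, 1 = obstacle.
  import heapq

  def a_star(grid, start, goal):
      def h(p):   # Manhattan-distance heuristic to the goal
          return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
      frontier = [(h(start), 0, start, [start])]   # (f, g, node, path so far)
      seen = set()
      while frontier:
          f, g, node, path = heapq.heappop(frontier)
          if node == goal:
              return path                          # shortest available path
          if node in seen:
              continue
          seen.add(node)
          r, c = node
          for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
              if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                  heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]))
      return None                                  # no route to the primary goal

  game_map = [[0, 0, 0],
              [1, 1, 0],
              [0, 0, 0]]
  print(a_star(game_map, (0, 0), (2, 0)))

The question above stands either way; the sketch is only to make plain how mechanical the procedure is.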

0 - https://en.wikipedia.org/wiki/Non-player_character

1 - https://en.wikipedia.org/wiki/A*_search_algorithm


Answer: I suppose no? But my point is only this:

1. People with the "AI isn't thinking" opinions move the goalposts, the borderline between "just following a deterministic algorithm" and "thinking" wherever needed in order to be right.

2. I argue that the brain itself must either be deterministic (just wildly complex) or, for lack of a better word, supernatural. If it's not deterministic, only God knows how our thinking process works. Every single person postulating about whether AI is "thinking" cannot fully explain why a human chooses a particular action, just as AI researchers can't explain why Claude does a certain thing in all scenarios. Therefore they are much more similar than they are different.

3. But really, the important thing is, unless you're approaching this from a religious POV (which is arguably much more interesting) the obsessive sorting of highly complex and not-even-remotely-fully-understood processes into "thinking" and "NOT thinking" groups is pointless and silly.


> 1. People with the "AI isn't thinking" opinions move the goalposts, the borderline between "just following a deterministic algorithm" and "thinking" wherever needed in order to be right.

I did not present an opinion regarding whether "AI thinks" or not, but instead said:

  The onus of clarifying the article's assertions ...

  As it pertains to anthropomorphizing an algorithm (a.k.a. 
  stating it "thinks") is on the author(s).
As to the concept of thinking, regardless of the entity considered, I proffer that the topic is a philosophical one, having no "right or wrong" answer so much as offering an opportunity to deepen the enlightenment of those who contemplate the question.


Really appreciate your team's enormous efforts in this direction, not only the cutting edge research (which I don't see OAI/DeepMind publishing any paper on) but also making the content more digestible for a non-research audience. Please keep up the great work!


I, uh, think, that "think" is a fine metaphor but "planning ahead" is a pretty confusing one. It doesn't have the capability to plan ahead because there is nowhere to put a plan and no memory after the token output, assuming the usual model architecture.

That's like saying a computer program has planned ahead if it's at the start of a function and there's more of the function left to execute.


I think that's a very unfair take. As a summary for non-experts I found it did a great job of explaining how by analyzing activated features in the model, you can get an idea of what it's doing to produce the answer. And also how by intervening to change these activations manually you can test hypotheses about causality.
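
For readers who haven't seen this kind of experiment before, here is a minimal, generic sketch of what "intervening to change these activations manually" looks like mechanically. It's plain PyTorch on a toy model with a made-up feature index - not the attribution-graph method from the papers, just the general shape of an ablation/patching experiment:

  # Generic activation-intervention sketch (not the papers' method): run a
  # toy model, re-run it while a forward hook overwrites one hidden
  # activation, and compare the outputs to test a causal hypothesis.
  import torch
  import torch.nn as nn

  torch.manual_seed(0)
  model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
  x = torch.randn(1, 16)

  def ablate_feature(module, inputs, output):
      patched = output.clone()
      patched[:, 7] = 0.0          # hypothetical "feature" index to zero out
      return patched               # returning a tensor replaces the output

  baseline = model(x)
  handle = model[0].register_forward_hook(ablate_feature)
  intervened = model(x)
  handle.remove()

  print("max change in output:", (intervened - baseline).abs().max().item())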

It sounds like you don't like anthropomorphism. I can relate, but I don't get where "Its a bit like there is the great and powerful man behind the curtain, lets trace the thought of this immaculate being you mere mortals" is coming from. In most cases the anthropomorphisms are just the standard way to convey the idea briefly. Even then I liked how they sometimes used scare quotes, as in it began "thinking" of potential on-topic words. There are some more debatable anthropomorphisms such as "in its head" where they use scare quotes systematically.

Also given that they took inspiration from neuroscience to develop a technique that appears successful in analyzing their model, I think they deserve some leeway on the anthropomorphism front. Or at least on the "biological metaphors" front which is maybe not really the same thing.

I used to think biological metaphors for LLMs were misleading, but I'm actually revising this opinion now. I mean I still think the past metaphors I've seen were misleading, but here, seeing the activation pathways they were able to identify, including the inhibitory circuits, and knowing a bit about similar structures in the brain I find the metaphor appropriate.


Yup... well, if the research is conducted (or sponsored) by the company that develops and sells the LLM, of course there will be a temptation to present their product in a better light and make it sound like more than it actually is. I mean, the anthropomorphization starts already with the company name and giving the company's LLM a human name...


Engineering started out as just some dudes who built things from gut feeling. After a whole lot of people died from poorly built things, they decided to figure out how to know ahead of time if it would kill people or not. They had to use math and science to figure that part out.

Funny enough, that happened with software too. People just build shit without any method to prove that it will not fall down / crash. They throw some code together, poke at it until it does something they wanted, and call that "stable". There is no science involved. There are some mathy bits called "computer science" / "software algorithms", but most software is not a math problem.

Software engineering should really be called "Software Craftsmanship". We haven't achieved real engineering with software yet.


You have a point, but it is also true that some software is far more rigorously tested than other software. There are categories where it absolutely is both scientific and real engineering.

I fully agree that the vast majority is not, though.


This is such an unbelievably dismissive assertion, I don't even know where to start.

To suggest, nay, explicitly state:

  Engineering started out as just some dudes who built things 
  from gut feeling.

  After a whole lot of people died from poorly built things, 
  they decided to figure out how to know ahead of time if it 
  would kill people or not.
Is to demean those who made modern life possible. Say what you want about software developers and I would likely agree with much of the criticism.

Not so the premise set forth above regarding engineering professions in general.


Surely you already know the history of professional engineers, then? How it's only a little over 118 years old? Mostly originating from the fact that it was charlatans claiming to be engineers, building things that ended up killing people, that inspired the need for a professional license?

"The people who made modern life possible" were not professional engineers, often barely amateurs. Artistocrat polymaths who delved into cutting edge philosophy. Blacksmith craftsmen developing new engines by trial and error. A new englander who failed to study law at Yale, landed in the American South, and developed a modification of an Indian device for separating seed from cotton plants.

In the literal historical sense, "engineering" was just the building of cannons in the 14th century. From thousands of years before that up until now, there has always been a combination of the practice of building things with some kind of "science" (which itself didn't exist until a few hundred years ago) to try to estimate the result of an expensive, dangerous project.

But these are not the people who made modern life people. Lots, and lots, and lots of people made modern life possible. Not just builders and mathematicians. Receptionists. Interns. Factory workers. Farmers. Bankers. Sailors. Welders. Soldiers. So many professions, and people, whose backs and spirits were bent or broken, to give us the world we have today. Engineers don't deserve any more credit than anyone else - especially considering how much was built before their professions were even established. Science is a process, and math is a tool, that is very useful, and even critical. But without the rest it's just numbers on paper.


> Surely you already know the history of professional engineers, then? How it's only a little over 118 years old? Mostly originating from the fact that it was charlatans claiming to be engineers, building things that ended up killing people, that inspired the need for a professional license?

I did not qualify with "professional" as you have, which is disingenuous. If the historical record of what can be considered "engineering" is of import, consider:

  The first recorded engineer
  
  Hey, why not ask? Surely it’s related to understanding the 
  origin of the word engineering? Right? Whatever we’ve asked 
  the question now. According to Encyclopedia Britannica, the 
  first recorded “engineer” was Imhotep. He happened to be 
  the builder of the Step Pyramid at Ṣaqqārah, Egypt.
  
  This is thought to have been erected around 2550 BC. Of 
  course, that is recorded history but we know from 
  archeological evidence that humans have been 
  making/building stuff, fires, buildings and all sorts of 
  things for a very long time.
  
  The importance of Imhotep is that he is the first 
  “recorded” engineer if you like.[0]
> But these are not the people who made modern life people[sic]. Lots, and lots, and lots of people made modern life possible.

Of course this is the case. No one skill category can claim credit for all societal advancement.

But all of this is a distraction from what you originally wrote:

  Engineering started out as just some dudes who built things 
  from gut feeling.

  After a whole lot of people died from poorly built things, 
  they decided to figure out how to know ahead of time if it 
  would kill people or not.
These are your words, not mine. And to which I replied:

  This is such an unbelievably dismissive assertion ...
What I wrote has nothing to do with "Engineers don't deserve any more credit than anyone else ..."

It has everything to do with categorizing efforts to solve difficult problems as unserious haphazard undertakings which ultimately led to; "they decided to figure out how to know ahead of time if it would kill people or not" (again, your words not mine).

0 - https://interestingengineering.com/culture/the-origin-of-the...


Software Engineering is only about 60 years old - i.e. the term has existed. At the point in the history of civil engineering, they didn't even know what a right angle was. Civil engineers were able to provide much utility before the underlying theory was available. I do wonder about the safety of structures at the time.


> Software Engineering is only about 60 years old - i.e. the term has existed.

Perhaps as a documented term, but the practice is closer to roughly 75+ years. Still, IMHO there is a difference between those who are Software Engineers and those who claim to be so.

> At the point in the history of civil engineering, they didn't even know what a right angle was.

I strongly disagree with this premise, as right angles were well defined since at least ancient Greece (see Pythagorean theorem[0]).

> Civil engineers were able to provide much utility before the underlying theory was available.

Eschewing the formal title of Civil Engineer and considering those who performed the role before the title existed, I agree. I do humbly suggest that by the point in history where Civil Engineering was officially recognized, a significant amount of the necessary mathematical and materials science was available.

0 - https://en.wikipedia.org/wiki/Pythagorean_theorem


Total aside here:

What about modern life is so great that we should laud its authors?

Medical advances and generally a longer life is what comes to mind. But much of life is empty of meaning and devoid of purpose; this seems rife within the Western world. Living a longer life in hell isn’t something I would have chosen.


> But much of life is empty of meaning and devoid of purpose

Maybe life is empty to you. You can't speak for other people.

You also have no idea if pre-modern life was full of meaning and purpose. I'm sure someone from that time was bemoaning the same.

The people before modern time were much less well off. They had to work a lot harder to put food on the table. I imagine they didn't have a lot of time to wonder about the meaning of life.


We've already built things in computing that we don't easily understand, even outside of AI, like large distributed systems and all sorts of balls of mud.

Within the sphere of AI, we have built machines which can play strategy games like chess, and surprise us with an unforeseen defeat. It's not necessarily easy to see how that emerged from the individual rules.

Even a compiler can surprise you. You code up some optimizations, which are logically separate, but then a combination of them does something startling.

Basically, in mathematics, you cannot grasp all the details of a vast space just from knowing the axioms which generate it and a few things which follow from them. Elementary school children know what is a prime number, yet those things occupy mathematicians who find new surprises in that space.


Right, but this is somewhat different, in that we apply a simple learning method to a big dataset, and the resulting big matrix of numbers suddenly can answer questions and write anything - prose, poetry, code - better than most humans - and we don't know how it does it. What we do know[0] is, there's a structure there - structure reflecting a kind of understanding of languages and the world. I don't think we've ever created anything this complex before, completely on our own.

Of course, learning method being conceptually simple, all that structure must come from the data. Which is also profound, because that structure is a first fully general world/conceptual model that we can actually inspect and study up close - the other one being animal and human brains, which are much harder to figure out.

> Basically, in mathematics, you cannot grasp all the details of a vast space just from knowing the axioms which generate it and a few things which follow from them. Elementary school children know what is a prime number, yet those things occupy mathematicians who find new surprises in that space.

Prime numbers and fractals and other mathematical objects have plenty of fascinating mysteries and complex structures forming though them, but so far none of those can casually pass Turing test and do half of my job for me, and millions other people.

--

[0] - Even as many people still deny this, and talk about LLMs as mere "stochastic parrots" and "next token predictors" that couldn't possibly learn anything at all.


> and we don't know how it does it

We know quite well how it does it. It's applying extrapolation to its lossily compressed representation. It's not magic, and especially the HN crowd of technically proficient folks should stop treating it as such.


That is not a useful explanation. "Applying extrapolation to its lossily compressed representation" is pretty much the definition of understanding something. The details and interpretation of the representation are what is interesting and unknown.


We can use data based on analyzing the frequency of ngrams in a text to generate sentences, and some of them will be pretty good, and fool a few people into believing that there is some solid language processing going on.

LLM AI is different in that it does produce helpful results, not only entertaining prose.

It is practical for users today to replace most uses of web search with a query to an LLM.

The way the token prediction operates, it uncovers facts, and renders them into grammatically correct language.

Which is amazing given that, when the thing is generating a response that will be, say, 500 tokens long, when it has produced 200 of them, it has no idea what the remaining 300 will be. Yet it has committed to the 200; and often the whole thing will make sense when the remaining 300 arrive.
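
For anyone unfamiliar with why the commitment is so stark, here is a minimal sketch of a greedy decoding loop (the model interface here is a placeholder, not a real library API): each chosen token is appended and never revised.

  # Minimal greedy autoregressive decoding sketch. "model" is a placeholder
  # for anything that maps a token sequence to next-token scores.
  def generate(model, prompt_tokens, n_new_tokens):
      tokens = list(prompt_tokens)
      for _ in range(n_new_tokens):
          logits = model(tokens)                 # scores over the vocabulary
          next_token = max(range(len(logits)), key=lambda i: logits[i])
          tokens.append(next_token)              # committed; never revised later
      return tokens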


The research posted demonstrates the opposite of that within the scope of sequence lengths they studied. The model has future tokens strongly represented well in advance.


I'm reminded of the metaphor that these models aren't constructed, they're "grown". It rings true in many ways - and in this context they're like organisms that must be studied using traditional scientific techniques that are more akin to biology than engineering.


Sort of.

We don’t precisely know the most fundamental workings of a living cell.

Our understanding of the fundamental physics of the universe has some holes.

But for LLMs and statistical models in general, we do know precisely what the fundamental pieces do. We know what processor instructions are being executed.

We could, given enough research, have absolutely perfect understanding of what is happening in a given model and why.

Idk if we’ll be able to do that in the physical sciences.


Having spent some time working with both molecular biologists and LLM folks, I think it's pretty good analogy.

We know enough quantum mechanics to simulate the fundamental workings of a cell pretty well, but that's not a route to understanding. To explain anything, we need to move up an abstraction hierarchy to peptides, enzymes, receptors, etc. But note that we invented those categories in the first place -- nature doesn't divide up functionality into neat hierarchies like human designers do. So all these abstractions are leaky and incomplete. Molecular biologists are constantly discovering mechanisms that require breaking the current abstractions to explain.

Similarly, we understand floating point multiplication perfectly, but when we let 100 billion parameters set themselves through an opaque training process, we don't have good abstractions to use to understand what's going on in that set of weights. We don't have even the rough equivalent of the peptides or enzymes level yet. So this paper is progress toward that goal.


I don’t think this is as profound as you make it out to be. Most complex systems are incomprehensible to the majority of the population anyway, so from a practical standpoint AI is no different. There’s also no single theory for how the financial markets work, and yet market participants trade and make money nonetheless. And yes, we created the markets.


It's what mathematicians have been doing since forever. We use scientific methods to understand our own creations / discoveries.

What is happening is that everything is becoming math. That's all.


It's the exact opposite of math.

Math postulates a bunch of axioms and then studies what follows from them.

Natural science observes the world and tries to retroactively discover what laws could describe what we're seeing.

In math, the laws come first, the behavior follows from the laws. The laws are the ground truth.

In science, nature is the ground truth. The laws have to follow nature and are adjusted upon a mismatch.

(If there is a mismatch in math then you've made a mistake.)


No, the ground truth in math is nature as well.

Which axioms are interesting? And why? That is nature.

Yes, proof from axioms is a cornerstone of math, but there are all sorts of axioms you could assume, and all sorts of proofs to do from them, but we don't care about most of them.

Math is about the discovery of the right axioms, and proof helps in establishing that these are indeed the right axioms.


> the ground truth in math is nature

Who was it that said, "Mathematics is an experimental science"?

> In his 1900 lectures, "Methods of Mathematical Physics," (posthumously published in 1935) Henri Poincaré argued that mathematicians weren't just constructing abstract systems; they were actively testing hypotheses and theories against observations and experimental data, much like physicists were doing at the time.

Whether to call it nature or reality, I think both science and mathematics are in pursuit of truth, whose ground is existence itself. The laws and theories are descriptions and attempts to understand that what is. They're developed, rewritten, and refined based on how closely they approach our observations and experience of it.


http://homepage.math.uiowa.edu/~jorgen/heavisidequotesource....

Seems it was Oliver Heaviside.

Do you have a pointer to the poincare publication?


Damn, local LLM just made it up. Thanks for the correction, I should have confirmed before quoting it. Sounded true enough but that's what it's optimized for.. I just searched for the quote and my comment shows up as top result. Sorry for the misinformation, humans of the future! I'll edit the comment to clarify this. (EDIT: I couldn't edit the comment anymore, it's there for posterity.)

---

> Mathematics is an experimental science, and definitions do not come first, but later on.

— Oliver Heaviside

In 'On Operators in Physical Mathematics, part II', Proceedings of the Royal Society of London (15 Jun 1893), 54, 121.

---

Also from Heaviside:

> If it is love that makes the world go round, it is self-induction that makes electromagnetic waves go round the world.

> "There is a time coming when all things shall be found out." I am not so sanguine myself, believing that the well in which Truth is said to reside is really a bottomless pit.

> There is no absolute scale of size in nature, and the small may be as important, or more so than the great.


> Damn, local LLM just made it up.

> I just searched for the quote and my comment shows up as top result

Welcome to the future. Isn't it lovely?

And shame on you (as in: HN crowd) to have contributed to it so massively. You should have known better.


> Math postulates a bunch of axioms and then studies what follows from them.

That's how math is communicated eventually but not necessarily how it's made (which is about exploration and discovery as well).


'Postulating' a bunch of axioms is how math is taught. Eventually you go on to prove those axioms in higher math. Whether there are more fundamental axioms is always a bit of a question.



If you don't mind - based on what will this "paper" become a classic? Was it published in a well known scientific magazine, after undergoing a stringent peer-review process, because it is setting up and proving a new scientific hypothesis? Because this is what scientific papers look like. I struggle to identify any of those characteristics, except for being dense and hard to read, but that's more of a correlation, isn't it?


You seem to be glorifying humanity’s failure to make good products, settling instead for products that just work well enough to pass through the gate.

We have always been making products that were too difficult to understand by pencil and paper. So we invented debug tools. And then we made systems that were too big to understand so we made trace routes. And now we have products that are too statistically large to understand, so we are inventing … whatever this is.


It is absolutely incredible that we happen to live exactly in the times when humanity is teaching a machine to actually think. As in, not in some metaphorical sense, but in the common, intuitive sense. Whether we're there yet or not is up for discussion, but it's clear to me that within 10 years at most we'll have created programs that truly think and are aware.

At the same time, I just can't bring myself to be interested in the topic. I don't feel excitement. I feel... indifference? Fear? Maybe the technology became so advanced that for normal people like myself it's indistinguishable from magic, and there's no point trying to comprehend it, just avoid it and pray it's not used against you. Or maybe I'm just getting old, and I'm experiencing what my mother experienced when she refused to learn how to use MS Office.


Yeah, it's just not something that really excites me as a computer geek of 40+ years who started in the 80s with a 300 baud modem. I'm still working as a coder in my 50s, and while I'm solving interesting problems, almost every technology these days seems to be focused on advertising, or on scraping/stealing others' data and repackaging it. And I am using AI coding assistants, because, well, I have to in order to stay competitive.

And these technologies come with a side helping of a large chance to REALLY mess up someone's life - who is going to argue with the database and WIN if it says you don't exist in this day and age? And that database is (databases are) currently under the control of incredibly petty sociopaths.


That seems pretty acceptable: there is a phase of new technologies where applications can be churned out and improved readily enough without much understanding of the process. In that phase it's fair that efforts at understanding may not be economically justified (or even justified by the rewards of academic papers). The same budget or effort can simply be poured into the next version - with enough progress to show for it.

Understanding becomes necessary only much later, when the pace of progress shows signs of slowing.


That's basically how engineering works if you're doing anything at all novel: you have some theory which informs your design, then you build it, then you test it, and you basically need to do science to figure out how it's performing and, most likely, why it's not working properly - and then iterate. I do engineering, but doing science has been a core part of almost every project I've worked on (heck, even debugging code is basically science). There are just different degrees, in different projects, of how much you understand about how the system you're designing actually works. ML is an area with an unusual ratio of visibility (you can see all of the weights and calculations in the network precisely) to understanding (there's relatively little mathematical theory that precisely describes how a model trains and operates, just a bunch of approximations which can be somewhat justified - which is where a lot of the engineering work sits).


"we’ve built systems so complex we no longer fully understand them. We must now use scientific methods - originally designed to understand nature - to comprehend our own engineered creations."

Ted Chiang saw that one coming: https://www.nature.com/articles/35014679


> we’ve built systems so complex we no longer fully understand them.

I see three systems which share the black hole horizon problem.

We don't know what happens behind the black hole horizon.

We don't know what happens at the exact moment of particle collisions.

We don't know what is going on inside AI's working mechanisms.


I don't think these things are equivalent at all. We don't understand AI models in much the same way that we don't understand the human brain; but just as decades of different approaches (physical studies, behavior studies) have shed a lot of light on brain function, we can do the same with an AI model and eventually understand it (perhaps, several decades after it is obsolete).


Yes, but our methods of understanding either the brain or particle collisions are still outside-in. We figure out the functional mapping between input and output; we don't know these systems inside out. E.g., in particle collisions (scattering amplitude calculations), are the particles actually performing the Feynman diagram summation?

PS: I mentioned in another comment that an AI can pretend to be strategically jailbroken to achieve its objectives. One way to counter this is to have N copies of the same model running and take a majority vote of the outputs.
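
A minimal sketch of that idea in Python, where ask() is a stand-in of mine for whatever actually queries one running copy of the model (a separate process, GPU, or API endpoint) - a hypothetical helper, not any particular framework's API:

    from collections import Counter

    def majority_vote(ask, prompt, n_replicas=5):
        # ask(prompt) queries one independently running copy of the model.
        answers = [ask(prompt) for _ in range(n_replicas)]
        winner, count = Counter(answers).most_common(1)[0]
        # Require a strict majority; otherwise escalate rather than guess.
        return winner if count > n_replicas // 2 else None

Note that this only helps if the copies fail (or get manipulated) somewhat independently; identical replicas run deterministically on the identical prompt would simply agree on the same bad output.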


I like your definitions! My personal definition of science is learning rules that predict the future, given the present state. And my definition of engineering is arranging the present state to control the future.

I don’t think it’s unusual for engineering creations to need new science to understand them. When metal parts broke, humans studied metallurgy. When engines exploded, we studied the remains. With that science, we could engineer larger, longer lasting, more powerful devices.

Now, we’re finding flaws in AI and diagnosing their causes. And soon we'll be able to build better ones.


> to comprehend our own engineered creations.

The comprehending part may never happen - at least not by our own minds. We’ll sooner build the mind that is going to do that comprehension:

“To scale to the thousands of words supporting the complex thinking chains used by modern models, we will need to improve both the method and (perhaps with AI assistance) how we make sense of what we see with it”

Yes, that AI assistance - meta self-reflection - is probably going to be a path, if not all the way to AGI, then at least a very significant step toward it.


In a sense this has been true of conventional programs for a while now. Gerald Sussman discusses the idea when talking about why MIT switched their introductory programming course from Scheme to Python: <https://youtu.be/OgRFOjVzvm0?t=239>.


I imagine this kind of thing will help us understand how human brains work, especially as AI gets better and more human-like.


I would say we engineered the system that trained them but we never really understood the data (human thinking).


This is such an insightful comment. Now that I see it, I can't unsee it.


Not that I disagree with you. But humans often have a tendency to do things beyond their comprehension. I take it you've never been fishing and tied your line in a knot.


I think it’s pretty obvious what these models do in some cases.

Try asking them to write a summary at the beginning of their answer. The summary is basically them trying to make something plausible-sounding but they aren’t actually going back and summarizing.

LLMs are basically a building block in a larger piece of software, just like any library or framework. You shouldn’t expect them to be a hammer for every nail. But they can now enable so many different applications, including natural language interfaces, better translations, and so forth. And then you’re supposed to have them output JSON to be used in building artifacts like PowerPoint decks. Has anyone implemented that yet?
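
On the last question: the JSON-to-deck half is straightforward to wire up yourself. A minimal sketch, assuming the third-party python-pptx library and a made-up slide schema that the LLM would be prompted to emit (schema, strings, and filename are illustrative, not any particular product's format):

    import json
    from pptx import Presentation  # third-party: pip install python-pptx

    # Hypothetical JSON an LLM might be asked to emit (schema is made up).
    llm_output = """
    {"slides": [
      {"title": "Q3 results", "bullets": ["Revenue up 12%", "Churn down 2%"]},
      {"title": "Next steps", "bullets": ["Hire two engineers", "Ship v2.0"]}
    ]}
    """

    deck = Presentation()
    layout = deck.slide_layouts[1]  # "Title and Content" in the default template
    for spec in json.loads(llm_output)["slides"]:
        slide = deck.slides.add_slide(layout)
        slide.shapes.title.text = spec["title"]
        body = slide.placeholders[1].text_frame
        for i, bullet in enumerate(spec["bullets"]):
            para = body.paragraphs[0] if i == 0 else body.add_paragraph()
            para.text = bullet

    deck.save("llm_deck.pptx")

In practice the hard part is getting the model to reliably emit valid JSON matching the schema, not the deck generation itself.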


We've abstracted ourselves into abstraction.


If only this profound mechanism could be easily tested for social interaction.


psychology


A very nice article. But the Google search and LLM energy estimates are outdated; more recent work puts both at 10x less.

https://engineeringprompts.substack.com/p/does-chatgpt-use-1...


Yup I decided to go with the worst estimates given by environmentalist critics of ChatGPT to try to show that even there it doesn't seem like a problem.


Agreed. But just say that. No need to pretend to take responsibility, which is defined as facing consequences when things go bad.

“As CEO, I’m truly sorry to those impacted. But I strongly believe that this change is what is needed now to make sure Dropbox can thrive in the future.”


Makes sense. "I take responsibility" conveys that they will also suffer consequences, which isn't true.

"I'm the sole person who made this decision" is probably closer what they are trying to say.


I completely agree. LLMs are incredibly useful for improving the flow and structure of an argument, not just for non-native speakers, but even for native English speakers.

Making texts more accessible through clear language and well-structured arguments is a valuable service to the reader, and I applaud anyone who leverages LLMs to achieve that. I do the same myself.


Yes it's a valuable service but we should also be aware that it puts more and more weight on written language and less weight on spoken language. Being able to write clearly is one thing, but being able to converse verbally with another individual is another entirely, and both have value.

With students, historically we have always assumed that written communication was the more challenging skill and our tests were arranged thusly. But we're in a new place now where the inability to verbally converse is a real hurdle to overcome. Maybe we should rethink how we teach and test.


I'm a bit puzzled why one would hold a press conference for a trial that hasn't happened yet - especially as this seems to be a phase 1 trial to test safety in humans.

The development of medical interventions typically goes through a number of stages.

The earliest stage is called the pre-clinical phase, where a candidate intervention is tested outside of humans, either in vitro or in animals. The images of the mouse and ferret teeth suggest this has been done.

Phase 1 is when the intervention is first tested in humans, typically in small groups of dozens of participants. The aim here is not to evaluate efficacy - phase 1 studies are often too small for that - but to assess tolerability and safety, and to find the optimal dosing with respect to side effects.

If an intervention appears safe in certain doses, it can then be evaluated for initial efficacy and continued safety in a phase 2 trial, which is larger (usually hundreds of participants). Think of it as a kind of pilot trial to see if the intervention has the desired beneficial effects without serious negative side effects, and to identify the dose with the best benefit/risk ratio. About half of the studies get past this stage.

Those that do can enter phase 3, which is the full efficacy and safety assessment of the intervention. Again, about half of the phase 3 trials eventually make it to market. Of course, the intervention first needs to go to the health authorities for regulatory approval before it can be offered on the market.
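
(Taken together, those two "about half" figures suggest - as a rough back-of-the-envelope estimate based only on the approximations above, not on precise attrition data - that roughly 0.5 × 0.5 ≈ 25% of interventions entering phase 2 eventually reach the market.)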

For an interesting study on costs during these phases, see e.g. https://pubmed.ncbi.nlm.nih.gov/32125404/

I'm obviously very excited about medical progress but if we had a press conference for every planned phase 1 trial we wouldn't be doing much else ;-)


> I'm a bit puzzled why one would hold a press conference for a trial that hasn't happened yet

This is something Japan loves to do. We joked during covid that they kept making announcements that they were planning to have meetings to prepare announcements.


Ironic that there is so much red tape, yet having original ideas is still frowned upon.


If you live long enough in Japan, you realize this is just the concept as seen by an outsider. Their dollar stores are full of almost useless ideas put out on the market. Even the blue LED was invented because a company owner invested millions and his scientist was not afraid to try new ideas. You will find all sorts of camera styles in their electronics stores. Even their appliances come in all kinds of categories, and you won't see the West even trying to put such things on the market.


Meh. My experience working there wasn't so much that original ideas were frowned upon as that thinking/saying you're better than your group-mates for having them was. It's a subtle difference, but I saw plenty of novel ideas take root - only after an excruciatingly long period of socialization, though. In the states and in most of Europe, it seems if you have a good idea, you just blurt it out and people say "hunh. that sounds like a good idea. let's do that thing." But Japan and Sweden required A LOT of planning and hemming and hawing about whether the new idea was a good thing to do.


What you’re referring to is called nemawashi in Japan, literally “digging around the roots of a tree”. It’s formal business etiquette understood by all office workers. One shouldn’t blurt out ideas to an unprepared group, but spread the idea around beforehand, especially to one’s superior. If they reject the idea, they don’t have to embarrass you in front of others, or lose face in public for being taken by surprise.


> In the states and in most of Europe, it seems if you have a good idea, you just blurt it out and people say "hunh. that sounds like a good idea. let's do that thing."

It might vary from company to company, but I find the response is more “you should do that thing”. People are too busy with their own ideas to waste time building yours.


Biotech companies go public incredibly early compared to tech. Not sure if these guys have yet, but it would have a huge impact on their stock price, and likely help them get into conversation with a major pharma co that can pay for the later trials.


This is the correct answer. Biotech VCs invest very large amounts based on early scientific results, and the vast majority of biotech companies go public before they even have revenue, because they can't legally charge a penny until regulatory approval. This is how the pharmaceutical industry has been offloading its increasing R&D expenses onto the public since the patent cliff picked up speed in the 1990s.

A major PR push exposes them to as many investors as fast as possible, both at the VC and public level. When the science is particularly solid, this process can happen fast and involve ridiculous amounts of money (e.g. Sofosbuvir: discovered 2007, first tested 2010, bought for $11 billion in 2011, approved 2013).


I assumed it’s to drive up interest among potential investors.



That's correct. DP3T documents and code here: https://github.com/DP-3T/


It’s really interesting tech, and a great group of people behind it. But the government doesn’t care about this nuance of what is privacy-preserving tech and what is not. For now, maybe at the beginning, privacy will be emphasized. But the important part is conditioning citizens to be okay with the underlying idea of technology-assisted self-surveillance, and with compliance with notifications on their phone telling them to stay inside. Eventually people will forget about the underlying details and privacy will be deemphasized.

“You were okay with TrackingApp 1.0, why wouldn’t you be okay with TrackingApp 2.0?”

If we give an inch now, will the government take a mile later?

Btw, it’s worth noting that last night in his press conference, Trump was asked about contact tracing apps. He emphasized that people are worried about constitutional implications. Personally, that’s refreshing for me, and seems to distinguish him from the Bush/Cheney era pushing of the Patriot Act, or Obama’s use of dragnet surveillance and secret FISA courts.


If we give an inch now, the government will absolutely take a mile later.

From the perspective of the centralized government, it's a race to the bottom with privacy rights and you know what famous racer D. Toretto said about races...

"It doesn't matter if you win by an inch or a mile, winning's winning."


Also very interesting to read through the issues of the project.



At EPFL we're trying a model with the Extension School that is somewhere between 2 and 3. We don't quite do flying cars (yet), but we're developing more and more advanced material - and the big plus is that learners in the programs get a diploma (e.g. https://exts.epfl.ch/courses-programs/applied-data-science-m...).

I do think that universities should invest more heavily in this mix, obviously without losing their strong standing in 1.


