So, what ‘algorithms’ are you talking about? The randomness comes from the input value (the random seed). Once you provide a seed, a pseudorandom number generator (PRNG) deterministically expands it into a sequence of values. When the LLM needs to ‘flip a coin,’ it simply consumes the next value from the PRNG’s output sequence.
Think of each new ‘interaction’ with the LLM as having two pieces of state that can change: the context and the PRNG state. The PRNG state itself has two parts: the random seed (which determines the output sequence) and the index of the last value consumed from that sequence. If the context, seed, and index are all the same, the LLM will always give the same answer. To be clear, the only ‘randomness’ in any of this state comes from the random seed itself.
The LLM doesn’t make any randomness; it takes randomness as an input (hyper)parameter.
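The point above can be sketched in a few lines. This is a toy illustration, not any real inference stack: the fixed probability array stands in for the model's output, and NumPy's `default_rng` stands in for the PRNG whose state (seed + position in the stream) determines every ‘coin flip’.

```python
import numpy as np

def sample_next_token(probs, rng):
    """Consume one value from the PRNG stream to pick a token index."""
    return int(rng.choice(len(probs), p=probs))

# Toy distribution standing in for the model's output over a 4-token vocab.
probs = np.array([0.1, 0.4, 0.3, 0.2])

# Same seed -> same PRNG output sequence -> same 'random' choices.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
tokens_a = [sample_next_token(probs, rng_a) for _ in range(10)]
tokens_b = [sample_next_token(probs, rng_b) for _ in range(10)]
assert tokens_a == tokens_b  # same context + seed + index => same output
```

The sampled sequence looks random, but it is a pure function of the seed and of how many values have already been consumed.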
The raw output of a transformer model is a list of logits, one confidence score per token in its vocabulary. It's deterministic only in that sense (same input = same scores). But it can easily assign equal scores to the tokens ‘1’ and ‘0’ and zero to every other token, and then you have to sample randomly to produce a result. Whether you consider that sampling external or internal doesn't matter; transformers are inherently probabilistic by design. Probability distributions are all they produce. And they typically aren't trained with temperature 0 and greedy sampling in mind.
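The tie scenario described here can be made concrete. The vocabulary and logit values below are invented for illustration: the model puts (essentially) all its probability mass, split evenly, on the tokens ‘0’ and ‘1’, so greedy decoding resolves the tie by an arbitrary fixed rule while honoring the 50/50 distribution requires sampling.

```python
import numpy as np

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical 4-token vocab: ["0", "1", "yes", "no"]. Equal logits for
# "0" and "1", effectively zero probability for the rest.
logits = np.array([10.0, 10.0, -30.0, -30.0])
probs = softmax(logits)  # ~[0.5, 0.5, 0.0, 0.0]

# Greedy decoding (temperature 0): argmax breaks the tie by always
# returning the first maximum, which ignores the 50/50 split entirely.
greedy = int(np.argmax(probs))  # always index 0

# To actually honor the distribution you must sample, which needs
# an external noise source:
rng = np.random.default_rng()
sampled = int(rng.choice(len(probs), p=probs))  # index 0 or 1, at random
```

Note that `greedy` is perfectly repeatable, but only because the tie-breaking rule is arbitrary, not because the model ‘chose’ an answer.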
> But it can easily assign equal scores to the tokens ‘1’ and ‘0’ and zero to every other token, and then you have to sample randomly to produce a result. Whether you consider that sampling external or internal doesn't matter; transformers are inherently probabilistic by design.
The transformer is operating on those probability distributions in a fully deterministic fashion; you might be missing the forest for the trees here. In your hypothetical, the transformer has no non-deterministic way of selecting the ‘1’ or ‘0’ token, so it must rely on an external noise source that does. It does not produce any randomness at all.
That's one way to look at it, but consider that you necessarily need the noise source when ‘1’ and ‘0’ are exactly tied. You can't tell which one is the answer until you decide randomly.
Right, so the LLM needs some randomness to make that decision. It performs a series of deterministic operations up until the point where it needs randomness to make the decision; there is no randomness within the LLM itself.
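The separation both sides are circling can be sketched directly: a pure, deterministic model function, with all randomness injected from outside via a PRNG. Everything here is a toy stand-in (the ‘model’ is just a hash of the context), chosen only to make the determinism visible.

```python
import numpy as np

def model_forward(context):
    """Stand-in for the transformer: a pure, deterministic function
    from context to a probability distribution over a 2-token vocab."""
    h = sum(map(ord, context)) % 100 / 100.0  # deterministic pseudo-logit
    return np.array([h, 1.0 - h])

def decode(context, rng):
    """All randomness enters here, via the PRNG passed in."""
    probs = model_forward(context)
    return int(rng.choice(2, p=probs))

# The model itself never varies: same context, same distribution.
assert np.array_equal(model_forward("hello"), model_forward("hello"))

# Outputs vary run to run only via the externally supplied noise source,
# so fixing its seed fixes the decision too.
out_a = decode("hello", np.random.default_rng(0))
out_b = decode("hello", np.random.default_rng(0))
assert out_a == out_b  # same seed => same decision
```

Swapping `default_rng(0)` for an unseeded generator is the only change needed to make `decode` appear non-deterministic; `model_forward` never changes.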