
>To me, what your example really shows is the problem with incorrect default values, not a problem with encoding data into a key per se. If they'd chosen a non-date for unknown values, maybe 00 or 99 for day or month components, then the issue you described would disappear.

You still have that problem from organic birthdays and also the problem of needing to change ids to correct birth dates.





To add to that, birthdays can clump, just like any seemingly "random" data.

Not significantly. For actual births, a couple of holidays have very low rates, but no dates clump into much higher rates.

A million dots scattered randomly over a graph can all land on the exact same coordinate if it’s truly random.

What most people intuit as random is some sort of noise function that is generally dispersed and doesn’t trigger the pattern matching part of their brain


> A million dots scattered randomly over a graph can all land on the exact same coordinate if it’s truly random.

It won't happen though. 0.00000000% chance it happens even once in a trillion attempts.

> What most people intuit as random is some sort of noise function that is generally dispersed and doesn’t trigger the pattern matching part of their brain

Yes, people intuit the texture of random wrong in a situation where most buckets are empty. But when you have orders of magnitude more events than buckets, that effect doesn't apply. You get pretty even results that people expect.
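A rough sketch of that last point, assuming events land independently and uniformly across 365 days (real birth data is not exactly uniform, and the event counts here are made up for illustration). The helper name `relative_spread` is hypothetical; it just compares the binomial standard deviation of one day's count to its mean.

```python
import math

def relative_spread(events: int, buckets: int) -> float:
    """Standard deviation of one bucket's count divided by its mean,
    for events dropped independently and uniformly into buckets."""
    p = 1.0 / buckets
    mean = events * p
    std = math.sqrt(events * p * (1.0 - p))  # binomial standard deviation
    return std / mean

# About one event per bucket: counts vary a lot relative to the average.
sparse = relative_spread(events=365, buckets=365)
# Tens of thousands of events per bucket: counts are nearly flat.
dense = relative_spread(events=10_000_000, buckets=365)

print(f"365 events over 365 days:        {sparse:.1%}")
print(f"10,000,000 events over 365 days: {dense:.2%}")
```

With few events per day the typical deviation is about as large as the average itself, which is the "texture" people misjudge. With many events per day it shrinks to a small fraction of the average.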


> It won't happen though. 0.00000000% chance it happens even once in a trillion attempts.

It has the same odds as any other specific configuration of randomly assigned dots. The overly active human pattern matching behavior is the only reason it would be treated as special.


>It has the same odds as any other specific configuration of randomly assigned dots

Which doesn't change anything in practice. That it has "the same odds as any other specific configuration" ignores the fact that the more scattered configurations, taken all together, are still far more numerous than it (or than the configurations with more visual order in general).

>The overly active human pattern matching behavior is the only reason it would be treated as special.

Nope, it's also the fact that it is ONE configuration, whereas all the rest are a much, much larger number. That's enough to make this specific configuration ultra rare in comparison (since we don't compare it to each other one but to all the others put together).


> >It has the same odds as any other specific configuration of randomly assigned dots

> Nope, it's also the fact that it is ONE configuration, whereas all the rest are a much, much larger number.

That is overactive human pattern matching at play. I compared the single configuration of all dots on one location to any other specific configuration. You are not comparing to _every other configuration_ because they are not the same.

You are assigning specific importance to a single valid set of randomly selected data, because it seems significant to our brains.

If I asked you to give me an array of 1 million items containing an x, and y coordinate, what are the odds that any single specific set of items are returned?

Based on your answer to that, what are the odds for a set being returned with all the same exact x and y coordinates, and a set with different x and y coordinates?

If you answer anything other than it being the same chance, then you either don't think the selection mechanism is random, or you are falling for the standard fallacies around randomness.


>That is overactive human pattern matching at play. I compared the single configuration of all dots on one location to any other specific configuration. You are not comparing to _every other configuration_ because they are not the same. You are assigning specific importance to a single valid set of randomly selected data, because it seems significant to our brains.

That's just how importance works.

It sets some things aside as "significant to our brains". The universe doesn't care; even total heat death is not "important" if one excludes us making a prioritization of things.

Given our classification of orderly configurations as a distinct set, the comparison is between "any from all random-looking/noise-like configurations" vs "any from all orderly-like configurations". And the former are much more numerous.

>if you answer anything other than it being the same chance, then you either don't think the selection mechanism is random, or you are falling to the standard fallacies around randomness

You're confusing the selection mechanism (random) with the classification mechanism that segments the set of possible outcomes into orderly vs not (not random).

As a simpler example, imagine a bag with N lottery numbers on individual cards. If they pick one at random, the chance any number has is 1/N. But the chance that a number OTHER than ours has is (N-1)/N. Our chances are as good as any other single number's, sure. But they're NOT as good as all other numbers put together.
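To put numbers on the bag example, here is a minimal sketch using exact fractions. The pool size of one million and the helper name `win_chances` are assumptions picked for illustration, not anything from a real lottery.

```python
from fractions import Fraction

def win_chances(n_tickets: int) -> tuple[Fraction, Fraction]:
    """Return (chance our number is drawn, chance some other number is drawn)
    for one uniform draw from n_tickets distinct numbers."""
    ours = Fraction(1, n_tickets)
    someone_else = Fraction(n_tickets - 1, n_tickets)
    return ours, someone_else

ours, others = win_chances(1_000_000)  # hypothetical pool size

print("our number:      ", ours)
print("any other number:", others)
print("ratio:           ", others / ours)
```

Each single card has the same chance as ours, yet the group "not ours" is favored by a factor of N-1 simply because it contains N-1 cards.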

You're arguing that "but all are just sets of coordinates" or "all are just lottery numbers".

Sure, but some of those coordinate sets have importance to us, and others don't. And one of these lottery numbers is important to us, all the others aren't. And since the latter is a much larger group, the probability of a member of it coming up is too.

That we consider one subset of results more special than the other is not negotiable. It's a thing we actually do in the real world, and it's the premise of the whole discussion.


Lol, reminds me of a story: at his workplace my brother was invited to join a lottery ticket pool where each got to pick the numbers for a ticket. The numbers he picked were 1-2-3-4-5-6. Although the others, mostly fellow engineers, reluctantly agreed his numbers were as likely as the others, after a couple of weeks they neglected to invite him again.

Entropy says it's special. If you have a million dots and 10,000 coordinates, you have 10,000 ways for all the dots to land in the same coordinate, and a zillion kavillion stupillion ways to have somewhere near 100 dots in each coordinate.
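For a sense of scale, the counts can be sketched on a log scale (the raw numbers are far too large to print). This assumes the dots are distinguishable, and it counts only the perfectly even layout with exactly 100 dots per coordinate, which is a lower bound on "somewhere near 100". The helper `log10_factorial` is a hypothetical convenience built on `math.lgamma`.

```python
import math

DOTS = 1_000_000
CELLS = 10_000

def log10_factorial(n: int) -> float:
    """log10(n!) computed via the log-gamma function to avoid huge integers."""
    return math.lgamma(n + 1) / math.log(10)

# Ways to put every dot in the same cell: one way per cell.
log10_all_in_one = math.log10(CELLS)

# Ways to put exactly DOTS / CELLS dots in every cell (multinomial coefficient).
per_cell = DOTS // CELLS
log10_even = log10_factorial(DOTS) - CELLS * log10_factorial(per_cell)

# Total number of ways to assign each dot to a cell.
log10_total = DOTS * math.log10(CELLS)

print(f"all in one cell:   10^{log10_all_in_one:.0f} ways")
print(f"exactly even:      10^{log10_even:,.0f} ways")
print(f"all assignments:   10^{log10_total:,.0f} ways")
```

Every individual assignment is equally likely, so the probability of a category is just its count divided by the total, and the exponents show how lopsided that comparison is.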

No, if it's randomly distributed then every specific configuration has the same exact chance of happening.

I am laughing at all the people coming out of the woodwork to reply to my original post in this thread misunderstanding randomness and chance.

If you flip a coin a million times and it lands on heads every single time, the millionth and 1 time still has a 50/50 chance of landing on heads.


> every specific configuration

Who said anything about specific configurations?

We started this talking about whether things "clump" or not. The result depends on your definition of "clump" but let's say it involves a standard deviation. Different standard deviations have wildly different probabilities, even when every specific configuration has the same probability.

Nobody responding to you is calculating things wrong. We're talking about the shape of the data. Categories. And those categories are different sizes, because they have different numbers of specific configurations in them.
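A toy version of the same point, small enough to enumerate completely: 10 fair coin flips instead of a million dots, with "number of heads" standing in for the shape of the data. Both choices are simplifications made for this sketch.

```python
from collections import Counter
from itertools import product

FLIPS = 10  # small enough to enumerate all 2**10 sequences

sequences = list(product("HT", repeat=FLIPS))
p_each = 1 / len(sequences)  # every specific sequence is equally likely

# Group sequences by a summary of their shape: the number of heads.
category_sizes = Counter(seq.count("H") for seq in sequences)

for heads in sorted(category_sizes):
    size = category_sizes[heads]
    print(f"{heads:2d} heads: {size:4d} sequences, probability {size * p_each:.4f}")
```

The all-heads sequence is exactly as likely as any other single sequence, but the "all heads" category holds one sequence while the "half heads" category holds hundreds.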

> the millionth and 1 time

I don't see any connection between the above discussion and the gambler's fallacy?


No, because most likely the coin wasn’t a fair coin then, or there was some other bias going on

I'm talking about true random. If you believe there is a bias, then you don't believe it's a random selection.


