So "the sky is blue" converts to the tokens [1820, 13180, 374, 6437]
And "le ciel est bleu" converts to the tokens [273, 12088, 301, 1826, 12704, 84]
Then the embeddings vectors created from these are very similar, despite the letters having very little in common.
So "the sky is blue" converts to the tokens [1820, 13180, 374, 6437]
And "le ciel est bleu" converts to the tokens [273, 12088, 301, 1826, 12704, 84]
Then the embeddings vectors created from these are very similar, despite the letters having very little in common.