It occurs to me that using LLMs for compression could, in principle, allow lossy compression of text. If a sequence of tokens happens to be costly to encode (in terms of bits), the compressor might be able to replace that sequence with a cheaper sequence that has a very similar meaning within the context. I don't imagine it would be very useful for anything but it's interesting to think about.
It occurs to me that using LLMs for compression could, in principle, allow lossy compression of text. If a sequence of tokens happens to be costly to encode (in terms of bits), the compressor might be able to replace that sequence with a cheaper sequence that has a very similar meaning within the context. I don't imagine it would be very useful for anything but it's interesting to think about.