Ever since Windows 95 Long File Names for FAT, filenames have been 16-bit characters in their on-disk format. So passing "bytes" means that they need to become wide characters before the filesystem can act on them. And case-sensitivity is still applied, stupidly enough, using locale-specific rules. (Change your locale, and you change how case-insensitive filenames work!)
It is possible to request for a directory to contain case-sensitive files though, and the filesystem will respect that. And if you use the NT Native API, you have no restrictions on filenames, except for the Backslash character. You can even use filenames that Win32 doesn't allow (name with a ":", name with a null byte, file named "con" etc), and every Win32 program will break badly if it tries to access such a file.
It's also possible to use unpaired surrogate characters (D800-DFFF without the matching second part) in a filename. Now you have a file on the disk whose name can't be represented in UTF-8, but the filename is still sitting happily in the filesystem. So people invented "WTF-8" encoding to allow those characters to be represented.
> And case-sensitivity is still applied, stupidly enough, using locale-specific rules. (Change your locale, and you change how case-insensitive filenames work!)
AFAIK, it's even worse: it uses the rules for the locale which was in use when the filesystem was created (it's stored in the $UpCase table in NTFS, or its equivalent in EXFAT). So you could have different case-insensitive rules in a single system, if it has more than one partition and they were formatted with different locales.
IMO, case-insensitive filesystems are an abomination; the case-insensitivity should have been done in the user interface layer, not in the filesystem layer.
> IMO, case-insensitive filesystems are an abomination; the case-insensitivity should have been done in the user interface layer, not in the filesystem layer.
Implementing case-insensitivity in a file picker or something is OK, but doing that throughout your app's runtime is insane since you'd have to hook every file open and then list the directory, whereas in a file picker you're probably listing the directory anyways.
Did not know about $UpCase, the only part I knew was that the FAT16/32 driver from Microsoft (Which has the source code officially released, it's used as an example for how to implement a filesystem on Windows NT) uses locale-specific case-sensitivity tests.
You're right, in the case of FAT16/FAT32 it AFAIK has to use the current system locale, since unlike EXFAT or NTFS there isn't a place in the filesystem to store that locale table.
It is possible to request for a directory to contain case-sensitive files though, and the filesystem will respect that. And if you use the NT Native API, you have no restrictions on filenames, except for the Backslash character. You can even use filenames that Win32 doesn't allow (name with a ":", name with a null byte, file named "con" etc), and every Win32 program will break badly if it tries to access such a file.
It's also possible to use unpaired surrogate characters (D800-DFFF without the matching second part) in a filename. Now you have a file on the disk whose name can't be represented in UTF-8, but the filename is still sitting happily in the filesystem. So people invented "WTF-8" encoding to allow those characters to be represented.