Unicode is not known for its consistency in dealing with these issues. The original idea behind Unicode was to represent every then-extant character set with perfect fidelity (i.e., go from X to Unicode and back, and you should get the same data). Why are there letters like U+212B Angstrom sign (not to be confused with U+00C5 Latin capital letter A with ring above), or things like half-width and full-width characters? Because they were present in Shift-JIS, not because of any coherent notion of what constitutes a glyph. Han unification was driven more by the need to avoid blowing a space budget than by any actual reasoning about whether the scripts deserved separate spaces.
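To make the round-trip point concrete, here is a minimal Python sketch, assuming the standard library's shift_jis codec and unicodedata module. The helper name round_trips and the sample bytes are my own illustration, not anything from the Unicode standard. It shows that the half-width and full-width katakana "a" decode to two separate code points (so the bytes survive a trip through Unicode), while normalization treats the pairs as equivalent.

```python
import unicodedata


def round_trips(raw: bytes, codec: str = "shift_jis") -> bool:
    """Return True if decoding to Unicode and re-encoding reproduces the bytes."""
    return raw.decode(codec).encode(codec) == raw


# Half-width katakana "a" (0xB1) followed by full-width katakana "a" (0x83 0x41).
legacy = b"\xb1\x83\x41"
text = legacy.decode("shift_jis")

# The two forms stay distinct after decoding: U+FF71 and U+30A2.
codepoints = [f"U+{ord(ch):04X}" for ch in text]

# Compatibility normalization (NFKC) folds the half-width form into the
# full-width one, discarding the distinction the legacy encoding carried.
folded = unicodedata.normalize("NFKC", text)

# U+212B ANGSTROM SIGN is canonically equivalent to U+00C5, so NFC replaces it.
angstrom_nfc = unicodedata.normalize("NFC", "\u212b")

print(round_trips(legacy), codepoints, folded == "\u30a2\u30a2", angstrom_nfc == "\u00c5")
```

In other words, the duplicates exist so that the first check passes; the normalization forms are how Unicode later lets you ignore them when you want to.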

Note that Klingon isn't in Unicode (it was explicitly rejected by the UTC, with a vote of 9 in favor of the rejection proposal, 0 against it, and 1 abstaining). Tengwar and Cirth, though, are actually considered serious proposals for Unicode, just at a really, really low priority compared to, say, Mayan script (for which the first proposal should be going live in 2017). Mayan script is interesting in its own right because, of the scripts I'm aware of, it's the one that most challenges normal conventions on what constitutes letters and glyphs.


