Charset experiments
/!\ CPU intensive /!\
Comparing non-ASCII characters in legacy charsets
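
One way such a comparison could be sketched, assuming Python's built-in codecs as a stand-in for the legacy charsets (the charsets actually compared on the page are not stated here; "latin-1" and "cp1252" are only illustrative choices, and decode_byte / differing_bytes are hypothetical helper names):

```python
def decode_byte(value, charset):
    """Decode a single byte; return None if the charset leaves it undefined."""
    try:
        return bytes([value]).decode(charset)
    except UnicodeDecodeError:
        return None


def differing_bytes(charset_a, charset_b):
    """List the non-ASCII byte values (0x80-0xFF) that the two charsets decode differently."""
    return [b for b in range(0x80, 0x100)
            if decode_byte(b, charset_a) != decode_byte(b, charset_b)]


# Illustrative pair only: ISO-8859-1 versus Windows-1252.
diff = differing_bytes("latin-1", "cp1252")
for b in diff[:4]:
    print(f"0x{b:02X}: {decode_byte(b, 'latin-1')!r} vs {decode_byte(b, 'cp1252')!r}")
```

With this pair, the two charsets disagree only on bytes 0x80-0x9F, where ISO-8859-1 has C1 control characters and Windows-1252 has printable characters or undefined slots.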
Unicode: all code points (0x0 - 0x10FFFF)
- 00000-0FFFF, 10000-1FFFF,
20000-2FFFF, 30000-3FFFF,
40000-4FFFF, 50000-5FFFF,
60000-6FFFF, 70000-7FFFF,
80000-8FFFF, 90000-9FFFF,
A0000-AFFFF, B0000-BFFFF,
C0000-CFFFF, D0000-DFFFF,
E0000-EFFFF, F0000-FFFFF,
100000-10FFFF
- all (8MB)
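
The 17 ranges listed above are the Unicode planes, each covering 0x10000 code points. A minimal sketch of how the range labels could be generated (plane_ranges is a hypothetical helper, not something taken from the page itself):

```python
def plane_ranges():
    """Return (first, last) code point for each of the 17 Unicode planes."""
    return [(plane << 16, (plane << 16) | 0xFFFF) for plane in range(17)]


ranges = plane_ranges()

# Format each range with at least 5 hex digits, matching the labels in the list.
labels = [f"{first:05X}-{last:05X}" for first, last in ranges]
total = sum(last - first + 1 for first, last in ranges)

print(labels[0], labels[-1], total)
```

Note that the full range includes the surrogate code points D800-DFFF, which cannot be encoded on their own in UTF-8.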
Unicode 6.3: all assigned code points
How many bytes are used by each assigned Unicode character in each encoding
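
As a rough illustration of that measurement, here is a sketch using Python's built-in codecs. The three encodings below are an assumption; the page may compare a different set. encoded_lengths is a hypothetical helper name.

```python
def encoded_lengths(char, encodings=("utf-8", "utf-16-le", "utf-32-le")):
    """Map each encoding name to the number of bytes it uses for one character.

    The -le variants are used so that no byte order mark is counted.
    """
    return {enc: len(char.encode(enc)) for enc in encodings}


for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

Characters outside the Basic Multilingual Plane take 4 bytes in all three encodings, since UTF-16 represents them as a surrogate pair.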
Unicode 8 glyph names
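
Names like these can be looked up with Python's unicodedata module. This is only a sketch: the module reflects whichever Unicode version ships with the interpreter, which is not necessarily Unicode 8, and char_name is a hypothetical helper.

```python
import unicodedata


def char_name(code_point):
    """Return the Unicode name of a code point, or None if it has no name."""
    return unicodedata.name(chr(code_point), None)


print(unicodedata.unidata_version)  # Unicode version bundled with this Python
for cp in (0x41, 0x20AC, 0x0000):
    print(f"U+{cp:04X}", char_name(cp))
```

Control characters such as U+0000 have no formal name, so the lookup returns None for them.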
kDefinition values of all Han code points in Unicode 8