A Programmer’s Introduction to Unicode – Nathan Reed’s coding blog
Fascinating Unicode details, many of which were new to me. Love the heat map of code point usage in text from Wikipedia and Twitter:
One more interesting way to visualize the codespace is to look at the distribution of usage—in other words, how often each code point is actually used in real-world texts. Below is a heat map of planes 0–2 based on a large sample of text from Wikipedia and Twitter (all languages). Frequency increases from black (never seen) through red and yellow to white.

You can see that the vast majority of this text sample lies in the BMP, with only scattered usage of code points from planes 1–2. The biggest exception is emoji, which show up here as the several bright squares in the bottom row of plane 1.
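
A rough sketch of the kind of tally that sits behind a heat map like this, assuming nothing about how the original was actually produced: count how often each code point occurs in a text sample, then group the counts by plane (the plane is the code point shifted right by 16 bits). The names `plane_of` and `plane_histogram` and the sample string are invented here for illustration.

```python
from collections import Counter


def plane_of(ch: str) -> int:
    """Return the Unicode plane (0-16) of a single character."""
    # Each plane spans 0x10000 code points, so the plane is the high bits.
    return ord(ch) >> 16


def plane_histogram(text: str) -> Counter:
    """Count how many code points in `text` fall into each plane."""
    # Python 3 strings iterate by code point, not by UTF-16 code unit.
    return Counter(plane_of(ch) for ch in text)


# Hypothetical sample: Latin, CJK, and one emoji (U+1F600).
sample = "héllo 世界 \U0001F600"

# Per-code-point frequencies, the raw data a heat map would color by.
code_point_counts = Counter(ord(ch) for ch in sample)

# Per-plane totals: plane 0 is the BMP, plane 1 holds most emoji.
hist = plane_histogram(sample)
for plane in sorted(hist):
    print(f"plane {plane}: {hist[plane]} code points")
```

On a real corpus the per-plane totals would be expected to show the same skew the excerpt describes, with almost everything in plane 0 and emoji accounting for most of plane 1.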
unicode  coding  character-sets  wikipedia  bmp  emoji  twitter  languages  characters  heat-maps  dataviz 
17 days ago by jm
