Table Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15
Table used for debugging common ISO-8859-1 character encoding problems
utf8  windows-1252  cp1252  iso-8859-1 
yesterday by tekai
BurntSushi/bstr: A string type for Rust that is not required to be valid UTF-8.
A string type for Rust that is not required to be valid UTF-8. - BurntSushi/bstr
rust  utf8 
16 days ago by geetarista
Monodraw for macOS — Helftone
Powerful ASCII art editor designed for the Mac.
diagram  macos  ascii  utf8  flow  chart 
17 days ago by pyrho
The Tragedy of UCS-2
Apropos of nothing, I'd like to tell you a tale. It's not an original tale, but it's one of my favorites.
history  unicode  technical  ucs2  utf8  utf-8  sun  microsoft  next  java  javascript  legacy 
19 days ago by xer0x
python - UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128) - Stack Overflow | https://stackoverflow.com/
So, even if your format string is Unicode, you *still* need to encode it?


<code class="language-python">
print(u"{}\u00a0{}\u00a0{}\u00a0".format('non', 'breaking', 'spaces').encode('utf-8'))
</code>
This is a classic Python Unicode pain point! Consider the following:
a = u'bats\u00E0'
print a
=> batsà

All good so far, but if we call str(a), let's see what happens:
str(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 4: ordinal not in range(128)

Oh dip, that's not gonna do anyone any good! To fix the error, encode the string explicitly with .encode and tell Python which codec to use:
a.encode('utf-8')
=> 'bats\xc3\xa0'
print a.encode('utf-8')
=> batsà
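
The snippets above are Python 2. As a rough sketch of the same idea in Python 3 (an assumption on my part, since the bookmarked answer targets Python 2): str is already Unicode there, so the implicit ASCII step goes away, but naming the codec explicitly works the same way. The variable names below are illustrative only.

```python
# Python 3 sketch: str is already Unicode text, bytes is a separate type.
a = 'bats\u00e0'

# Encoding turns text into bytes; the codec is named explicitly.
utf8_bytes = a.encode('utf-8')

# Asking for ASCII reproduces the error shown in the traceback.
try:
    a.encode('ascii')
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Decoding with the same codec round-trips back to the original text.
round_trip = utf8_bytes.decode('utf-8')

print(repr(utf8_bytes))
print(ascii_error)
print(round_trip == a)
```

The point is the same in both versions: pick the codec yourself at the boundary where text becomes bytes, rather than letting a default choose for you.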
python  unicode  stringconcatenation  encoding  utf8  solution  reference 
29 days ago by kme
Truths programmers should know about case
"But I hinted, briefly, at the deeper complexity of case in Unicode, and I want to take some time to talk about that in more detail, because it’s interesting and because understanding it can help you make better choices when designing and writing code that processes text. So here, in opposition to “falsehoods programmers believe”, is my inaugural “truths programmers should know”, on the topic of case."
programming  unicode  utf8  i18n  l10n  language 
8 weeks ago by mechazoidal

