utf8   4570


Table Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15
Table used for debugging common ISO-8859-1 character encoding problems
utf8  windows-1252  cp1252  iso-8859-1 
yesterday by tekai
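As a rough illustration of the kind of difference the table bookmarked above documents, the sketch below decodes the same bytes with Python's built-in `cp1252`, `latin-1`, and `iso-8859-15` codecs. The helper name `decode_all` is made up for this note and is not taken from the linked page.

```python
# Hypothetical helper: decode one byte string under three legacy codecs.
CODECS = ("cp1252", "latin-1", "iso-8859-15")


def decode_all(raw: bytes) -> dict:
    """Return {codec name: decoded text}; undecodable bytes become U+FFFD."""
    return {name: raw.decode(name, errors="replace") for name in CODECS}


# 0x80 is the euro sign in Windows-1252 but a C1 control character
# (U+0080) in both ISO-8859-1 and ISO-8859-15.
euro_1252 = decode_all(b"\x80")

# 0xA4 is the euro sign only in ISO-8859-15; the other two codecs
# map it to the generic currency sign U+00A4.
euro_8859_15 = decode_all(b"\xa4")

# Classic mojibake: the UTF-8 bytes for U+00E9 (e with acute accent)
# misread as Windows-1252 turn into two characters.
mojibake = "\u00e9".encode("utf-8").decode("cp1252")

# ascii() makes the code points visible regardless of terminal encoding.
print(ascii(euro_1252))
print(ascii(euro_8859_15))
print(ascii(mojibake))
```

Seeing a two-character sequence such as U+00C3 U+00A9 where a single accented letter was expected usually suggests UTF-8 data decoded with one of these single-byte codecs.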
BurntSushi/bstr: A string type for Rust that is not required to be valid UTF-8.
A string type for Rust that is not required to be valid UTF-8. - BurntSushi/bstr
rust  utf8 
16 days ago by geetarista
Monodraw for macOS — Helftone
Powerful ASCII art editor designed for the Mac.
diagram  macos  ascii  utf8  flow  chart 
17 days ago by pyrho
The Tragedy of UCS-2
Apropos of nothing, I'd like to tell you a tale. It's not an original tale, but it's one of my favorites.
history  unicode  technical  ucs2  utf8  utf-8  sun  microsoft  next  java  javascript  legacy 
19 days ago by xer0x
python - UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128) - Stack Overflow | https://stackoverflow.com/
So, even if your format string is Unicode, you *still* need to encode it?

    print(u"{}\u00a0{}\u00a0{}\u00a0".format('non', 'breaking', 'spaces').encode('utf-8'))

This is a classic python unicode pain point! Consider the following:

    a = u'bats\u00E0'
    print a
    => batsà

All good so far, but if we call str(a), let's see what happens:

    str(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 4: ordinal not in range(128)

Oh dip, that's not gonna do anyone any good! To fix the error, encode the bytes explicitly with .encode and tell python what codec to use:

    a.encode('utf-8')
    => 'bats\xc3\xa0'
    print a.encode('utf-8')
    => batsà
python  unicode  stringconcatenation  encoding  utf8  solution  reference 
29 days ago by kme
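The Stack Overflow excerpt above is Python 2. For comparison, here is a minimal Python 3 sketch of the same idea, using only the standard library, with variable names chosen for this note. In Python 3, `str` is already Unicode text, so the implicit ASCII conversion behind `str(a)` no longer happens, but asking for ASCII explicitly still raises the same error.

```python
a = "bats\u00e0"  # same text as in the excerpt

# Explicitly encoding to ASCII fails, just as the implicit
# conversion did in Python 2.
try:
    a.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Naming a codec that can represent the character yields bytes.
utf8_bytes = a.encode("utf-8")

# Decoding with the same codec recovers the original text.
round_trip = utf8_bytes.decode("utf-8")

print(repr(utf8_bytes))  # b'bats\xc3\xa0'
```

The usual advice is to keep text as `str` inside the program and encode or decode only at the boundaries (files, sockets, terminals), always naming the codec.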
Truths programmers should know about case
"But I hinted, briefly, at the deeper complexity of case in Unicode, and I want to take some time to talk about that in more detail, because it’s interesting and because understanding it can help you make better choices when designing and writing code that processes text. So here, in opposition to “falsehoods programmers believe”, is my inaugural “truths programmers should know”, on the topic of case."
programming  unicode  utf8  i18n  l10n  language 
8 weeks ago by mechazoidal
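One small example of the case complexity the quote above alludes to, sketched with Python's built-in string methods. The sample word and the helper `caseless_equal` are illustrative choices for this note, not taken from the linked article, and the helper ignores Unicode normalization, which a fuller comparison would likely need.

```python
word = "Stra\u00dfe"  # German "Strasse" spelled with U+00DF (sharp s)

# Uppercasing is not one-to-one: the sharp s becomes two letters.
upper = word.upper()

# Lowercasing the result does not bring the original spelling back.
lower_again = upper.lower()

# casefold() is intended for caseless comparison rather than display.
folded = word.casefold()


def caseless_equal(x: str, y: str) -> bool:
    """Hypothetical helper: compare two strings via case folding only."""
    return x.casefold() == y.casefold()


print(upper, lower_again, folded)
print(caseless_equal(word, upper))         # True
print(word.lower() == upper.lower())       # False
```

Comparing with `lower()` on both sides misses this pair, which is one reason case folding exists as a separate operation.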


