strings   1249

utf8everywhere
Purpose of this document: to promote usage and support of the UTF-8 encoding, and to make the case that it should be the default choice of encoding for storing text strings in memory or on disk, for communication, and for all other uses. We believe that all other encodings of Unicode (or of text in general) belong to rare edge cases of optimization and should be avoided by mainstream users.
utf  utf-8  Unicode  utf-16  Utf8  char  charset  character  codepage  ASCII  text  string  strings  programming  windows  encoding 
24 days ago by rafaeldff
[1204.3293] Efficiently decoding strings from their shingles
"Determining whether an unordered collection of overlapping substrings (called shingles) can be uniquely decoded into a consistent string is a problem that lies within the foundation of a broad assortment of disciplines ranging from networking and information theory through cryptography and even genetic engineering and linguistics. We present three perspectives on this problem: a graph theoretic framework due to Pevzner, an automata theoretic approach from our previous work, and a new insight that yields a time-optimal streaming algorithm for determining whether a string of $n$ characters over the alphabet $\Sigma$ can be uniquely decoded from its two-character shingles. Our algorithm achieves an overall time complexity $\Theta(n)$ and space complexity $O(|\Sigma|)$. As an application, we demonstrate how this algorithm can be extended to larger shingles for efficient string reconciliation."
strings  algorithms  computational-complexity  nudge-targets 
4 weeks ago by Vaguery
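
To make the shingle problem in the abstract above concrete, here is a minimal Python sketch. It is not the paper's linear-time streaming algorithm: it is an exponential brute-force check that follows the graph view, treating each two-character shingle as an edge and each candidate string as a trail that uses every edge exactly once. The function names (`shingles`, `decodings`, `is_uniquely_decodable`) are hypothetical, and the sketch assumes that "uniquely decoded" means no other string has the same multiset of two-character shingles, and that the input has at least two characters.

```python
from collections import Counter


def shingles(s, k=2):
    """Return the multiset of overlapping length-k substrings of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def decodings(bag, limit=2):
    """Find strings whose two-character shingle multiset equals `bag`.

    Brute-force backtracking (exponential in the worst case); the search
    stops once `limit` distinct strings have been found.
    """
    total = sum(bag.values())
    remaining = Counter(bag)
    found = set()

    def extend(prefix):
        if len(found) >= limit:
            return
        if len(prefix) == total + 1:  # every shingle has been used once
            found.add(prefix)
            return
        for pair in list(remaining):
            # A shingle can follow only if it starts with the last character.
            if remaining[pair] > 0 and pair[0] == prefix[-1]:
                remaining[pair] -= 1
                extend(prefix + pair[1])
                remaining[pair] += 1

    for start in {pair[0] for pair in bag}:
        extend(start)
    return found


def is_uniquely_decodable(s):
    """True if s is the only string with its two-character shingle multiset.

    Assumes len(s) >= 2 (an assumption of this sketch, not of the paper).
    """
    return decodings(shingles(s)) == {s}
```

For example, "abcbdb" and "abdbcb" share the shingles ab, bc, cb, bd, db, so neither can be recovered unambiguously from them, whereas "abc" can.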
