jm + string-matching   4

cloudflare/lua-aho-corasick
A nice Lua/C++ implementation of Aho-Corasick for fast string matching against multiple patterns (via JGC). This uses an interesting technique to get better performance by compacting the data structure into a single buffer, to avoid following pointers all over RAM and busting the cache.
optimization  speed  performance  aho-corasick  tries  string-matching  strings  algorithms  lua  c++  via:jgc 
4 weeks ago by jm
Implementing strcmp, strlen, and strstr using SSE 4.2 instructions - strchr.com
Using new Intel Core i7 instructions to speed up string manipulation. Fascinating stuff. SSE ftw
sse  optimization  simd  assembly  intel  i7  intel-core  strstr  strings  string-matching  strchr  strlen  coding 
january 2013 by jm
Fast Packed String Matching for Short Patterns [paper, PDF]
'Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other
fields, like NLP, information retrieval and computational biology. In the last two decades a general trend has appeared
trying to exploit the power of the word RAM model to speed-up the
performances of classical string matching algorithms. [...]
In this paper we use specialized word-size packed string matching instructions, based on the Intel streaming SIMD extensions (SSE) technology, to design very fast string matching algorithms in the case of short patterns.' Reminds me of http://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm , but taking advantage of SIMD extensions, which should make things nice and speedy, at the cost of tying it to specific hardware platforms. (via Tony Finch)
rabin-karp  algorithms  strings  string-matching  papers  via:fanf 
january 2013 by jm

Copy this bookmark:



description:


tags: