jm + string-matching   4

A nice Lua/C++ implementation of Aho-Corasick for fast string matching against multiple patterns (via JGC). This uses an interesting technique to get better performance by compacting the data structure into a single buffer, to avoid following pointers all over RAM and busting the cache.
optimization  speed  performance  aho-corasick  tries  string-matching  strings  algorithms  lua  c++  via:jgc 
august 2014 by jm
Implementing strcmp, strlen, and strstr using SSE 4.2 instructions -
Using new Intel Core i7 instructions to speed up string manipulation. Fascinating stuff. SSE ftw
sse  optimization  simd  assembly  intel  i7  intel-core  strstr  strings  string-matching  strchr  strlen  coding 
january 2013 by jm
Fast Packed String Matching for Short Patterns [paper, PDF]
'Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other
fields, like NLP, information retrieval and computational biology. In the last two decades a general trend has appeared
trying to exploit the power of the word RAM model to speed-up the
performances of classical string matching algorithms. [...]
In this paper we use specialized word-size packed string matching instructions, based on the Intel streaming SIMD extensions (SSE) technology, to design very fast string matching algorithms in the case of short patterns.' Reminds me of , but taking advantage of SIMD extensions, which should make things nice and speedy, at the cost of tying it to specific hardware platforms. (via Tony Finch)
rabin-karp  algorithms  strings  string-matching  papers  via:fanf 
january 2013 by jm

Copy this bookmark: