jm + regex   8

Regexp Disaster
Course notes from Gerald Jay Sussman's "Adventures in Advanced Symbolic Programming" class at MIT. Hard to argue with this:
The syntax of the regular-expression language is awful. There are various incompatable forms of the language and the quotation conventions are baroquen [sic]. Nevertheless, there is a great deal of useful software, for example grep, that uses regular expressions to specify the desired behavior.

Although regular-expression systems are derived from a perfectly good mathematical formalism, the particular choices made by implementers to expand the formalism into useful software systems are often
disastrous: the quotation conventions adopted are highly irregular; the egregious misuse of parentheses, both for grouping and for backward reference, is a miracle to behold. In addition, attempts to
increase the expressive power and address shortcomings of earlier designs have led to a proliferation of incompatible derivative languages.


(via Rob Pike's twitter: https://twitter.com/rob_pike/status/755856685923639296)
regex  regexps  regular-expressions  functional  combinators  gjs  rob-pike  coding  languages 
july 2016 by jm
Rubular
'a Ruby regular expression editor and tester'. Great for prototyping regexps with a little set of test data, providing a neat permalink for the results
regex  regexp  ruby  tools  coding  web  editors  testing 
july 2016 by jm
Hyperscan
a high-performance multiple regex matching library. Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions and for the matching of regular expressions across streams of data.


Via Tony Finch
via:fanf  regexps  regex  dpi  hyperscan  dfa  nfa  hybrid-automata  text-matching  matching  text  strings  streams 
october 2015 by jm
Peter Norvig writes a program to play regex golf with arbitrary lists
In response to XKCD 1313. This is excellent. It's reminiscent of my SpamAssassin SOUGHT-ruleset regexp-discovery algorithm, described in http://taint.org/2007/03/05/134447a.html , albeit without the BLAST step intended to maximise pattern length and minimise false positives
python  regex  xkcd  blast  rule-discovery  spamassassin  rules  regexps  regular-expressions  algorithms  peter-norvig 
january 2014 by jm
PCRE Performance Project
Excellent stuff. Using "sljit", a stackless platform-independent JIT compiler, this compiles Perl-compatible regular expressions to machine code on ARM, x86, MIPS and PowerPC platforms, resulting in 'similar matching speed to DFA based engines (like re2) on common patterns' with Perl compatibility. 'This work has been released as part of PCRE 8.20 and above. Now (PCRE 8.31), nearly all PCRE features are supported including UTF-8/16 and partial matching.'
pcre  regexps  regex  performance  optimization  jit  compilation  dfa  re2  via:akohli 
september 2012 by jm
RegExr: Online Regular Expression Testing Tool
a very nice interactive editor in Flash, supporting lots of the usual perlish stuff. via Joe
via:jdrumgoole  regexps  regular-expressions  spamassassin  rule-dev  flash  regex  flex  utilities  from delicious
december 2009 by jm
Structural Regular Expressions
'The current UNIX text processing tools are weakened by the built-in concept of a line. There is a simple notation that can describe the `shape' of files when the typical array-of-lines picture is inadequate. That notation is regular expressions. Using regular expressions to describe the structure in addition to the contents of files has interesting applications, and yields elegant methods for dealing with some problems the current tools handle clumsily. When operations using these expressions are composed, the result is reminiscent of shell pipelines.' Paper by Rob Pike, via adulau. intriguing
sregex  via:adulau  regexp  rob-pike  regex  library  text  structural  parsing  from delicious
november 2009 by jm
sregex - Structural Regular Expressions
'The sregex module implements Structural Regular Expressions.' Python, Apache-licensed
sregex  python  via:adulau  regexp  robpike  regex  library  text  structural  parsing  from delicious
november 2009 by jm

Copy this bookmark:



description:


tags: