Monday, May 03, 2010

I found another strange case in the Perl regex engine. Consider the following match:

"aab" =~ /(.)(?!(\1)(\1))/

This match succeeds. Perl allows you to pull groups out from inside lookarounds, so $2 is "a". If you think about it, that's a little odd, since group 2 matches while inside an environment where booleans are inverted. (?!(\1)(\1)) is prima facie homomorphic with "not (A and B)" or, rewritten with De Morgan's law, "(not A) or (not B)". More explicitly, "(do not match group 1) or (do not match group 1)." The first assertion, "do not match group 1", fails because the second "a" does match \1, but $2 is given the value "a" anyway. After that, the second assertion, "do not match group 1", succeeds because "b" does not match \1, and yet that success is not bound to any characters: $3 is left undefined.
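Here is a small script for checking this on your own perl. It is only a sketch: the helper name lookahead_captures is made up for illustration, and what it prints for $2 may depend on your perl version, so no expected output is listed.

```perl
use strict;
use warnings;

# Hypothetical helper: run the match from this post and return the three
# capture groups. A group that did not participate comes back as undef.
sub lookahead_captures {
    my ($text) = @_;
    if ($text =~ /(.)(?!(\1)(\1))/) {
        return ($1, $2, $3);
    }
    return;    # empty list: the overall match failed
}

my @groups = lookahead_captures("aab");
for my $i (0 .. $#groups) {
    # Print each group, showing "undef" instead of triggering a warning.
    printf "\$%d = %s\n", $i + 1,
        defined $groups[$i] ? "'$groups[$i]'" : "undef";
}
```

Returning the groups from a sub copies them right away, so a later regex cannot clobber them before you look at them.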

I do not think this implementation of negative lookahead is as useful as it could be. Group 3 is where the assertion succeeds, so it wouldn't be terribly difficult to bind the text that group 3 was being matched against to $3.
