* Need to go through everything and square it with RightToLeft matching.
  The support for this was built into an early version, and lots of things
  built afterwards are not savvy about bi-directional matching. Things that
  spring to mind: Regex match methods should start at 0 or text.Length
  depending on direction. Do split and replace need changes? Match should be
  aware of its direction (already applied some of this to NextMatch logic).
  The interpreter needs to check left and right bounds. Anchoring and
  substring discovery need to be reworked. RTL matches are going to have
  anchors on the right, i.e. $, \Z and \z. This should be added to the anchor
  logic. QuickSearch needs to work in reverse. There may be other things;
  work through the code.
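A minimal sketch of the direction-aware start position and bounds checks
described in the RightToLeft item, written in Python for brevity. The names
default_start, next_char and match_literal are hypothetical and do not
correspond to anything in the existing code; this only illustrates the idea.

```python
def default_start(text, right_to_left):
    """Scan start: 0 for left-to-right, len(text) for right-to-left."""
    return len(text) if right_to_left else 0


def next_char(text, pos, right_to_left):
    """Return (char, new_pos) in the scan direction, or None at a bound."""
    if right_to_left:
        if pos <= 0:              # left bound reached
            return None
        return text[pos - 1], pos - 1
    if pos >= len(text):          # right bound reached
        return None
    return text[pos], pos + 1


def match_literal(text, literal, pos, right_to_left):
    """Match a literal at pos in the given direction.

    Returns the position after the match (in scan direction) or None.
    For right-to-left scans the literal is compared from its last character.
    """
    chars = reversed(literal) if right_to_left else literal
    for expected in chars:
        step = next_char(text, pos, right_to_left)
        if step is None or step[0] != expected:
            return None
        pos = step[1]
    return pos
```

For example, match_literal("abc", "bc", default_start("abc", True), True)
consumes "c" then "b" and stops at index 1, where the match begins.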
* Add ECMAScript support to the parser. For example, [.\w\s\d] map to ECMA
  categories instead of canonical ones. Backreference/octal disambiguation
  also behaves differently. Find out what the runtime behavioural difference
  is for cyclic backreferences, e.g. (?(1)abc\1); this is only briefly
  mentioned in the spec. I couldn't find much on this in the ECMAScript
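To illustrate what mapping \w, \s and \d to narrower categories means in
practice, here is a small Python sketch. It uses Python's re.ASCII flag purely
as a stand-in: with that flag \d behaves like [0-9] and \w like [a-zA-Z0-9_],
which is the ASCII-only style of class the ECMAScript item refers to. It is
an analogy, not the parser change itself.

```python
import re

ARABIC_INDIC_THREE = "\u0663"   # a Unicode decimal digit outside ASCII
E_ACUTE = "\u00e9"              # a Unicode letter outside ASCII


def matches(pattern, text, ascii_only):
    """True if the whole text matches, optionally with ASCII-only classes."""
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(pattern, text, flags) is not None


# Unicode-aware classes accept the non-ASCII digit and letter ...
unicode_digit = matches(r"\d", ARABIC_INDIC_THREE, ascii_only=False)
unicode_word = matches(r"\w", E_ACUTE, ascii_only=False)

# ... while the ASCII-only classes reject them.
ascii_digit = matches(r"\d", ARABIC_INDIC_THREE, ascii_only=True)
ascii_word = matches(r"\w", E_ACUTE, ascii_only=True)
```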
* Check that the octal disambiguation for canonical syntax works as specified.
* Add a check in QuickSearch for single character substrings. This is likely
  to be a common case, and there's no need to go through a shift table. Also,
  have a look at computing only the relevant subset of the shift table and
  using an (offset, size) pair to help test inclusion. Characters not in the
  table get the default len + 1 shift.
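The two QuickSearch ideas above can be sketched as follows, in Python for
brevity. This is an illustration under assumptions, not the existing
QuickSearch code: build_shift_table and quick_search are hypothetical names,
the pattern is assumed non-empty, and the compact table simply spans the
lowest to highest character code in the pattern.

```python
def build_shift_table(pattern):
    """Shift table covering only the code range used by the pattern.

    Returns (offset, size, table). Index a character as ord(c) - offset;
    anything outside [0, size) takes the default len(pattern) + 1 shift.
    """
    codes = [ord(c) for c in pattern]
    offset = min(codes)
    size = max(codes) - offset + 1
    table = [len(pattern) + 1] * size
    for i, code in enumerate(codes):
        # Later (rightmost) occurrences overwrite earlier ones.
        table[code - offset] = len(pattern) - i
    return offset, size, table


def quick_search(text, pattern, start=0):
    """Return the index of the first occurrence at or after start, or -1."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    if m == 1:
        # Single character: a plain scan, no shift table needed.
        return text.find(pattern, start)
    offset, size, table = build_shift_table(pattern)
    n = len(text)
    pos = start
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        if pos + m >= n:
            break
        # Shift on the character just past the current window.
        idx = ord(text[pos + m]) - offset
        pos += table[idx] if 0 <= idx < size else m + 1
    return -1
```

The inclusion test is a single range check on idx, so the table can stay
small even when the text contains characters far outside the pattern's range.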
* Improve the Perl test suite. Run under the MS runtime to generate checksums
  for each trial. Checksums should incorporate: all captures (index, length)
  for all groups; names of explicit capturing groups, and the numbers they
  map to. Any other state? RegexTrial.Execute() will then compare result and
  checksum.
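A rough sketch of what such a checksum could incorporate, using Python's re
module as a stand-in for the reference runtime. Everything here is an
assumption for illustration: describe_trial and trial_checksum are
hypothetical names, CRC32 is an arbitrary choice of checksum, and Python
records only the last capture per group, so a real implementation would need
to walk every capture of every group.

```python
import re
import zlib


def describe_trial(pattern, text):
    """Canonical string of the match state that the checksum covers."""
    regex = re.compile(pattern)
    match = regex.search(text)
    parts = []
    if match is None:
        parts.append("nomatch")
    else:
        for group in range(regex.groups + 1):
            start = match.start(group)
            # Groups that did not participate are recorded as -1,-1.
            length = match.end(group) - start if start >= 0 else -1
            parts.append(f"{group}:{start},{length}")
    # Names of explicit capturing groups and the numbers they map to.
    for name, number in sorted(regex.groupindex.items()):
        parts.append(f"{name}={number}")
    return "|".join(parts)


def trial_checksum(pattern, text):
    """CRC32 over the canonical description of the trial result."""
    return zlib.crc32(describe_trial(pattern, text).encode("utf-8"))
```

A trial would then store trial_checksum(...) from the reference run and
compare it with the value computed from the engine under test.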