1 2008-02-10 Zoltan Varga <vargaz@gmail.com>
3 * BaseMachine.cs Regex.cs: Make LTRReplace and RTLReplace instance methods to
4 avoid creating two machines for each Regex.Replace () call.
6 * interpreter.cs (Eval): Remove a needless string allocation.
8 2007-12-04 Arina Itkes <arinai@mainsoft.com>
10 * parser.cs: Max value of m for a construct {n,m} is 2147483647.
12 2007-11-15 Miguel de Icaza <miguel@novell.com>
14 * Revert the patch from Juraj Skripsky as it made the class
15 non-thread safe (see #341986).
17 2007-11-08 Raja R Harinath <harinath@gmail.com>
20 * BaseMachine.cs (LTRReplace): Don't use non-advancement of 'ptr'
21 to deduce absence of matches -- a match can have length 0.
22 (RTLReplace): Likewise.
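
The zero-length-match point above can be illustrated with a minimal sketch. It is written in Python rather than C#, and the name ltr_replace and its signature are hypothetical, not the BaseMachine API. The loop stops only when the search reports no match, and it steps past an empty match explicitly instead of treating an unmoved position as "no match".

```python
import re

def ltr_replace(pattern, text, evaluator):
    """Left-to-right replace that tolerates zero-length matches (illustrative only)."""
    out = []
    pos = 0      # where the next search starts
    copied = 0   # end of the input already copied to the output
    while pos <= len(text):
        m = pattern.search(text, pos)
        if m is None:
            # Absence of a match is the only stop signal.
            break
        out.append(text[copied:m.start()])
        out.append(evaluator(m))
        copied = m.end()
        # An empty match does not advance the position by itself,
        # so step one character forward to avoid looping forever.
        pos = m.end() + 1 if m.end() == m.start() else m.end()
    out.append(text[copied:])
    return "".join(out)
```

For example, replacing every match of `x*` in "abc" with "-" inserts a dash at each of the four empty match positions.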
24 2007-11-07 Raja R Harinath <harinath@gmail.com>
26 Support RegexOptions.RightToLeft in Replace().
27 * BaseMachine.cs (Replace): Use either LTRReplace or RTLReplace
29 (LTRReplace): Make internal and rename the MatchAppendEvaluator
30 version of Replace to this.
32 * Regex.cs (Replace): Use LTRReplace and RTLReplace from BaseMachine.
33 * replace.cs (ReplacementEvaluator.Evaluate): Optimize simple case.
34 Based on patch by Stephane Delcroix.
36 * replace.cs (Compile): Don't unescape string.
38 2007-11-01 Gert Driesen <drieseng@users.sourceforge.net>
40 * Match.cs: Do not throw NotSupportedException on zero-length
43 2007-10-29 Arina Itkes <arinai@mainsoft.com>
45 * Regex.cs: Moving creation of Regex machine to ctor.
46 This increases the initialization time of Regex but reduces
47 processing time during API calls. It also fixes the
48 missing multi-thread synchronization.
50 2007-10-29 Arina Itkes <arinai@mainsoft.com>
52 * Match.cs: Fix the Result method of Match: throw an exception
53 if Result is called on a failed Match.
55 2007-10-24 Juraj Skripsky <js@hotfeet.ch>
57 * Regex.cs: Store and re-use IMachine, no need to re-instantiate
58 it every time we're matching.
60 2007-10-24 Arina Itkes <arinai@mainsoft.com>
62 * Regex.cs Match.cs arch.cs compiler.cs interpreter.cs
63 Refactor Interpreter by extracting an abstract base class
64 that hosts some methods moved from the Regex and Match classes.
65 Add a field to Regex that maps group numbers to group names to
66 improve the performance of the GroupNameFromNumber method.
68 2007-10-21 Gert Driesen <drieseng@users.sourceforge.net>
70 * RegexTest.cs: Removed. Test was already moved to the appropriate
73 2007-06-21 Juraj Skripsky <js@hotfeet.ch>
75 * quicksearch.cs: Optimization. Add byte array as skip table for
76 chars <= 255, falling back to the hashtable for chars > 255 and
79 2007-04-18 Raja R Harinath <rharinath@novell.com>
82 * parser.cs (ResolveReferences): Don't throw an exception if a
83 capture assertion reference cannot be resolved.
84 (ParseGroupingConstruct): Provide fallback expression to a capture
86 * syntax.cs (CaptureAssertion): If the bareword doesn't refer to
87 the name of a capture group, fall back to treating it as a literal
90 2007-04-04 Raja R Harinath <rharinath@novell.com>
92 * interpreter.cs (Eval) <OpCode.Reference>: Distribute for loop
94 for () if (a) s1; else s2; => if (a) for () s1; else for () s2;
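
The transformation named above (hoisting a loop-invariant test out of the loop) can be sketched as follows. This is Python, not the interpreter's C#, and the ignore_case flag is an assumed stand-in for the condition `a`; the entry does not say which condition was hoisted.

```python
def copy_or_fold(chars, ignore_case):
    """Return the same result computed before and after distributing the loop."""
    # Before: the flag is tested on every iteration.
    before = []
    for c in chars:
        if ignore_case:
            before.append(c.lower())
        else:
            before.append(c)

    # After: the loop-invariant test is hoisted, giving one loop per branch.
    after = []
    if ignore_case:
        for c in chars:
            after.append(c.lower())
    else:
        for c in chars:
            after.append(c)
    return before, after
```

Both forms produce identical output; the second simply evaluates the condition once instead of once per character.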
96 2007-04-03 Raja R Harinath <rharinath@novell.com>
98 * Regex.cs (~Regex): Don't define in NET_2_0 profile.
100 2007-01-02 Raja R Harinath <rharinath@novell.com>
103 * parser.cs (Parser.GetMapping): Use the actual group numbers to
106 2006-09-28 Andrew Skiba <andrews@mainsoft.com>
108 * Regex.cs: TARGET_JVM
110 2006-05-30 Gert Driesen <drieseng@users.sourceforge.net>
112 * CaptureCollection.cs: Removed virtual keyword to fix API mismatches.
113 * MatchCollection.cs: Removed virtual keyword to fix API mismatches.
114 * GroupCollection.cs: Removed virtual keyword to fix API mismatches.
116 2006-05-08 Raja R Harinath <rharinath@novell.com>
119 Remove 65535-limit on number of repetitions matched by a pattern.
120 We still have a 65535 limit on the length of a pattern and the
121 number of groups in a pattern.
122 * compiler.cs (PatternCompiler.EmitCount): New. Emits an int as
123 two ushorts into the program stream.
124 (EmitInfo, EmitRepeat, EmitFastRepeat): Use it to emit integers
126 * interpreter.cs (Interpreter.ReadProgramCount): Read an int
127 emitted into the program stream.
128 (Interpreter): Use it. Update counts.
129 (Interpreter.Eval) [OpCode.Repeat, OpCode.FastRepeat]: Likewise.
130 * parser.cs (ParseGroup): Pass 0x7ffffff as the max value for '*'
131 and '+' repetition patterns.
132 * arch.cs (Info, Repeat, FastRepeat): Update description.
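
A rough sketch of the encoding described above, in Python rather than C#. The function names mirror EmitCount and ReadProgramCount, but the word order (low word first) and the list-based program stream are assumptions, not taken from the Mono source.

```python
def emit_count(program, count):
    """Append a non-negative 32-bit count as two 16-bit words (low word first, assumed)."""
    program.append(count & 0xFFFF)
    program.append((count >> 16) & 0xFFFF)

def read_program_count(program, ptr):
    """Read back a count written by emit_count, starting at index ptr."""
    return program[ptr] | (program[ptr + 1] << 16)
```

Splitting the value this way lets a stream of 16-bit cells carry repetition counts above 65535 without widening every cell.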
134 2006-04-18 Raja R Harinath <rharinath@novell.com>
136 Treat fixed repetitions of simple regexes as simple too.
137 * syntax.cs (Expression.IsComplex): Make abstract.
138 (Group.IsComplex, Alternation.IsComplex): Move ...
139 (CompositeExpression.IsComplex): ... here.
140 (Group.GetAnchorInfo): Reduce allocations. Avoid creating another
141 ArrayList, and use a StringBuilder to build up the string.
142 (Repetition.GetAnchorInfo): Use a StringBuilder to build up the string.
143 (ExpressionAssertion.IsComplex): Override.
145 2006-04-17 Florian Gross <flgr@ccan.de>
146 Raja R Harinath <rharinath@novell.com>
148 * syntax.cs (CharacterClass.Compile): Emit categories after the
149 character intervals so that the evaluator can pick up the
152 2006-04-07 Raja R Harinath <rharinath@novell.com>
155 * interpreter.cs (Interpreter.Eval) [Anchor, Position.StartOfString]:
156 Don't reset 'ptr' to 0 during forward scan.
159 * interpreter.cs (Interpreter.FastEval) [FastRepeat]: If the first
160 tail operation has a 'negate' flag, avoid the "match next char"
164 * arch.cs (OpCode.NotCategory): New. Stands for matching a
165 character _not_ from the given category.
166 * debug.cs (DisassembleBlock): Handle it.
167 * compiler.cs (ICompiler.EmitNotCategory): New.
168 (Compiler.EmitNotCategory): New. Emit OpCode.NotCategory.
169 * syntax.cs (CharacterClass.Compile): Don't conflate negation of
170 the character class and negation of the category. Use
172 * interpreter.cs (Interpreter.Eval): Pass OpCode.NotCategory to
174 (Interpreter.EvalChar): Handle it.
176 2006-04-06 Raja R Harinath <rharinath@novell.com>
179 * interpreter.cs (Eval) [Until, FastUntil]: Set 'deep' to null
180 when evaluating the tail. Ensure that backtracks don't confuse
181 the recursion vs. iteration detector.
183 2006-04-03 Raja R Harinath <rharinath@novell.com>
185 * interpreter.cs (Eval) [Until, lazy]: Avoid extra evaluation on a
188 2006-03-30 Raja R Harinath <harinath@gmail.com>
191 * parser.cs (Parser.ParseCharacterClass): Don't automatically
192 assume there's a range when we see '-'. Ensure that we have seen
193 at least one other character, and that we aren't already parsing a
194 range. Handle some more errors.
196 2005-12-19 Kornél Pál <kornelpal@hotmail.com>
198 * Regex.cs: Added support for regular expressions compiled to
199 assemblies by compiling the pattern. This solution ignores existing
200 CIL code but provides full support for regular expression classes
203 2005-11-21 Sebastien Pouliot <sebastien@ximian.com>
205 * CaptureCollection.cs: Fixed length check.
206 * Group.cs: Added missing validation for Synchronized method.
207 * Match.cs: Added missing validation for Synchronized and Result
209 * MatchEvaluator.cs: Added [Serializable] for 2.0 profile.
210 * RegexCompilationInfo.cs: Added missing property validation.
211 * Regex.cs: Implemented UseOptionC and UseOptionR protected methods
212 (now documented). Fixed API for 2.0 profile.
213 * RegexRunner.cs: Stubbed CharInClass for 2.0 profile.
215 2005-11-17 Sebastien Pouliot <sebastien@ximian.com>
217 * Match.cs: Removed the ": base ()" on the private ctor as it is
218 not required and causes an extra public ctor to be added (bug #76736).
219 * MatchCollection.cs: Add missing virtual to indexer property.
221 2005-09-23 Raja R Harinath <rharinath@novell.com>
223 * interpreter.cs (Interpreter.Eval) [OpCode.Until]: Invert the
224 sense of a test to reflect the code re-organization.
226 2005-09-22 Raja R Harinath <rharinath@novell.com>
229 * interpreter.cs (Interpreter.Eval) [OpCode.Until]: Avoid some
230 cases of recursion when dealing with eager quantifiers too. We
231 now avoid recursion when handling the innermost quantifier.
232 (Interpreter.IntStack, Interpreter.stack): New. Stack to help
233 implement backtracking in eager quantifiers.
235 2005-09-21 Raja R Harinath <rharinath@novell.com>
237 * interpreter.cs (Interpreter.Eval) [OpCode.Until]: Avoid some
238 cases of recursion when dealing with the minimum count and lazy
241 2005-08-23 Raja R Harinath <rharinath@novell.com>
243 * regex.cs: Remove. Split into ...
244 * MatchEvaluator.cs, Regex.cs, RegexCompilationInfo.cs,
245 RegexOptions.cs: ... these. Now every publicly exposed type in
246 this namespace has its own file.
248 2005-07-21 Florian Gross <flgr@ccan.de>
250 * Fixed a bug in category.cs that caused ECMAScript \d to fail.
252 2005-07-13 Raja R Harinath <rharinath@novell.com>
254 Make even lazier.
255 * MatchCollection.cs (TryToGet): Don't generate match i+1 when
256 we're looking for match i. Change post-conditions.
257 (FullList): New helper property. Ensures the list is fully populated.
258 (Count, CopyTo): Use it.
259 (Enumerator.Current): Update to new post-conditions of TryToGet.
260 (Enumerator.MoveNext): Likewise. Don't modify index if we're
263 2005-07-08 Raja R Harinath <rharinath@novell.com>
265 * MatchCollection.cs: Convert to incremental mode.
266 * regex.cs (Regex.Matches): Update. Pass responsibility of
267 generating all matches to MatchCollection.
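
The incremental mode described above can be sketched in Python. The class LazyMatches and its helper names are hypothetical and only loosely echo TryToGet and FullList; this is not the MatchCollection implementation. Matches are produced on demand, and only operations that need the whole list (such as the count) force full evaluation.

```python
import re

class LazyMatches:
    """Collection that runs the regex only as far as the caller asks (illustrative)."""

    def __init__(self, pattern, text):
        self._iter = re.finditer(pattern, text)
        self._found = []

    def _try_to_get(self, i):
        # Generate matches up to index i, but no further.
        while len(self._found) <= i:
            try:
                self._found.append(next(self._iter))
            except StopIteration:
                return False
        return True

    def _full_list(self):
        # Drain the remaining matches; a no-op once the iterator is exhausted.
        self._found.extend(self._iter)
        return self._found

    def __getitem__(self, i):
        if i < 0 or not self._try_to_get(i):
            raise IndexError(i)
        return self._found[i]

    def __len__(self):
        return len(self._full_list())
```

Asking for element 0 computes a single match; asking for the length computes all of them.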
269 2005-06-14 Raja R Harinath <harinath@gmail.com>
271 * parser.cs (Parser.ConsumeWhitespace): Add bounds check.
274 * Match.cs (Match) [zero-argument variant]: Make private.
275 * GroupCollection (Item) [string variant]: Don't look for the
276 group number in an empty match.
278 2005-06-10 Raja R Harinath <rharinath@novell.com>
280 * interpreter.cs (Interpreter.GenerateMatch): Avoid allocating two
281 intermediate arrays to build the final result.
282 (Interpreter.GetGroupInfo, Interpreter.PopulateGroup): New helper
284 * CaptureCollection.cs (list): Change from ArrayList to list.
285 (SetValue): New internal helper, used by Interpreter.PopulateGroup.
286 (Enumerator): Remove helper class.
287 (IEnumerator.GetEnumerator): Just use list.GetEnumerator.
288 * GroupCollection.cs: Likewise.
289 * Group.cs (Group): Move responsibility of populating 'Captures'
290 to Interpreter.PopulateGroup.
291 * Match.cs (Match): Move responsibility of populating 'Groups' to
292 Interpreter.GenerateMatch.
294 2005-05-25 Raja R Harinath <rharinath@novell.com>
296 * replace.cs (ReplacementEvaluator.Compile): Rewrite to avoid
297 creating several intermediate strings. Simplify internal
298 intermediate representation.
299 (ReplacementEvaluator.EvaluateAppend): New. Version of Evaluate
300 that builds the result directly on a passed-in StringBuilder.
301 (ReplacementEvaluator.Evaluate): Just a wrapper around
303 * regex.cs (MatchAppendEvaluator): New internal delegate.
304 (Regex.Replace): Use MatchAppendEvaluator.
305 (Regex.Adapter): New class used to adapt a MatchEvaluator to a
306 MatchAppendEvaluator.
308 2005-05-24 Raja R Harinath <rharinath@novell.com>
310 * replace.cs (ReplacementEvaluator.CompileTerm): Fix group
313 2005-05-20 Ben Maurer <bmaurer@ximian.com>
315 * regex.cs: Some memory allocation optimizations.
317 2005-05-20 Raja R Harinath <rharinath@novell.com>
320 * replace.cs (ReplacementEvaluator.Compile): Allow CompileTerm to
321 fail and yet have advanced the pointer. Append the scanned-over
322 portion to the "literal" being built.
323 (ReplacementEvaluator.CompileTerm): Don't throw any exceptions.
324 If a term cannot be recognized, just return null.
326 * compiler.cs (InterpreterFactory.GroupCount): Fix. The 0'th
327 index corresponds to Opcode.Info.
329 * parser.cs (Parser.Unescape): If the string doesn't contain any
330 '\' character, don't allocate a new string.
332 * replace.cs (ReplacementEvaluator.Term.AppendResult): Rename
333 from GetResult. Append to a passed-in StringBuilder rather than
335 (ReplacementEvaluator.Evaluate): Update.
337 * Capture.cs, Group.cs, Match.cs: New files split out of ...
338 * match.cs: ... this. Remove.
340 2005-02-27 Gonzalo Paniagua Javier <gonzalo@ximian.com>
342 * parser.cs: stuff inside {} might not be a quantifier. Fixes
345 2005-01-10 Gonzalo Paniagua Javier <gonzalo@ximian.com>
347 * quicksearch.cs: handle IgnoreCase when getting the shift distance.
348 Fixes bug #69065. Patch by mei@work.email.ne.jp.
350 2005-01-08 Miguel de Icaza <miguel@ximian.com>
352 * syntax.cs: Applied patch from mei@work.email.ne.jp to fix bug
355 * parser.cs: Turns out that \digit sequences are octal sequences
356 (no leading zero is needed); And the three octal digit rule
357 applies to the leading zero as well.
359 This fixes the Unescape method.
361 2004-11-29 Gonzalo Paniagua Javier <gonzalo@ximian.com>
363 * regex.cs: use NextMatch to move on to the next match. Fixes bug
366 2004-11-09 Atsushi Enomoto <atsushi@ximian.com>
370 2004-11-08 Ben Maurer <bmaurer@ximian.com>
372 * replace.cs, parser.cs: Use stringbuilder for allocation sanity.
374 2004-10-21 Joerg Rosenkranz <joergr@voelcker.com>
376 * regex.cs: Fixed a bug introduced with the last patch which
377 prevented any replacements when a positive count is given.
378 This also happens in all overloads without count parameter.
380 2004-10-18 Gonzalo Paniagua Javier <gonzalo@ximian.com>
382 * regex.cs: in Replace, when count is negative, replacement continues
383 to the end of the string.
385 Fixes bug #68398. Patch by Jon Larimer.
387 2004-06-10 Gert Driesen <drieseng@users.sourceforge.net>
389 * RegexRunner.cs: fixed case mismatch of methods
391 2004-06-10 Gert Driesen <drieseng@users.sourceforge.net>
393 * RegexRunner.cs: marked TODO, added missing protected internal
394 fields, throw NotImplementedException in all methods
396 2004-06-10 Gert Driesen <drieseng@users.sourceforge.net>
398 * RegexRunnerFactory.cs: removed comment, no longer throw exception
400 * regex.cs: fixed public API signature by renaming protected
401 internal fields and adding destructor, added MonoTODO attribute to
402 fields and method that are not yet implemented, changed not
403 implemented methods to throw NotImplementedException instead of
404 Exception, fixed names of fields that are serialized
406 2004-06-06 Jambunathan K <kjambunathan@novell.com>
408 * parser.cs: Fixed issues with Regex.Unescape() identified as part of
409 debugging bug #58256. The original problem reported was about
410 inconsistency between the way we treat replacement patterns and the
411 way Microsoft treats the replacement patterns in Regex.Replace(). MS
412 implementation is buggy and doesn't honour escape sequences in the
413 replacement patterns, even though the SDK claims otherwise.
416 2004-06-01 Gonzalo Paniagua Javier <gonzalo@ximian.com>
418 * syntax.cs: re-applied my patch from 2004-05-27 plus a fix which is
419 emitting a Category.All if both a category and its negated value are
422 2004-06-01 Gonzalo Paniagua Javier <gonzalo@ximian.com>
424 * syntax.cs: reverting my previous patch. It causes bigger problems.
426 2004-05-27 Gonzalo Paniagua Javier <gonzalo@ximian.com>
428 * category.cs: added LastValue field to mark the end of enum Category.
429 * syntax.cs: in CharacterClass, use Category.LastValue to get the size
430 of the array needed. Use a BitArray instead of bool[].
431 In AddCategory(), don't set the opposite category as false. Fixes
432 bug #59150. All tests pass.
434 2004-05-25 Jackson Harper <jackson@ximian.com>
436 * parser.cs: Allow creating a regular expression using {,n} as the
437 specified. The min bounds is set to -1, I am not completely sure
438 if that is what it is supposed to be but MS does not set it to 0
439 based on testing. Patch by dave-gnome-bugs@earth.li. Fixes bug #56761.
441 2004-05-12 Dick Porter <dick@ximian.com>
445 * RegexRunnerFactory.cs:
446 * RegexRunner.cs: More public API difference fixes.
448 * GroupCollection.cs:
449 * MatchCollection.cs:
450 * CaptureCollection.cs: Moved GroupCollection, MatchCollection and
451 CaptureCollection so that they no longer inherit from the
452 non-standard RegexCollectionBase class. Fixes the API difference.
454 2004-04-19 Gonzalo Paniagua Javier <gonzalo@ximian.com>
461 Patch by Eric Durand Tremblay.
462 1) Capture inner group when named.
463 2) Resolved parse error caused by not capturing inner group
464 3) Resolved incorrect capture group
465 4) Now, not capturing anything when unnamed (correct behavior)
468 2004-04-19 Gonzalo Paniagua Javier <gonzalo@ximian.com>
474 * syntax.cs: converted to unix line endings.
476 2004-03-30 Lluis Sanchez Gual <lluis@ximian.com>
478 * collections.cs: In the indexer, return an empty group if the requested
480 * match.cs: Added default constructor for Group.
482 2004-03-24 Gonzalo Paniagua Javier <gonzalo@ximian.com>
484 * parser.cs: fixed group numbering.
486 2004-03-22 Jackson Harper <jackson@ximian.com>
488 * parser.cs: Use the group number as the name in mapping. Patch by
490 * regex.cs: Fix off by one error. Patch by Gert Driesen.
492 2004-03-17 Francois Beauchemin <beauche@softhome.net>
493 * syntax.cs, interpreter.cs, quicksearch.cs, regex.cs, compiler.cs :
494 Revised support for RightToLeft.
495 quicksearch now has a reverse option.
496 This fixes bug #54537
498 * regex.cs, compiler.cs :
499 Some code to support CILCompiler.
501 Added some undocumented of MS.
503 2004-03-16 Gonzalo Paniagua Javier <gonzalo@ximian.com>
505 * parser.cs: allow a @"\0" escape sequence. Fixes bug #54797.
507 2004-02-01 Miguel de Icaza <miguel@ximian.com>
509 * syntax.cs, interval.cs: Applied patch from Marco Cravairo
510 through Francois Beauchemin who reviewed on the mailing list.
511 This fixes bug #45976
513 2004-01-16 Gonzalo Paniagua Javier <gonzalo@ximian.com>
515 * parser.cs: an opening brace without a
516 quantifier does not cause a parse error. Fixes bug #52924.
518 2004-01-07 Lluis Sanchez Gual <lluis@ximian.com>
520 * regex.cs: In Split(), if the last match is at the end of the string,
521 an empty string must be added to the array of results.
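
The Split rule above can be shown with a small Python sketch. The function split here is a hypothetical helper, not Regex.Split, and it assumes the pattern never matches the empty string.

```python
import re

def split(pattern, text):
    """Split text on non-empty matches of pattern, keeping a trailing empty piece."""
    parts = []
    last = 0
    for m in re.finditer(pattern, text):
        parts.append(text[last:m.start()])
        last = m.end()
    # When the last match ends the string, this appends the empty string.
    parts.append(text[last:])
    return parts
```

With input "a,b," and pattern ",", the result has three elements, the last of which is empty.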
523 2003-12-15 Sanjay Gupta <gsanjay@novell.com>
524 * match.cs: Check for null value before Substring method call.
527 2003-11-21 Juraj Skripsky <js@hotfeet.ch>
529 * quicksearch.cs: Create and use hashtable only for "long" search
532 (Search): Use simple scan for single-character search strings.
534 (GetChar): Simplify case sensitivity handling.
536 2003-11-27 Gonzalo Paniagua Javier <gonzalo@ximian.com>
538 * interpreter.cs: when evaluating a degenerate match, restore the
539 RepeatContext on failure. Fixes bug #42529.
541 2003-11-22 Jackson Harper <jackson@ximian.com>
543 * regex.cs: Add CultureInvariant flag to RegexOptions.
545 2003-11-20 Juraj Skripsky <js@hotfeet.ch>
547 * quicksearch.cs: Use a hashtable instead of an array for the
548 shift table to improve the memory usage.
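
To illustrate why a hashtable-backed shift table saves memory, here is a Horspool-style substring search in Python. It is a sketch of the general technique only; the function names are hypothetical and it does not reproduce the algorithm or code in quicksearch.cs. A dense array would need one slot per possible character, whereas the dictionary stores entries only for characters that occur in the search string.

```python
def build_shift_table(needle):
    """Map each needle character (except the last) to its distance from the end."""
    last = len(needle) - 1
    return {ch: last - i for i, ch in enumerate(needle[:-1])}

def search(haystack, needle):
    """Return the index of the first occurrence of needle, or -1."""
    n = len(needle)
    if n == 0:
        return 0
    shift = build_shift_table(needle)
    pos = 0
    while pos + n <= len(haystack):
        if haystack[pos:pos + n] == needle:
            return pos
        # Characters absent from the table allow a full-length skip.
        pos += shift.get(haystack[pos + n - 1], n)
    return -1
```

The default value passed to `shift.get` plays the role that a pre-filled array slot would otherwise play.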
550 2003-11-19 Gonzalo Paniagua Javier <gonzalo@ximian.com>
553 (Split): include capture groups in the results, if any. Fixes bug
556 2003-07-09 Gonzalo Paniagua Javier <gonzalo@ximian.com>
558 * regex.cs: patch from Eric Lindvall <eric@5stops.com> that fixes bug
561 2003-03-05 Miguel de Icaza <miguel@ximian.com>
563 * category.cs (CategoryUtils.CategoryFromName): Use StartsWith
564 ("Is") instead of a substring for (0,2) which was throwing an
565 exception causing Category.None to be returned
567 2003-01-17 Gonzalo Paniagua Javier <gonzalo@ximian.com>
569 * collections.cs: fixed bug #30091.
571 2002-12-20 Gonzalo Paniagua Javier <gonzalo@ximian.com>
573 * regex.cs: fixed little mistake (closes #35860).
575 2002-11-12 Jackson Harper <jackson@latitudegeo.com>
577 * arch.cs compiler.cs regex.cs: Added mapping attribute to MachineFactories
579 2002-11-06 Gonzalo Paniagua Javier <gonzalo@ximian.com>
581 * parser.cs: detect illegal \ at end of pattern. Fixes 31334.
583 2002-10-25 Gonzalo Paniagua Javier <gonzalo@ximian.com>
585 * parser.cs: applied fix from Tim Haynes (thaynes@openlinksw.com) to
586 solve bug #32807. Also modified GetMapping to return the same as MS.
588 2002-08-28 Juli Mallett <jmallett@FreeBSD.org>
590 * arch.cs, compiler.cs: Give the interpreter machine a property
591 for the retrieval of the group count.
593 * regex.cs: Use the new GroupCount property of the factory to
594 initialise the current group count, and restructure code to compile
595 the pattern only the first time it is needed (essentially backing
596 out the previous revision of regex.cs, to use the new code.)
598 2002-08-14 Cesar Octavio Lopez Nataren <cesar@ciencias.unam.mx>
600 * regex.cs: Added the ctor for ISerializable implementation and
601 implemented the GetObjectData function.
603 2002-07-30 Juli Mallett <jmallett@FreeBSD.org>
605 * regex.cs: Fixed bug where the expression would not be
606 re-evaluated for grouping purposes when factory caches were
607 used, resulting in no groups being recognised after one call
608 with a given pattern and no change in options.
610 2002-05-13 Dan Lewis <dihlewis@yahoo.co.uk>
612 * regex.cs: Fixed bug in split.
614 2002-05-08 Dan Lewis <dihlewis@yahoo.co.uk>
616 * interpreter.cs: Moved to an array-based stack representation
619 * match.cs, collections.cs: Decoupled capture representation from
620 interpreter internals.
622 * cache.cs: Changed Key type from struct to class for speed.
624 2002-04-06 Dan Lewis <dihlewis@yahoo.co.uk>
626 * cache.cs: Object methods should be overridden with "override".
628 2002-04-04 Dan Lewis <dihlewis@yahoo.co.uk>
630 * RegexRunner.cs, RegexRunnerFactory.cs: MS support classes. Stubs
631 added for completeness.
633 * regex.cs, match.cs, collections.cs: Serializable attribute.
635 2002-04-04 Dan Lewis <dihlewis@yahoo.co.uk>
637 * regex.cs: Added static Matches and IsMatch methods.
639 2002-04-03 Dan Lewis <dihlewis@yahoo.co.uk>
641 * ChangeLog: Added changelog.
643 * cache.cs: Fixed bug in MRUList.Evict.