1 2008-11-05 Atsushi Enomoto <atsushi@ximian.com>
3 * ucd.cs : Write type for *_count. Add notice to not edit
4 unicode-data.h directly.
6 2008-11-04 Atsushi Enomoto <atsushi@ximian.com>
8 * ucd.cs : new code to generate unicode table for eglib.
10 2008-07-04 Andreas Nahr <ClassDevelopment@A-SoftTech.com>
12 * SortKey: Fix parameter names, add attribute, small formatting
14 2008-06-27 Rodrigo Kumpera <rkumpera@novell.com>
16 * CodePointIndexer.cs : Make TableRange a struct instead
17 of a class so we save 2 memory ops per ToIndex loop.
19 2008-04-02 Atsushi Enomoto <atsushi@ximian.com>
21 * SortKey.cs : check null arguments. Fixed bug #376171.
23 2007-07-20 Atsushi Enomoto <atsushi@ximian.com>
25 * create-mscompat-collation-table.cs : I wonder how long its build
28 2007-03-06 Atsushi Enomoto <atsushi@ximian.com>
30 * SimpleCollator.cs : disable QuickCheckPossible(), which is
31 inaccurate and inefficient. Fixed bug #79714.
33 2007-02-15 Atsushi Enomoto <atsushi@ximian.com>
35 * SimpleCollator.cs : character filtering is needed for
36 OrdinalIgnoreCase in 2.0 profile. Fixed bug #80865.
38 2007-01-25 Atsushi Enomoto <atsushi@ximian.com>
40 * SimpleCollator.cs : GetTailContraction() failed to pick out the correct
41 contraction/special sortkey and thus LastIndexOf() failed when
42 it was involved. Fixed bug #80612.
44 2007-01-22 Atsushi Enomoto <atsushi@ximian.com>
46 * SimpleCollator.cs : for non-StringSort comparison, level5 (- and ')
47 should still be skipped after the initial level5 check is done (they
48 were simply treated as normal characters). Fixed bug #78748.
49 * SortKeyBuffer.cs : Fixed NRE in french sort.
51 2006-12-25 Atsushi Enomoto <atsushi@ximian.com>
53 * SimpleCollator.cs : added IndexOf() implementation for Ordinal
54 and OrdinalIgnoreCase, though Ordinal version is not used (since
55 it is slower than icall).
57 2006-05-30 Miguel de Icaza <miguel@novell.com>
59 * MSCompatUnicodeTable.cs: Remove the fixed loading and compute it
60 just when we actually consume it. This only fixes the
63 2006-04-14 Atsushi Enomoto <atsushi@ximian.com>
65 * README: removed obsolete info.
66 * Normalization.cs : canonical reordering should participate in the
67 decomposition step. In reordering, string append was incomplete.
68 Combining class check is required in NFD check. Icall is written
71 2005-12-07 Zoltan Varga <vargaz@gmail.com>
73 * SimpleCollator.cs: Fix a warning.
75 2005-11-30 Sebastien Pouliot <sebastien@ximian.com>
77 * SimpleCollator.cs: Fix CAS support. The static ctor/var try to get
78 the environment variable MUCH too soon (i.e. the security manager
81 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
83 * SimpleCollator.cs : direct fast-path optimization for IndexOf().
85 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
88 - CompareQuick(): added immediateBreakup to avoid extraneous sortkey
90 - QuickCheckPossible(): index used for s1 was incorrect.
92 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
94 * SimpleCollator.cs : added another quick check for CompareInternal()
95 that does almost ordinal comparison for quick-checkable strings.
96 (It affects Compare(), IndexOf(), IsSuffix() etc. as well.)
98 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
100 * MSCompatUnicodeTable.cs : (IsIgnorable) \0 is not ignorable.
103 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
105 * SimpleCollator.cs :
106 Created another struct to reduce method arguments. Created another
107 set of flags that keeps "once-matched" state (counterpart of
108 checkedFlags, now neverMatchFlags).
110 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
112 * SimpleCollator.cs :
113 - Added CompareOrdinalIgnoreCase() for NET_2_0 RTM.
114 - Reduced extra parameter from LastIndexOfSortKey().
115 - LastIndexOf() should use GetTailContraction for the source string.
116 And then, target could match in the middle of the possible
117 "replacement contraction" of the source string, so use
118 LastIndexOfSortKey() to catch them.
119 - Fixed GetTailContraction() that caused index out of range.
121 2005-11-11 Atsushi Enomoto <atsushi@ximian.com>
123 * Makefile : Now use MONO_DISABLE_MANAGED_COLLATION.
124 * SortKey.cs : some members are virtual.
126 2005-10-14 Atsushi Enomoto <atsushi@ximian.com>
128 * SimpleCollator.cs : modified to use stackalloc for byte array.
130 2005-09-27 Atsushi Enomoto <atsushi@ximian.com>
132 * SimpleCollator.cs : in CompareInternal(), there was a possibility of
133 infinite loop. Fixed bug #76243.
135 2005-09-20 Atsushi Enomoto <atsushi@ximian.com>
137 * SimpleCollator.cs : In IsPrefix/IsSuffix, if target is an empty string,
138 immediately return true.
140 2005-09-09 Atsushi Enomoto <atsushi@ximian.com>
142 * SimpleCollator.cs : IsSuffix() optimization logic was buggy, so just
143 use pretty simple way with LastIndexOf() (no significant perf.
146 2005-09-01 Atsushi Enomoto <atsushi@ximian.com>
148 * README, Collation-notes.txt, CollationDataStructures.txt :
149 removed obsolete info and added some notes.
151 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
153 * Normalization.cs : remove warned code.
154 * managed-collation.patch : now it's not required anymore.
156 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
158 * MSCompatUnicodeTable.cs : added IsSortable(string).
160 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
162 * SimpleCollator.cs : Now all collator methods are thread safe.
164 All instance non-readonly fields turned into arguments of every
165 method that uses those fields.
166 (Sadly it is the end of no-memory-cost collator era. mcs bootstrap
167 now needs +100KB memory consumption.)
169 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
171 * SimpleCollator.cs : made "checkedFlags" nullable and made it
172 an argument of every index method (to make it thread safe).
174 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
177 MSCompatUnicodeTable.cs :
178 - Now IsIgnorable() is aggregated to be one invocation to check
179 completely ignorable, nonspacing and symbols.
180 - Introduced "already checked" flags for IndexOf() and LastIndexOf()
181 to skip sortkey binary check on the same characters. Significant
182 perf. improvement for cases such as IndexOf("AABCBABC...Z",'Z').
184 2005-08-08 Gert Driesen <drieseng@users.sourceforge.net>
186 * SortKey.cs: Marked Serializable to match MS.NET.
188 2005-08-08 Atsushi Enomoto <atsushi@ximian.com>
190 * create-mscompat-collation-table.cs,
191 Makefile : changed resources output directory.
193 2005-08-04 Atsushi Enomoto <atsushi@ximian.com>
195 * create-normalization-tests.cs,
196 StringNormalizationTestSource.cs : new files for Unicode
197 Normalization test generator.
198 * Makefile : added support for above.
200 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
202 * NormalizationTableUtil.cs : oops, it does not compile.
203 * managed-collation.patch : I guess having managed resource would be
204 better for collation. At least current code has such #define so
205 Makefile should be in sync with it.
207 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
209 * create-normalization-source.cs : Fixed CharMapComparer which
210 incorrectly returned 0 when the second arg is shorter. Reduced
211 extraneous helperIndex map. Other minor fixes and code removal.
212 * Normalization.cs : several fixes to support blocked combine handling.
213 * NormalizationTableUtil.cs : tiny member renaming.
215 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
217 * create-normalization-source.cs,
218 NormalizationTableUtil.cs,
219 Normalization.cs : several bugfixes on index miscomputation.
220 Renamed using aliases (csc will bork). Primary combine safety is now
221 computed during UnicodeData.txt parse.
222 Maximum NFKD length was 18, not 4 (U+FDFA).
224 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
226 * managed-collation.patch : added Normalization support.
227 * managed-collation-icall.patch : added, including normalization stuff.
229 BTW when will collation code be checked in?
231 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
233 * create-normalization-source.cs : Unified three normalization source
234 generators, to compute IsUnsafe flag. Fixed helperIndex array type
236 * create-char-mapping-source.cs,
237 create-combining-class-source.cs : thus removed.
238 * Makefile : thus modified for the above integration.
239 * NormalizationTableUtil.cs : Extended to contain IsUnsafe flag.
240 * Normalization.cs : Several fixes to make Normalize() actually work.
242 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
244 * create-normalization-source.cs,
246 create-char-mapping-source.cs,
247 create-combining-class-source.cs,
248 Makefile : converted managed array to pointers (like collation stuff).
250 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
252 * NormalizationTableUtil.cs : further table range optimization.
253 * create-normalization-source.cs,
254 create-char-mapping-source.cs,
255 create-combining-class-source.cs : added C header output support.
257 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
259 * create-normalization-source.cs, Normalization.cs :
260 Now property size is < 256, so directly embed value in "props" array.
261 Add QuickCheck(c,checkType) and remove IsNFD/C/KD/KC and delegates.
263 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
265 * create-combining-class-source.cs,
266 create-char-mapping-source.cs,
267 create-normalization-source.cs,
268 NormalizationTableUtil.cs,
269 Normalization.cs : String.Normalize() does not handle surrogate
270 characters. Mapping information in DerivedNormalizationProps.txt
271 is not used in the code (that from UnicodeData.txt is used).
272 Hangul syllables are computed instead of embedded in the tables.
273 * managed-collation.patch : removed IntPtrStream and Makefile patches.
275 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
277 * MSCompatUnicodeTable.cs : IsSortable() was broken.
279 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
281 * MSCompatUnicodeTable.cs : added helper for CompareInfo.IsSortable().
283 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
285 * create-tailoring.cfg : added for convenience of contraction check.
287 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
289 * create-normalization-source.cs,
292 create-mscompat-collation-table.cs,
293 MSCompatUnicodeTableUtil.cs,
295 create-collation-element-table.cs,
296 MSCompatUnicodeTable.cs,
298 create-combining-class-source.cs : added copyright lines.
300 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
302 * MSCompatUnicodeTable.cs : removed extraneous definition.
304 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
306 * create-mscompat-collation-table.cs,
307 MSCompatUnicodeTable.cs : full C header support, finally.
309 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
312 NormalizationTableUtil.cs,
313 create-char-mapping-source.cs : more aggressive data compression.
314 It now ignores characters that are >= U+10000.
316 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
319 Normalization.template,
320 Normalization.cs : renamed existing file.
322 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
324 * NormalizationTableUtil.cs,
325 Normalization.template,
326 create-combining-class-source.cs : GetCombiningClass is now
327 implemented as indexer based array.
328 * Makefile : renamed output filename.
329 * create-mscompat-collation-table.cs : removed comments that does not
331 * create-tailoring.cs : use utf-8 output (and fixed filename).
333 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
335 * create-mscompat-collation-table.cs : hacked safer IPA extensions.
336 * Collation-notes.txt : status of sortkey table.
338 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
340 * create-mscompat-collation-table.cs : some Greek mapping fix.
342 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
344 * create-mscompat-collation-table.cs : diacritical weight was not
345 treated correctly when picked from letter names, as flags.
347 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
349 * create-mscompat-collation-table.cs : fixed culture-dependent
350 nonspacing mark weight.
352 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
354 * create-mscompat-collation-table.cs : some Hebrew case letter fixes.
355 Some diacritical fixes on symbols.
357 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
359 * create-mscompat-collation-table.cs : Fixed level 3 weight of
360 Arabic presentation forms.
362 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
364 * create-mscompat-collation-table.cs : Fixed some diacritical weight
365 of Arabic presentation forms.
367 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
369 * SimpleCollator.cs : more status updates. It's almost complete,
370 except for sortkey values.
372 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
374 * SimpleCollator.cs : similar optimization also for LastIndexOf().
376 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
378 * SimpleCollator.cs : the previous patch was missing IgnoreNonSpace
381 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
383 * SimpleCollator.cs : reduced extra sortkey value computation in
384 MatchesForward(). It makes IndexOf() roughly 30% faster.
386 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
388 * SortKey.cs : GetHashCode() returns a value based on its byte data.
391 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
393 * SimpleCollator.cs : consider extractions in invariant culture.
395 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
397 * SimpleCollator.cs : (unsafeFlags) be compact ;-)
399 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
401 * SimpleCollator.cs : When the tail of the target does not match more
402 than 3 times, then IsSuffix() will never be true (3 is the max
403 length of an expansion; \uFB03 -> ffi). It brings significant
404 performance boost when "source" string is very long.
405 * MSCompatUnicodeTable.cs : added MaxExpansionLength constant.
406 Reordered code lines.
408 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
410 * Collation-notes.txt : updated implementation status.
412 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
414 * SimpleCollator.cs : Implemented quick codepoint comparison in
415 Compare(). Comparison became 125x faster.
416 * mono-tailoring-source.txt : added tiny comment.
418 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
420 * mono-tailoring-source.txt : Added all single sortkey remapping to
421 all cultures (still need to fill contractions and annotate possible
422 buggy mapping referencing to CLDR).
423 * SimpleCollator.cs : removed unused code.
424 * MSCompatUnicodeTable.cs : tiny cast removal.
426 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
429 create-mscompat-collation-table.cs
430 MSCompatUnicodeTableUtil.cs
431 MSCompatUnicodeTable.cs : Now CJK mapping data is stored as byte
432 arrays. Thus SimpleCollator does not need to use bitwise and shift
433 operations to get sortkey value and they could be managed resources.
435 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
437 * create-mscompat-collation-table.cs,
438 MSCompatUnicodeTable.cs,
439 MSCompatUnicodeTableUtil.cs : From the result of sortkey comparison
440 between None and IgnoreWidth, width compat table could be computed
441 in a somewhat simple way. So removed that table and all related code.
442 Increased the collation resource version.
444 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
446 * create-mscompat-collation-table.cs : Added C header output support.
448 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
450 * create-mscompat-collation-table.cs : FillLetterNFKD() could also be
451 applied to Cyrillic letters. Saved some of them.
453 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
455 * MSCompatUnicodeTable.cs : oh, ok, so we already have
456 GetManifestResourceInternal() ;-)
457 * managed-collation.patch : in Assembly.cs made that method internal.
459 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
461 * MSCompatUnicodeTable.cs : the pointer based icall code could be
462 also applicable for USE_MANAGED_RESOURCE mode.
464 2005-07-23 Atsushi Enomoto <atsushi@ximian.com>
466 * MSCompatUnicodeTable.cs : added icall support code (not enabled
467 unless the first line is commented out).
469 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
471 * create-mscompat-collation-table.cs,
472 MSCompatUnicodeTableUtil.cs,
473 MSCompatUnicodeTable.cs : Added resource version output (and ignore
474 in case of version mismatch). Removed obsolete, commented out code.
476 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
479 MSCompatUnicodeTable.cs,
480 create-mscompat-collation-table.cs : Now they use unmanaged pointers
481 instead of managed arrays.
482 * managed-collation.patch : Now it contains patch for IntPtrStream.cs
483 and Assembly.cs as well.
485 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
487 * MSCompatUnicodeTable.cs,
488 SimpleCollator.cs : Moved tailoring support classes to
489 MSCompatUnicodeTable.cs and drawn out from SimpleCollator.
490 Now that cjk and tailoring support are filled inside
491 MSCompatUnicodeTable, no managed array is exposed.
493 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
495 * create-mscompat-collation-table.cs,
497 MSCompatUnicodeTable.cs : Now it's not exposing collation table
498 internals as managed arrays (to switch to unmanaged pointers).
500 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
502 * create-mscompat-collation-table.cs : tiny nonspacing mark fix.
504 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
506 * create-mscompat-collation-table.cs : Fixed most of Greek mappings.
507 * MSCompatUnicodeTable.cs : don't lock string.
509 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
511 * create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.
513 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
515 * create-mscompat-collation-table.cs : More Latin diacritical fixes.
517 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
519 * create-mscompat-collation-table.cs : There were still missing
520 math symbol mappings. Added several hacky diacritical weight for
523 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
525 * create-mscompat-collation-table.cs : fixed a few diacritical weight
526 on Cyrillic characters. Fixed ParseTailoringSource() to handle
527 non-heading escape sequence (\uXXXX) as expected.
529 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
531 * create-mscompat-collation-table.cs,
532 MSCompatUnicodeTableUtil.cs,
533 MSCompatUnicodeTable.cs : added more aggressive index limits for
534 table optimization of data size, at the cost of speed.
536 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
538 * create-mscompat-collation-table.cs : fixed Arabic tertiary weight.
540 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
542 * create-mscompat-collation-table.cs : Mapping for hyphens and
543 punctuation are kinda finished. Rewrote batch mapping method to
544 collect all NFKD. Required modification on mapping is done.
546 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
548 * create-mscompat-collation-table.cs : minor mapping fixes on accent
549 marks and punctuations.
551 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
553 * create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
554 and Box drawing mapping.
556 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
558 * create-mscompat-collation-table.cs : Fixed almost all numbers.
560 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
562 * create-mscompat-collation-table.cs : Symbol mappings are almost done.
563 Removed hack that gave dummy mappings to blank symbols.
565 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
567 * create-mscompat-collation-table.cs : more fix on arrows. Fix on box
568 drawings. Some code refactoring to eliminate hack.
570 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
572 * create-mscompat-collation-table.cs : Fixed some secondary weight
573 in Devanagari and arrows.
575 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
577 * create-mscompat-collation-table.cs : a set of tiny mapping fixes.
579 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
581 * create-mscompat-collation-table.cs : some diacritical fixes for
582 Latin. Added batch mapping method that considers computed
583 diacritical weight (for numbers).
585 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
587 * managed-collation.patch : forgot to add System.String patch.
589 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
591 * MSCompatUnicodeTable.cs : added resource existence check (required
592 for mscorlib transient time from the one without resources to the
595 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
597 * create-mscompat-collation-table.cs : fixed punctuations and hyphen
598 (shift) primary weight.
600 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
602 * create-mscompat-collation-table.cs : more nonspacing mark fixes.
603 Some non-basic Cyrillic diacritical weight fixes.
605 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
607 * create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
608 and level 3. Tiny Hangul weight fixes.
609 * MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
611 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
613 * create-mscompat-collation-table.cs : some normal characters that have
614 "narrow" NFKD mapping are regarded as "wide" and thus level 3 weight
615 values were different. Handle U+30FB as category A.
616 * MSCompatUnicodeTable.cs : U+30FB does not have special weight.
618 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
620 * create-mscompat-collation-table.cs : more diacritical weight fixes.
621 Removed some unused code.
623 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
625 * create-mscompat-collation-table.cs : Fixed some Thai and Arabic
628 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
630 * create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
632 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
634 * create-mscompat-collation-table.cs : Fixed nonspacing marks in
635 Malayalam, Thai and Lao. Removed extraneous hack.
637 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
639 * SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
640 Some refactoring on IndexOf() code. Removed unused Matches().
641 * Collation-notes.txt : some methods needed to be reimplemented, so
642 rewrote the description.
644 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
646 * SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
647 Thus supported extenders in IsSuffix().
649 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
651 * SimpleCollator.cs : more IsSuffix() simplification, but it will be
652 stopped here since it cannot handle extenders (implementing new
655 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
657 * SimpleCollator.cs : simplified IsSuffix() code.
659 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
661 * SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
662 entire replacement string if char target was an expansion.
663 IsSuffix() was using a method for IsPrefix() which was incorrect.
664 Removed old IsPrefix() code.
666 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
668 * SimpleCollator.cs : IndexOf() was incorrectly sharing the same
669 byte[] field in different areas of code. Now extenders in both
670 source and target really work in IndexOf().
672 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
674 * create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
675 * SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
677 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
679 * SimpleCollator.cs : Now FilterExtender() handles all extender
680 support. IndexOf() and LastIndexOf() now support extenders.
681 IndexOf() and LastIndexOf() did not proceed contraction source
682 length as expected. Tiny refactoring on private IsPrefix() to take
685 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
687 * SimpleCollator.cs : when restoring from expansion, go back to the
688 top of the loop (to avoid index out of range).
689 Now IsPrefix() is implemented to reuse Compare() and thus it now
690 supports extender as well.
691 * Collation-notes.txt : status update. Deleted optimization part in
692 status section (it is duplicate).
694 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
696 * SimpleCollator.cs : some code reordering.
697 * create-mscompat-collation-table.cs : it was still missing U+3094.
699 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
701 * SimpleCollator.cs : Compare() now supports extender (e.g. U+39FC).
703 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
705 * SimpleCollator.cs : In GetSortKey(), don't update previousChar when
706 it is not primary (e.g. don't "extend" diacritical mark).
708 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
710 * managed-collation.patch : CompareInfo.Compare() should consider
711 the possibility that a non-empty string might actually be empty
712 in culture-sensitive context.
714 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
716 * SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
717 target is "empty" (in culture-sensitive context).
719 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
721 * SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
722 characters in target string.
724 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
726 * SimpleCollator.cs : When IgnoreWidth is specified, all Kana
727 characters are regarded as half-width.
728 Even though IgnoreWidth is specified, it should not ignore case.
729 For special weight comparison, the default values (E4) are bigger
730 than non-default values.
731 * SortKeyBuffer.cs : It should save LCID and original string.
732 * create-mscompat-collation-table.cs : For Japanese half-width kana,
733 it should not be counted in widthCompat map since IgnoreWidth does
734 not really ignore those differences.
736 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
738 * create-mscompat-collation-table.cs : Fixed missing Japanese bits.
740 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
742 * create-mscompat-collation-table.cs :
743 tiny diacritical weight fix for U+20D0-U+20E1.
745 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
747 * create-mscompat-collation-table.cs : ja CJK ideograph got completed.
749 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
751 * create-mscompat-collation-table.cs : Fixed CJK custom Japanese
752 mapping. It (maybe as well as other CJK tables) mixes NFKD. For
753 Japanese, modified NFKD table (because of Windows lame design).
755 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
757 * Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
758 * MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
759 invoked at any time it is required.
760 * SimpleCollator.cs : call FillCJK() above in .ctor().
761 * MSCompatUnicodeTableUtil.cs : CJK range was wider.
762 * create-mscompat-collation-table.cs : CJK binary was missing the
763 length. CJK remapping is being moved to ModifyUnidata().
764 For cjk-ja mapping, we have to consider compat characters to be
765 added to the map, besides the raw UCA table.
767 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
769 * SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
771 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
773 * SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
774 contraction as expected. Fixed Compare() to save s2's contraction
776 * TestDriver.cs : added LastIndexOf() tester w/ indexes.
778 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
780 * managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
781 incorrectly use Compare().
782 * TestDriver.cs : more moved to nunit tests.
784 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
786 * SimpleCollator.cs : several fixes on Compare().
787 - Ignorable characters are skipped at the top of the loop.
788 - IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
789 - In the case where the s1 index is increased while an s2 contraction is
790 replaced, s1 was advanced inconsistently (bug).
791 - IsIgnorable() now also checks IgnoreNonSpace.
792 - Fixed FilterOptions(), which did not work for IgnoreWidth at all.
793 * TestDriver.cs : now some are moved to nunit tests.
794 * Collation-notes.txt : minor todo update.
796 2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
798 * SimpleCollator.cs : Compare() was ignoring such case that both
799 entire strings have '-' to be compared.
800 * Collation-notes.txt : more status updates.
801 * TestDriver.cs : added '-' use cases.
803 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
805 * SimpleCollator.cs : to be same as other buggy part, it now handles
806 U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
808 Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
809 * create-mscompat-collation-table.cs : dummy values for extenders.
811 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
813 * SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
814 should be computed from ExtenderType, and voice mark weight should
816 * MSCompatUnicodeTable.cs : added tiny comment.
818 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
820 * SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
821 * SimpleCollator.cs : support for extender (U+309D etc.).
823 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
825 * create-mscompat-collation-table.cs : some punct/symbols fix.
826 * managed-collation.patch : new (and temporary) file to support
827 managed collation in mscorlib.
828 * README : described how to use managed collation.
830 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
832 * create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
833 U+482-4C8 (though needs diacritical fixes).
834 * MSCompatUnicodeTable.cs : tiny comment for alternative impl.
836 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
838 * create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
839 computation code, since it looks like it works the same way as for Latin
840 letters. Thus removed all other approaches (UCA, by letter name).
842 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
844 * create-mscompat-collation-table.cs : diacritical fix for "double-
845 struck". Syriac nonspacing fixes.
847 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
849 * create-mscompat-collation-table.cs : more math symbol weight fixes.
851 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
853 * create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
855 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
857 * create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
858 implemented (no stub). Some other fixes on category 8-A.
860 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
862 * create-mscompat-collation-table.cs : some minor fixes on Arabic,
863 Korean and Japanese sortkey weights.
865 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
867 * create-mscompat-collation-table.cs : More diacritical fixes.
868 Georgian characters do not have level 2 weights but level 3.
870 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
872 * create-mscompat-collation-table.cs : Roman numeral characters
873 have diacritical weight. quick hack for control signs (U+2400..)
876 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
878 * create-mscompat-collation-table.cs : improving Latin mappings.
879 Setting non-ASCII Latin characters' primary weight between those
880 ASCII characters, and setting diacritical weight (hacky).
881 * MSCompatUnicodeTable.cs :
882 Kanatype check: fixed (voice marks) and improved (comparison order).
884 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
886 * create-mscompat-collation-table.cs : more diacritical fixes.
887 primary weight fixes on punctuations in category 07.
889 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
891 * create-mscompat-collation-table.cs : several diacritical fixes.
892 * TestDriver.cs : sortkey dumper should use StringSort.
894 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
896 * SimpleCollator.cs : fixed incorrect indexer setup. Optimized
897 GetContraction() call a bit.
899 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
901 * create-mscompat-collation-table.cs : fixed incorrect level 2
903 * MSCompatUnicodeTable.cs : remove debug line.
905 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
907 * MSCompatUnicodeTableUtil.cs,
908 MSCompatUnicodeTable.cs,
910 create-mscompat-collation-table.cs : made some members internal and
911 accessible from other classes. Many indexes could be 0 by default.
912 * SimpleCollator.cs : optimizations. avoid method call.
914 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
916 * Collation-notes.txt : more updates.
917 * SimpleCollator.cs : Added quick check for Ordinal comparison.
918 Fixed special weight comparison. It cannot be customized in the
919 implementation (and that won't be harmful).
920 * mono-tailoring-source.txt : thus updated comment.
922 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
924 * SimpleCollator.cs : Compare() was missing French sort support.
925 * TestDriver.cs : added example case.
927 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
929 * Collation-notes.txt : updated status. Eliminated descriptions on
930 "iterator" (I avoided it for performance reasons). Fixed misc.
931 incorrect descriptions.
933 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
935 * Collator.cs : Now that SimpleCollator became feature complete, it is
938 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
940 * SimpleCollator.cs : implemented decent Compare() that immediately
941 stops at first primary difference.
943 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
945 * SimpleCollator.cs : indexers might return -1.
947 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
949 * SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
950 buggy (length check for source was missing).
952 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
954 * create-mscompat-collation-table.cs : Fixed tailoring table output
955 to be in correct and countable order. Now if tailoring alias was not
956 found, just stop the build.
957 * MSCompatUnicodeTable.cs : several build fixes. Now it works to read
959 * mono-tailoring-source.txt : commented out CJK aliases that miss
961 * Makefile : needed further filename fixes.
963 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
965 * MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
966 (now it is working as a standalone file).
967 * Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
968 (the generator now creates both binary resources and C# source).
970 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
972 * create-mscompat-collation-table.cs : Now it generates binary
973 resources (to parent directory).
974 * MSCompatUnicodeTable.template : added conditional code that fills
975 collation tables from manifest resources.
976 * Makefile : remove collation table binaries as well on "make clean".
977 Removed extraneous dependency.
979 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
981 * MSCompatUnicodeTable.template,
982 SimpleCollator.cs : removed extraneous GetExpansion().
984 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
986 * SimpleCollator.cs : IsSuffix() also supports contractions.
987 * TestDriver.cs : IsSuffix() example contraction cases.
989 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
991 * SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
992 what current IsPrefix() does). For expansion of target, IsPrefix()
993 should check the no-match case that expansion is longer than input.
994 Some refactoring of IsPrefix().
995 Added GetContractionTal() for IsSuffix() (not used yet).
997 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
999 * TestDriver.cs : added IsPrefix() expansion cases.
1000 * SimpleCollator.cs : IsPrefix() now supports contractions (with much
1001 complexity), and it now returns bool again.
1002 IndexOf() for replacement should make use of IndexOfPrimitiveChar()
1003 since expansions won't be expanded recursively.
1005 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1007 * SimpleCollator.cs : commonized character comparison in IsPrefix()
1008 and IsSuffix(). csc compile fix.
1009 * CompareInfoImpl.cs : deleted.
1011 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1013 * TestDriver.cs : added SimpleCollator.ctor() sanity check.
1014 Added replacement contraction example.
1015 * SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
1016 contraction in source string. Extracted matching code to Matches().
1017 Replacement contraction was including extraneous '\x0'.
1019 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1021 * Collation-notes.txt : updated status.
1022 * CollationDataStructures.txt : tiny fixes.
1023 * SimpleCollator.cs :
1024 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
1025 namespace Util and csc borked).
1026 GetContraction was incorrectly returning first item.
1027 Private IsPrefix() now returns int (but it might not be in real use).
1028 Extracted simple char comparison to CompareCharSimple().
1029 IndexOf() and LastIndexOf() now fully handle contractions (both
1030 binary key and string replacement) in "target" (for "s" not yet).
1031 * TestDriver.cs : be more verbose.
1032 * mono-tailoring-source.txt : added comment.
1033 * MSCompatUnicodeTable.template :
1034 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
1036 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1038 * create-mscompat-collation-table.cs : compute COMBINING blah marks as
1039 well as those characters WITH blah.
1040 * TestDriver.cs : added combining sortkey cases.
1042 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1044 * mono-tailoring-source.txt : fixed description on '*' in sortkeys.
1045 * SimpleCollator.cs : Now it fully uses tailoring info. Fixed
1046 contraction search that worked only when string is contraction.
1047 Removed commented code. Minor refactoring.
1048 * TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
1050 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1052 * create-mscompat-collation-table.cs,
1053 * mono-tailoring-source.txt : removed extraneous level 4 sortkey
1054 which cannot be supported.
1055 * SimpleCollator.cs : added GetContraction() and used in some places.
1056 Now CompareOptions is set only once. Reordered some code (e.g.
1057 ignorable check -> get compat char -> compare).
1059 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1061 * SimpleCollator.cs : sort tailoring tables before actual usage.
1062 Support diacritical remappings (it is a customized collation rule
1063 that does not exist in UCA).
1065 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1067 * SimpleCollator.cs : build culture specific tailoring table from
1068 TailoringInfo and unified data array.
1069 * create-mscompat-collation-table.cs : Added null termination to
1070 sortkey map tailorings (mostly to save my eyes).
1071 * MSCompatUnicodeTable.template : added public TailoringValues.
1073 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1075 * SortKeyBuffer.cs : handle special weight (category 06) characters.
1076 * Collation-notes.txt : Updated description on special weight (it was
1078 * TestDriver.cs : added special weight cases.
1080 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1082 * MSCompatUnicodeTable.template : added GetTailoringInfo().
1083 * SimpleCollator.cs : Now tailoring information is acquired and used.
1084 (FrenchSort is supported but Compare() won't work as expected since
1085 the table is still incomplete for those diacritical marks).
1086 * SortKeyBuffer.cs : On reversing diacritical weights, it should
1087 ignore zeros. Reset() should reset frenchSorted flag.
1089 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1091 * create-mscompat-collation-table.cs : Further fixes on Jamo,
1092 diacritical weights by character name, and *Numbers primary weights.
1094 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1096 * create-mscompat-collation-table.cs : More fix on Devanagari,
1097 Gujarati, Oriya, Tamil and Lao sortkeys.
1099 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1101 * create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
1104 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1106 * create-mscompat-collation-table.cs : Fixed Thai character primary
1107 and secondary values. Fixed Thaana letters. Added more LAMESPEC
1108 CJK compat. Fixed some circled CJK secondary weight.
1109 Hacked some nonspacing mark sortkey value adjustment.
1111 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1113 * create-mscompat-collation-table.cs : CP932.TXT was not parsed as
1114 expected. JIS ordering was incorrect. The offset for OtherNumbers
1115 that represent values of 10 or more was computed incorrectly. Some
1116 Hangul compat characters have a different offset.
1118 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1120 * create-mscompat-collation-table.cs : Fixed 0x8 category characters.
1121 Added hack for need-to-be-fixed characters to fall into 0xA category.
1122 * create-collation-element-table.cs : previous checkin seems to have failed :(
1123 * README: updated a bit.
1125 2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
1127 * CodePointIndexer.cs :
1128 removed extraneous switch (I could use empty array for that need).
1129 * CollationElementTableUtil.cs : primary weight type became ushort.
1130 * create-collation-element-table.cs : several bugfixes.
1131 collElem should be int. It was skipping most of entries because of
1132 incorrect string tokenization.
1134 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1136 * create-mscompat-collation-table.cs : handle some Jamo NFKD.
1138 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1140 * SimpleCollator.cs : forgot to commit in the last checkin.
1141 * create-mscompat-collation-table.cs : fixed arabic shift weight chars.
1142 * TestDriver.cs : switch table dumper and collator testing.
1143 * SortKey.cs : for now comment out internal indexes (not in use).
1145 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1147 * MSCompatUnicodeTable.template,
1148 SimpleCollator.cs : support for culture dependent CJK table.
1150 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1152 * create-mscompat-collation-table.cs,
1153 MSCompatUnicodeTableUtil.cs : make CJK table more compact.
1155 2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
1157 * SimpleCollator.cs : Fixed stupid index search when start != 0.
1159 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1161 * SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
1162 now starts from "start" and proceeds backward by "length".
1163 * TestDriver.cs : fix warning.
1165 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1167 * TestDriver.cs : more tests.
1168 * SimpleCollator.cs : LastIndexOf() was not setting search length
1169 on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
1171 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1173 * create-normalization-source.cs : output propValue as uint.
1175 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1177 * SortKey.cs : Now it is System.Globalization.SortKey.
1178 To replace existing implementation, it now requires lcid and
1179 CompareOptions. Added required members.
1180 * SortKeyBuffer.cs : thus .ctor() requires LCID.
1181 * SimpleCollator.cs : made required changes above.
1183 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1185 * CodePointIndexer.cs : added CompressArray(). Now it requires two more
1186 parameters for default index and codepoint.
1187 * CollationElementTableUtil.cs,
1188 NormalizationTableUtil.cs : required changes wrt above change.
1189 * MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
1190 * MSCompatUnicodeTable.template : Now it uses codepoint indexer.
1191 * create-mscompat-collation-table.cs : Now it outputs compressed array.
1192 * Makefile : now collation requires MSCompatUnicodeTableUtil.cs
1194 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1196 * SimpleCollator.cs :
1197 Implemented IsSuffix() and LastIndexOf().
1198 Several fixes on index > 0 cases.
1199 * TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
1201 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1203 * Collation-notes.txt : updated (status, impl. classes).
1204 * MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
1206 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1208 * SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
1209 and IsPrefix(). Tiny code refactoring.
1210 * TestDriver.cs : sample IsPrefix() and IndexOf() usage.
1211 * MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
1213 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1215 * SimpleCollator.cs :
1216 IndexOf(string, char, CompareOptions) implementation.
1217 * TestDriver.cs : sample IndexOf() usage.
1219 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1221 * create-mscompat-collation-table.cs : was missing most important
1222 kind of blocks - equivalent expansions (e.g. invariant mappings).
1223 More readable mappings.
1225 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1227 * mono-tailoring-source.txt : new file. It describes tailoring
1228 information. Basically examined under .NET 1.x.
1229 * create-mscompat-collation-table.cs : consume the file above.
1230 * MSCompatUnicodeTable.template : now tailorings is not a stub.
1231 * CollationDataStructures.txt : minor fixes.
1233 SimpleCollator.cs : added FrenchSort support.
1234 * Collation-notes.txt : added description on Latin primary weights.
1235 * ldml-limited.rng : added note.
1236 * create-tailorings.cs : added note. more serialization (but won't be
1239 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1241 * SortKeyBuffer.cs : non-primary character is added to previous
1243 * TestDriver.cs : added example case of above.
1245 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1247 * SimpleCollator.cs : IgnoreSymbols support.
1248 * TestDriver.cs : compilation fix. IgnoreSymbols example.
1249 * create-mscompat-collation-table.cs : more Hangul fixes.
1251 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1253 * create-mscompat-collation-table.cs : more Hangul fixes.
1254 * SortKey.cs : it will replace sys.globalization.SortKey. It has
1255 some internal members.
1256 * SortKeyBuffer.cs : now it uses SortKey instead of byte[].
1257 * SimpleCollator.cs : CompareOptions support. However I don't think
1258 it will be developed anymore since SortKey never enables IndexOf().
1259 * TestDriver.cs : a few CompareOptions cases.
1261 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1263 * SimpleCollator.cs : simple collator implementation that just
1264 uses GetSortKey() as the basis for everything.
1265 * TestDriver.cs : sample code that uses this collator set.
1266 * MSCompatUnicodeTable.template : removed test driver from here.
1268 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1270 * create-mscompat-collation-table.cs : Hangul fixes.
1271 Now fewer than 300 characters lack sortkey weights.
1272 * MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
1274 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1276 * create-mscompat-collation-table.cs : Added control picture mappings.
1277 Minor primary weight fixes.
1279 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1281 * create-mscompat-collation-table.cs : Added mappings for box
1282 drawings and blocks.
1284 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1286 * create-mscompat-collation-table.cs : Added mappings for arrows.
1288 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1290 * create-mscompat-collation-table.cs : added support for letterlike
1291 characters and squared CJK compatibility characters, ordered by
1292 character names (0x0E category).
1293 * Collation-notes.txt : added description on that.
1295 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1297 * MSCompatUnicodeTable.template : Now expansions are simulated.
1298 * create-mscompat-collation-table.cs : filled Korean number level2.
1299 Reordered some code blocks to fill correct diacritical differences.
1300 * Collation-notes.txt : some corrections and minor additions.
1302 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1304 * MSCompatUnicodeTable.template :
1305 Now dumper test driver uses SortKeyBuffer for dogfooding.
1306 * create-mscompat-collation-table.cs : some diacritical level fixes
1307 (with non-working extra latin check).
1308 * SortKeyBuffer.cs : several fixes to get working as a practical code.
1309 * Collator.cs : make it compilable, leaving things as NotImplemented.
1311 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1313 * create-mscompat-collation-table.cs : some fixes on primary category
1314 07 (miscellaneous symbols and punctuations).
1316 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1318 * create-mscompat-collation-table.cs : more mapping fix on numbers,
1319 letters, variable weight characters, circled Japanese and CJK.
1320 * MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
1321 inclusive. Simplified dumper code.
1323 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1325 * create-mscompat-collation-table.cs : finished Hangul (both Jamo
1326 and Syllables). sortkey dumper diff lines dropped from 30000 to 8000.
1328 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1330 * create-mscompat-collation-table.cs : added some nonspacing marks in
1331 either correct or hacky way.
1333 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1335 * create-mscompat-collation-table.cs : several improvements. Japanese
1336 Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
1337 numeric characters, diacritically decorated latin alphabets. Fixed
1338 some diacritical weights detection.
1339 * MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
1340 marks' primary weight as empty.
1341 * Collation-notes.txt : some updates.
1343 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1345 * create-mscompat-collation-table.cs : don't process nonexact NFKD
1346 mapping as equivalent, however store CJK extensions into NFKD map
1347 even if one does not strictly match.
1348 Now am going to fill Hangul into tables (unlike UCA it does not look
1349 possible to calculate sortkey value).
1350 Fixed Cyrillic and Georgian UCA based orderings.
1351 * MSCompatUnicodeTable.template : added CJK extension sortkey
1354 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1356 * create-mscompat-collation-table.cs : Fixed latin alphabet support.
1357 Added latin with diacritical and CJK extension.
1358 * MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
1360 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1362 * create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
1363 now not used though). Filled CJK ideograph, still not perfect.
1364 Fixed number primary keys. NFKD numbers and CJK ideographs are now
1365 considered, including brackets elimination.
1366 * Makefile : now it downloads DerivedAge.txt.
1367 * MSCompatUnicodeTable.template : added dummy code dumper. It computes
1368 PrivateUse, Surrogate and Hangul Syllables.
1369 * Collation-notes.txt : Noted that Hangul Syllables need more love.
1371 2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
1373 * create-tailorings.cs : added configuration support. sort them.
1374 I wonder if it is really usable. Having own format might be better.
1375 * create-mscompat-collation-table.cs : fixing some sortkey numbers,
1376 making closer to windows. Now it handles NFKD in some places.
1377 * MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
1378 * CollationDataStructures.txt : added description on tailoring
1379 fields, though they are subject to change.
1381 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1383 * create-tailorings.cs, ldml-limited.rng : new files.
1384 * LdmlReader.cs : removed old file.
1386 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1388 * SortKeyBuffer.cs : split from Collator.cs. Now it considers
1389 practical use, reflecting updated sortkey constant design.
1390 Especially level 4 weight is split to 4 arrays that are merged in
1391 the last stage of GetSortKey().
1392 * Collator.cs : thus SortKeyBuffer is removed from here.
1393 Additionally, removed some extraneous bits in other classes.
1394 * Collation-notes.txt : Some editorial fixes. Added information on
1395 Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
1396 be stored in simple byte arrays).
1397 * CodePointIndexer.cs,
1398 create-collation-element-table.cs,
1399 CollationElementTable.template,
1400 NormalizationTableUtil.cs : short CodePointIndexer method names.
1401 * create-mscompat-collation-table.cs : Additional info on why some
1402 meaningful characters are ignored in Windows (Unicode version
1403 difference). Removed U+070F from special check (was extraneous).
1405 2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
1407 * MSCompatUnicodeTable.template:
1408 Moved body implementation to table creator and put those bool
1409 results into an array.
1410 * create-mscompat-collation-table.cs :
1411 So imported those methods. Modified array output to emit "0x"
1412 only for more than 9.
1413 * create-normalization-source.cs : ditto on "0x" output matter.
1414 * CollationDataStructures.txt : so now it holds ignorableFlags.
1416 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1418 * Collation-notes.txt, CollationDataStructures.txt :
1419 separate document for data structure design.
1421 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1423 * create-mscompat-collation-table.cs : added culture-dependent CJK
1424 table creation. It uses CLDR as its basis. (Culture independent CJK
1426 * Makefile : added CLDR archive downloading support.
1427 * MSCompatUnicodeTable.template : tiny renamings.
1428 * Collation-notes.txt : additional CJK info.
1430 2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
1432 * Collation-notes.txt, create-mscompat-collation-table.cs :
1433 added secondary weight support for BlahNumber characters.
1435 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1437 * downloaded : added directory. All downloaded files are stored here.
1438 * Makefile : use "downloaded" directory.
1439 Added more auto-download stuff.
1440 * create-mscompat-collation-table.cs :
1441 Added Japanese square kana support.
1443 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1445 * Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
1446 * create-mscompat-collation-table.cs : added support for Arabic abjad,
1447 Estrangela and Thaana.
1448 * MSCompatUnicodeTable.template : removed BOM.
1450 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1452 * Collation-notes.txt : wrong comment cleanup and spelling fixes.
1453 * create-mscompat-collation-table.cs : added diacritic support for
1454 Latin letters (as long as covered in primary weight).
1456 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1458 * Makefile : minor fixes. Added warning lines to generated sources.
1460 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1462 * create-char-mapping-source.cs :
1463 Removed ToWidthInsensitive() generation.
1465 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1467 * create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
1468 ToWidthInsensitive() is implemented here, using an array (which is
1469 to be optimized using CodePointIndexer).
1470 * MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
1471 * MSCompatUnicodeTable.template : now it is used to generate
1472 MSCompatUnicodeTable.cs which got ready to be used.
1473 * Makefile : added MSCompatUnicodeTable.cs build support. Now it
1474 supports "make normalization" and "make collation".
1476 2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
1478 * Collation-notes.txt : Description of ICU was very incorrect. Now it
1479 has become more rational and sane.
1480 * create-mscompat-collation-table.cs : fixed some indexes.
1481 * Makefile : added "mstablegen" target.
1482 * MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
1484 2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
1486 * Collation-notes.txt : more analysis on "letters".
1487 * create-mscompat-collation-table.cs : more proof of concepts.
1489 2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
1491 * Collation-notes.txt : more info. Started letter sortkey analysis
1492 (some of the other stuff is really incomprehensible right now.)
1493 * create-mscompat-collation-table.cs : table generator proof-of-
1494 concept source (not compilable).
1495 * MSCompatUnicodeTable.cs : moved some code to the new source.
1498 2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
1500 * Collation-notes.txt : started level 2 weight analysis.
1502 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1504 * Collation-notes.txt : Additional information on how to create
1506 * MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
1508 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1510 * Collation-notes.txt : More case weight (level 3) analysis. I'm
1511 likely to just write table generator.
1513 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1515 * MSCompatUnicodeTable.cs : part of level 4 weight implementation.
1517 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1519 * Collation-notes.txt :
1521 Revised comparison methods; backward iteration is possible.
1522 More on char-by-char comparison.
1523 Level 4 comparison is actually a bit more complex.
1525 * Collator.cs : some conceptual updates wrt above.
1527 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1529 * Collation-notes.txt : Japanese voice mark is level 2, and Hangul
1530 properties are level 3.
1532 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1534 * Collation-notes.txt : Make it more readable. More analysis on
1535 level 3 and 4 sortkey structures.
1536 * Collator.cs : some compilation fixes (not compilable yet).
1538 2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
1540 * Collation-notes.txt : Analysis on variable-weighting (level 5)
1542 * Collator.cs : updated corresponding part of level 5, and more.
1544 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1546 * Collation-notes.txt : more updates.
1547 * Collator.cs : rewrote from scratch. Some rough sketch for sortkey
1548 buffer, character iterator and collator methods. Not compiling.
1550 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1552 * Collator.cs : Am going to replace it with new one. No need for
1553 CompareOptions-dependent Comparer.
1555 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1557 * Collation-notes.txt : There seems to be a bit more complexity.
1559 2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
1561 * Collation-notes.txt : more updates, being close to write sortkey
1564 2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
1566 * CompareInfoImpl.cs, Collator.cs : conceptual update
1567 * Collation-notes.txt : some corrections and additions.
1568 * Makefile : added LDML input (but it won't be used at all).
1570 2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
1572 * Collation-notes.txt : more updates.
1574 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1576 * Collation-notes.txt : more updates.
1578 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1580 * Collation-notes.txt : some updates.
1581 * create-char-mapping-source.cs : superscripts and subscripts are also
1582 ignored in IgnoreWidth comparison.
1583 * Makefile : tiny touch fix.
1585 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1587 * CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
1589 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1591 * create-char-mapping-source.cs : Now it generates
1592 ToWidthInsensitive() from decomposition tags <wide> and <narrow>.
1593 * MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
1594 ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
1596 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1598 * README, LdmlReader.cs, DataStructures.txt : new files.
1600 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1602 * CodePointIndexer.cs,
1603 Collation-notes.txt,
1604 CollationElementTable.template,
1605 CollationElementTableUtil.cs,
1606 create-char-mapping-source.cs,
1607 create-collation-element-table.cs,
1608 create-combining-class-source.cs,
1609 create-normalization-source.cs,
1611 MSCompatUnicodeTable.cs,
1612 Normalization.template,
1613 NormalizationTableUtil.cs : initial checkin (to private branch).