1 2006-12-25 Atsushi Enomoto <atsushi@ximian.com>
3 * SimpleCollator.cs : added IndexOf() implementation for Ordinal
4 and OrdinalIgnoreCase, though Ordinal version is not used (since
5 it is slower than icall).
7 2006-05-30 Miguel de Icaza <miguel@novell.com>
9 * MSCompatUnicodeTable.cs: Remove the fixed loading and compute it
10 just when we actually consume it. This only fixes the
13 2006-04-14 Atsushi Enomoto <atsushi@ximian.com>
15 * README: removed obsolete info.
16 * Normalization.cs : canonical reordering should participate in the
17 decomposition step. In reordering, string append was incomplete.
18 Combining class check is required in NFD check. Icall is written
21 2005-12-07 Zoltan Varga <vargaz@gmail.com>
23 * SimpleCollator.cs: Fix a warning.
25 2005-11-30 Sebastien Pouliot <sebastien@ximian.com>
27 * SimpleCollator.cs: Fix CAS support. The static ctor/var try to get
28 the environment variable MUCH too soon (i.e. the security manager
31 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
33 * SimpleCollator.cs : direct fast-path optimization for IndexOf().
35 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
38 - CompareQuick(): added immediateBreakup to avoid extraneous sortkey
40 - QuickCheckPossible(): index used for s1 was incorrect.
42 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
44 * SimpleCollator.cs : added another quick check for CompareInternal()
45 that does almost ordinal comparison for quick-checkable strings.
46 (It affects Compare(), IndexOf(), IsSuffix() etc. as well.)
48 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
50 * MSCompatUnicodeTable.cs : (IsIgnorable) \0 is not ignorable.
53 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
56 Created another struct to reduce method arguments. Created another
57 set of flags that keeps "once-matched" state (counterpart of
58 checkedFlags, now neverMatchFlags).
60 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
63 - Added CompareOrdinalIgnoreCase() for NET_2_0 RTM.
64 - Reduced extra parameter from LastIndexOfSortKey().
65 - LastIndexOf() should use GetTailContraction for the source string.
66 And then, target could match in the middle of the possible
67 "replacement contraction" of the source string, so use
68 LastIndexOfSortKey() to catch them.
69 - Fixed GetTailContraction() that caused index out of range.
71 2005-11-11 Atsushi Enomoto <atsushi@ximian.com>
73 * Makefile : Now use MONO_DISABLE_MANAGED_COLLATION.
74 * SortKey.cs : some members are virtual.
76 2005-10-14 Atsushi Enomoto <atsushi@ximian.com>
78 * SimpleCollator.cs : modified to use stackalloc for byte array.
80 2005-09-27 Atsushi Enomoto <atsushi@ximian.com>
82 * SimpleCollator.cs : in CompareInternal(), there was a possibility of
83 infinite loop. Fixed bug #76243.
85 2005-09-20 Atsushi Enomoto <atsushi@ximian.com>
87 * SimpleCollator.cs : In IsPrefix/IsSuffix, if target is an empty string,
88 immediately return true.
90 2005-09-09 Atsushi Enomoto <atsushi@ximian.com>
92 * SimpleCollator.cs : IsSuffix() optimization logic was buggy, so just
93 use pretty simple way with LastIndexOf() (no significant perf.
96 2005-09-01 Atsushi Enomoto <atsushi@ximian.com>
98 * README, Collation-notes.txt, CollationDataStructures.txt :
99 removed obsolete info and added some notes.
101 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
103 * Normalization.cs : remove warned code.
104 * managed-collation.patch : now it's not required anymore.
106 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
108 * MSCompatUnicodeTable.cs : added IsSortable(string).
110 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
112 * SimpleCollator.cs : Now all collator methods are thread safe.
114 All non-readonly instance fields were turned into arguments of every
115 method that uses those fields.
116 (Sadly it is the end of no-memory-cost collator era. mcs bootstrap
117 now needs +100KB memory consumption.)
119 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
121 * SimpleCollator.cs : made "checkedFlags" nullable and made it
122 an argument of every index method (to make it thread safe).
124 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
127 MSCompatUnicodeTable.cs :
128 - Now IsIgnorable() is aggregated to be one invocation to check
129 completely ignorable, nonspacing and symbols.
130 - Introduced "already checked" flags for IndexOf() and LastIndexOf()
131 to skip sortkey binary check on the same characters. Significant
132 perf. improvement for such case as IndexOf("AABCBABC...Z",'Z').
134 2005-08-08 Gert Driesen <drieseng@users.sourceforge.net>
136 * SortKey.cs: Marked Serializable to match MS.NET.
138 2005-08-08 Atsushi Enomoto <atsushi@ximian.com>
140 * create-mscompat-collation-table.cs,
141 Makefile : changed resources output directory.
143 2005-08-04 Atsushi Enomoto <atsushi@ximian.com>
145 * create-normalization-tests.cs,
146 StringNormalizationTestSource.cs : new files for Unicode
147 Normalization test generator.
148 * Makefile : added support for above.
150 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
152 * NormalizationTableUtil.cs : oops, it does not compile.
153 * managed-collation.patch : I guess having managed resource would be
154 better for collation. At least current code has such #define so
155 Makefile should be in sync with it.
157 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
159 * create-normalization-source.cs : Fixed CharMapComparer which
160 incorrectly returned 0 when the second arg is shorter. Reduced
161 extraneous helperIndex map. Other minor fixes and code removal.
162 * Normalization.cs : several fixes to support blocked combine handling.
163 * NormalizationTableUtil.cs : tiny member renaming.
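A minimal sketch of the CharMapComparer pitfall noted above, written in Python rather than the C# of the actual generator. The function name and the list-of-code-points representation are assumptions for illustration only; the point is that a three-way comparer must fall back to a length comparison once the common prefix is equal, instead of returning 0 when the second argument is shorter.

```python
from functools import cmp_to_key

def compare_char_maps(a, b):
    """Three-way compare of two code point sequences (hypothetical helper)."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    # The common prefix is equal, so the shorter sequence sorts first.
    # Returning 0 here whenever b is shorter would wrongly report equality.
    return (len(a) > len(b)) - (len(a) < len(b))

maps = [[0x41, 0x300], [0x41], [0x41, 0x301]]
sorted_maps = sorted(maps, key=cmp_to_key(compare_char_maps))
```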
165 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
167 * create-normalization-source.cs,
168 NormalizationTableUtil.cs,
169 Normalization.cs : several bugfixes on index miscomputation.
170 Renamed using aliases (csc will bork). Primary combine safety is now
171 computed during UnicodeData.txt parse.
172 Maximum NFKD length was 18, not 4 (U+FDFA).
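The NFKD length noted above can be cross-checked outside the generator. This is only an illustrative check using Python's standard unicodedata module, not part of the Mono sources; it assumes the Unicode data shipped with the Python runtime agrees with the UnicodeData.txt used here.

```python
import unicodedata

# U+FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM has the longest
# compatibility decomposition among the characters discussed in this entry.
decomposed = unicodedata.normalize("NFKD", "\uFDFA")
max_nfkd_length = len(decomposed)
print(max_nfkd_length)  # 18
```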
174 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
176 * managed-collation.patch : added Normalization support.
177 * managed-collation-icall.patch : added, including normalization stuff.
179 BTW when will collation code be checked in?
181 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
183 * create-normalization-source.cs : Unified three normalization source
184 generators, to compute IsUnsafe flag. Fixed helperIndex array type
186 * create-char-mapping-source.cs,
187 create-combining-class-source.cs : thus removed.
188 * Makefile : thus modified for the above integration.
189 * NormalizationTableUtil.cs : Extended to contain IsUnsafe flag.
190 * Normalization.cs : Several fixes to make Normalize() actually work.
192 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
194 * create-normalization-source.cs,
196 create-char-mapping-source.cs,
197 create-combining-class-source.cs,
198 Makefile : converted managed array to pointers (like collation stuff).
200 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
202 * NormalizationTableUtil.cs : further table range optimization.
203 * create-normalization-source.cs,
204 create-char-mapping-source.cs,
205 create-combining-class-source.cs : added C header output support.
207 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
209 * create-normalization-source.cs, Normalization.cs :
210 Now property size is < 256, so directly embed value in "props" array.
211 Add QuickCheck(c,checkType) and remove IsNFD/C/KD/KC and delegates.
213 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
215 * create-combining-class-source.cs,
216 create-char-mapping-source.cs,
217 create-normalization-source.cs,
218 NormalizationTableUtil.cs,
219 Normalization.cs : String.Normalize() does not handle surrogate
220 characters. Mapping information in DerivedNormalizationProps.txt
221 is not used in the code (that from UnicodeData.txt is used).
222 Hangul syllables are computed instead of embedded in the tables.
223 * managed-collation.patch : removed IntPtrStream and Makefile patches.
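For reference, computing Hangul syllable decompositions arithmetically (instead of storing them in tables) can be sketched as follows. This is the standard arithmetic from the Unicode Standard, written in Python for illustration; the function and constant names are assumptions and do not mirror identifiers in Normalization.cs.

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11172 precomposed syllables in total

def decompose_hangul(ch):
    """Return the canonical jamo decomposition of one Hangul syllable."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch  # not a precomposed Hangul syllable
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    result = chr(lead) + chr(vowel)
    if trail != T_BASE:  # T_BASE itself means "no trailing consonant"
        result += chr(trail)
    return result

# Cross-check one syllable against the runtime's own NFD implementation.
same = decompose_hangul("\uD55C") == unicodedata.normalize("NFD", "\uD55C")
```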
225 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
227 * MSCompatUnicodeTable.cs : IsSortable() was broken.
229 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
231 * MSCompatUnicodeTable.cs : added helper for CompareInfo.IsSortable().
233 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
235 * create-tailoring.cfg : added for convenience of contraction check.
237 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
239 * create-normalization-source.cs,
242 create-mscompat-collation-table.cs,
243 MSCompatUnicodeTableUtil.cs,
245 create-collation-element-table.cs,
246 MSCompatUnicodeTable.cs,
248 create-combining-class-source.cs : added copyright lines.
250 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
252 MSCompatUnicodeTable.cs : removed extraneous definition.
254 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
256 * create-mscompat-collation-table.cs
257 MSCompatUnicodeTable.cs : full C header support, finally.
259 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
262 NormalizationTableUtil.cs,
263 create-char-mapping-source.cs : more aggressive data compression.
264 It now ignores characters that are >= U+10000.
266 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
269 Normalization.template,
270 Normalization.cs : renamed existing file.
272 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
274 * NormalizationTableUtil.cs,
275 Normalization.template,
276 create-combining-class-source.cs : GetCombiningClass is now
277 implemented as indexer based array.
278 * Makefile : renamed output filename.
279 * create-mscompat-collation-table.cs : removed comments that does not
281 * create-tailoring.cs : use utf-8 output (and fixed filename).
283 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
285 * create-mscompat-collation-table.cs : hacked safer IPA extensions.
286 * Collation-notes.txt : status of sortkey table.
288 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
290 * create-mscompat-collation-table.cs : some Greek mapping fix.
292 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
294 * create-mscompat-collation-table.cs : diacritical weights were not
295 treated correctly when they are picked from letter names, as flags.
297 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
299 * create-mscompat-collation-table.cs : fixed culture-dependent
300 nonspacing mark weight.
302 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
304 * create-mscompat-collation-table.cs : some Hebrew case letter fixes.
305 Some diacritical fixes on symbols.
307 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
309 * create-mscompat-collation-table.cs : Fixed level 3 weight of
310 Arabic presentation forms.
312 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
314 * create-mscompat-collation-table.cs : Fixed some diacritical weight
315 of Arabic presentation forms.
317 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
319 * SimpleCollator.cs : more status updates. It's almost complete,
320 except for sortkey values.
322 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
324 * SimpleCollator.cs : similar optimization also for LastIndexOf().
326 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
328 * SimpleCollator.cs : the previous patch was missing IgnoreNonSpace
331 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
333 * SimpleCollator.cs : reduced extra sortkey value computation in
334 MatchesForward(). It makes IndexOf() roughly 30% faster.
336 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
338 * SortKey.cs : GetHashCode() returns a value based on its byte data.
341 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
343 * SimpleCollator.cs : consider extractions in invariant culture.
345 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
347 * SimpleCollator.cs : (unsafeFlags) be compact ;-)
349 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
351 * SimpleCollator.cs : When the tail of the target does not match more
352 than 3 times, then IsSuffix() will never be true (3 is the max
353 length of an expansion; \uFB03 -> ffi). It brings significant
354 performance boost when "source" string is very long.
355 * MSCompatUnicodeTable.cs : added MaxExpansionLength constant.
356 Reordered code lines.
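The expansion example cited above can be reproduced with compatibility decomposition as a stand-in for the collation expansion table. This is an illustrative Python check only; treating NFKD as equivalent to the collator's expansion is an assumption that happens to hold for this ligature, and the constant below simply restates the value given in the entry.

```python
import unicodedata

MAX_EXPANSION_LENGTH = 3  # value stated in the entry above

# U+FB03 LATIN SMALL LIGATURE FFI expands to three plain letters.
expansion = unicodedata.normalize("NFKD", "\uFB03")
fits_bound = len(expansion) <= MAX_EXPANSION_LENGTH
```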
358 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
360 * Collation-notes.txt : updated implementation status.
362 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
364 * SimpleCollator.cs : Implemented quick codepoint comparison in
365 Compare(). Comparison became 125x faster.
366 * mono-tailoring-source.txt : added tiny comment.
368 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
370 * mono-tailoring-source.txt : Added all single sortkey remapping to
371 all cultures (still need to fill contractions and annotate possible
372 buggy mapping referencing to CLDR).
373 * SimpleCollator.cs : removed unused code.
374 * MSCompatUnicodeTable.cs : tiny cast removal.
376 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
379 create-mscompat-collation-table.cs
380 MSCompatUnicodeTableUtil.cs
381 MSCompatUnicodeTable.cs : Now CJK mapping data is stored as byte
382 arrays. Thus SimpleCollator does not need to use bitwise and shift
383 operations to get sortkey value and they could be managed resources.
385 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
387 * create-mscompat-collation-table.cs,
388 MSCompatUnicodeTable.cs,
389 MSCompatUnicodeTableUtil.cs : From the result of sortkey comparison
390 between None and IgnoreWidth, width compat table could be computed
391 in somewhat simple way. So removed that table and all related code.
392 Increased the collation resource version.
394 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
396 * create-mscompat-collation-table.cs : Added C header output support.
398 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
400 * create-mscompat-collation-table.cs : FillLetterNFKD() could also be
401 applied to Cyrillic letters. Saved some of them.
403 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
405 * MSCompatUnicodeTable.cs : oh, ok, so we already have
406 GetManifestResourceInternal() ;-)
407 * managed-collation.patch : in Assembly.cs made that method internal.
409 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
411 * MSCompatUnicodeTable.cs : the pointer based icall code could be
412 also applicable for USE_MANAGED_RESOURCE mode.
414 2005-07-23 Atsushi Enomoto <atsushi@ximian.com>
416 * MSCompatUnicodeTable.cs : added icall support code (not enabled
417 unless the first line is commented out).
419 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
421 * create-mscompat-collation-table.cs,
422 MSCompatUnicodeTableUtil.cs,
423 MSCompatUnicodeTable.cs : Added resource version output (and ignore
424 in case of version mismatch). Removed obsolete, commented out code.
426 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
429 MSCompatUnicodeTable.cs,
430 create-mscompat-collation-table.cs : Now they use unmanaged pointers
431 instead of managed arrays.
432 * managed-collation.patch : Now it contains patch for IntPtrStream.cs
433 and Assembly.cs as well.
435 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
437 * MSCompatUnicodeTable.cs,
438 SimpleCollator.cs : Moved tailoring support classes to
439 MSCompatUnicodeTable.cs and drawn out from SimpleCollator.
440 Now that cjk and tailoring support are filled inside
441 MSCompatUnicodeTable, no managed array is exposed.
443 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
445 * create-mscompat-collation-table.cs,
447 MSCompatUnicodeTable.cs : Now it's not exposing collation table
448 internals as managed arrays (to switch to unmanaged pointers).
450 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
452 * create-mscompat-collation-table.cs : tiny nonspacing mark fix.
454 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
456 * create-mscompat-collation-table.cs : Fixed most of Greek mappings.
457 * MSCompatUnicodeTable.cs : don't lock string.
459 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
461 * create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.
463 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
465 * create-mscompat-collation-table.cs : More Latin diacritical fixes.
467 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
469 * create-mscompat-collation-table.cs : There were still missing
470 math symbol mappings. Added several hacky diacritical weight for
473 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
475 * create-mscompat-collation-table.cs : fixed a few diacritical weight
476 on Cyrillic characters. Fixed ParseTailoringSource() to handle
477 non-heading escape sequence (\uXXXX) as expected.
479 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
481 * create-mscompat-collation-table.cs,
482 MSCompatUnicodeTableUtil.cs,
483 MSCompatUnicodeTable.cs : added more aggressive index limits for
484 table optimization for data size, at the cost of speed.
486 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
488 * create-mscompat-collation-table.cs : fixed Arabic tertiary weight.
490 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
492 * create-mscompat-collation-table.cs : Mapping for hyphens and
493 punctuation are kinda finished. Rewrote batch mapping method to
494 collect all NFKD. Required modification on mapping is done.
496 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
498 * create-mscompat-collation-table.cs : minor mapping fixes on accent
499 marks and punctuations.
501 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
503 * create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
504 and Box drawing mapping.
506 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
508 * create-mscompat-collation-table.cs : Fixed almost all numbers.
510 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
512 * create-mscompat-collation-table.cs : Symbol mappings are almost done.
513 Removed hack that gave dummy mappings to blank symbols.
515 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
517 * create-mscompat-collation-table.cs : more fix on arrows. Fix on box
518 drawings. Some code refactoring to eliminate hack.
520 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
522 * create-mscompat-collation-table.cs : Fixed some secondary weight
523 in Devanagari and arrows.
525 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
527 * create-mscompat-collation-table.cs : a set of tiny mapping fixes.
529 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
531 * create-mscompat-collation-table.cs : some diacritical fixes for
532 Latin. Added batch mapping method that considers computed
533 diacritical weight (for numbers).
535 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
537 * managed-collation.patch : forgot to add System.String patch.
539 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
541 * MSCompatUnicodeTable.cs : added resource existence check (required
542 for mscorlib transient time from the one without resources to the
545 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
547 * create-mscompat-collation-table.cs : fixed punctuations and hyphen
548 (shift) primary weight.
550 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
552 * create-mscompat-collation-table.cs : more nonspacing mark fixes.
553 Some non-basic Cyrillic diacritical weight fixes.
555 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
557 * create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
558 and level 3. Tiny Hangul weight fixes.
559 * MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
561 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
563 * create-mscompat-collation-table.cs : some normal characters that have
564 "narrow" NFKD mapping are regarded as "wide" and thus level 3 weight
565 values were different. Handle U+30FB as category A.
566 * MSCompatUnicodeTable.cs : U+30FB does not have special weight.
568 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
570 * create-mscompat-collation-table.cs : more diacritical weight fixes.
571 Removed some unused code.
573 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
575 * create-mscompat-collation-table.cs : Fixed some Thai and Arabic
578 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
580 * create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
582 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
584 * create-mscompat-collation-table.cs : Fixed nonspacing marks in
585 Malayalam, Thai and Lao. Removed extraneous hack.
587 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
589 * SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
590 Some refactoring on IndexOf() code. Removed unused Matches().
591 * Collation-notes.txt : some methods needed to be reimplemented, so
592 rewrote the description.
594 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
596 * SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
597 Thus supported extenders in IsSuffix().
599 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
601 * SimpleCollator.cs : more IsSuffix() simplification, but it will be
602 stopped here since it cannot handle extenders (implementing new
605 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
607 * SimpleCollator.cs : simplified IsSuffix() code.
609 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
611 * SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
612 entire replacement string if char target was an expansion.
613 IsSuffix() was using a method for IsPrefix() which was incorrect.
614 Removed old IsPrefix() code.
616 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
618 * SimpleCollator.cs : IndexOf() was incorrectly sharing the same
619 byte[] field in different areas of code. Now extenders in both
620 source and target really work in IndexOf().
622 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
624 * create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
625 * SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
627 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
629 * SimpleCollator.cs : Now FilterExtender() handles all extender
630 support. IndexOf() and LastIndexOf() now support extenders.
631 IndexOf() and LastIndexOf() did not advance by the contraction source
632 length as expected. Tiny refactoring on private IsPrefix() to take
635 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
637 * SimpleCollator.cs : when restoring from expansion, go back to the
638 top of the loop (to avoid index out of range).
639 Now IsPrefix() is implemented to reuse Compare() and thus it now
640 supports extender as well.
641 * Collation-notes.txt : status update. Deleted optimization part in
642 status section (it is duplicate).
644 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
646 * SimpleCollator.cs : some code reordering.
647 * create-mscompat-collation-table.cs : it was still missing U+3094.
649 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
651 * SimpleCollator.cs : Compare() now supports extender (e.g. U+30FC).
653 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
655 * SimpleCollator.cs : In GetSortKey(), don't update previousChar when
656 it is not primary (e.g. don't "extend" diacritical mark).
658 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
660 * managed-collation.patch : CompareInfo.Compare() should consider
661 the possibilities that non-empty string might be actually empty
662 in culture-sensitive context.
664 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
666 * SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
667 target is "empty" (in culture-sensitive context).
669 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
671 * SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
672 characters in target string.
674 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
676 * SimpleCollator.cs : When IgnoreWidth is specified, all Kana
677 characters are regarded as half-width.
678 Even though IgnoreWidth is specified, it should not ignore case.
679 For special weight comparison, the default values (E4) are bigger
680 than non-default values.
681 * SortKeyBuffer.cs : It should save LCID and original string.
682 * create-mscompat-collation-table.cs : For Japanese half-width kana,
683 it should not be counted in widthCompat map since IgnoreWidth does
684 not really ignore those differences.
686 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
688 * create-mscompat-collation-table.cs : Fixed missing Japanese bits.
690 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
692 * create-mscompat-collation-table.cs :
693 tiny diacritical weight fix for U+20D0-U+20E1.
695 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
697 * create-mscompat-collation-table.cs : ja CJK ideograph got completed.
699 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
701 * create-mscompat-collation-table.cs : Fixed CJK custom Japanese
702 mapping. It (maybe as well as other CJK tables) mixes NFKD. For
703 Japanese, modified NFKD table (because of Windows lame design).
705 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
707 * Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
708 * MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
709 invoked at any time it is required.
710 * SimpleCollator.cs : call FillCJK() above in .ctor().
711 * MSCompatUnicodeTableUtil.cs : CJK range was wider.
712 * create-mscompat-collation-table.cs : CJK binary was missing the
713 length. CJK remapping is being moved to ModifyUnidata().
714 For cjk-ja mapping, we have to consider compat characters to be
715 added to the map, besides the raw UCA table.
717 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
719 * SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
721 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
723 * SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
724 contraction as expected. Fixed Compare() to save s2's contraction
726 * TestDriver.cs : added LastIndexOf() tester w/ indexes.
728 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
730 * managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
731 incorrectly use Compare().
732 * TestDriver.cs : more moved to nunit tests.
734 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
736 * SimpleCollator.cs : several fixes on Compare().
737 - Ignorable characters are skipped at the top of the loop.
738 - IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
739 - When the s1 index was increased while an s2 contraction was
740 replaced, s1 was advanced inconsistently (bug).
741 - IsIgnorable() now also checks IgnoreNonSpace.
742 - Fixed FilterOptions() that does not work for IgnoreWidth at all.
743 * TestDriver.cs : now some are moved to nunit tests.
744 * Collation-notes.txt : minor todo update.
746 2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
748 * SimpleCollator.cs : Compare() was ignoring such case that both
749 entire strings have '-' to be compared.
750 * Collation-notes.txt : more status updates.
751 * TestDriver.cs : added '-' use cases.
753 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
755 * SimpleCollator.cs : to be same as other buggy part, it now handles
756 U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
758 Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
759 * create-mscompat-collation-table.cs : dummy values for extenders.
761 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
763 * SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
764 should be computed from ExtenderType, and voice mark weight should
766 * MSCompatUnicodeTable.cs : added tiny comment.
768 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
770 * SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
771 * SimpleCollator.cs : support for extender (U+309D etc.).
773 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
775 * create-mscompat-collation-table.cs : some punct/symbols fix.
776 * managed-collation.patch : new (and temporary) file to support
777 managed collation in mscorlib.
778 * README : described how to use managed collation.
780 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
782 * create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
783 U+482-4C8 (though needs diacritical fixes).
784 * MSCompatUnicodeTable.cs : tiny comment for alternative impl.
786 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
788 * create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
789 computation code, since it appears to work the same way as for Latin
790 letters. Thus removed all other approaches (UCA, by letter name).
792 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
794 * create-mscompat-collation-table.cs : diacritical fix for "double-
795 struck". Syriac nonspacing fixes.
797 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
799 * create-mscompat-collation-table.cs : more math symbol weight fixes.
801 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
803 * create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
805 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
807 * create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
808 implemented (no stub). Some other fixes on category 8-A.
810 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
812 * create-mscompat-collation-table.cs : some minor fixes on Arabic,
813 Korean and Japanese sortkey weights.
815 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
817 * create-mscompat-collation-table.cs : More diacritical fixes.
818 Georgian characters do not have level 2 weights but level 3.
820 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
822 * create-mscompat-collation-table.cs : Roman numeral characters
823 have diacritical weight. quick hack for control signs (U+2400..)
826 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
828 * create-mscompat-collation-table.cs : improving Latin mappings.
829 Setting non-ASCII Latin characters' primary weight between those
830 ASCII characters, and setting diacritical weight (hacky).
831 * MSCompatUnicodeTable.cs :
832 Kanatype check: fixed (voice marks) and improved (comparison order).
834 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
836 * create-mscompat-collation-table.cs : more diacritical fixes.
837 primary weight fixes on punctuations in category 07.
839 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
841 * create-mscompat-collation-table.cs : several diacritical fixes.
842 * TestDriver.cs : sortkey dumper should use StringSort.
844 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
846 * SimpleCollator.cs : fixed incorrect indexer setup. Optimized
847 GetContraction() call a bit.
849 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
851 * create-mscompat-collation-table.cs : fixed incorrect level 2
853 * MSCompatUnicodeTable.cs : remove debug line.
855 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
857 * MSCompatUnicodeTableUtil.cs,
858 MSCompatUnicodeTable.cs,
860 create-mscompat-collation-table.cs : made some members internal and
861 accessible from other classes. Many indexes could be 0 by default.
862 * SimpleCollator.cs : optimizations. avoid method call.
864 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
866 * Collation-notes.txt : more updates.
867 * SimpleCollator.cs : Added quick check for Ordinal comparison.
868 Fixed special weight comparison. It cannot be customizable in the
869 implementation (and it won't be harmful).
870 * mono-tailoring-source.txt : thus updated comment.
872 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
874 * SimpleCollator.cs : Compare() was missing French sort support.
875 * TestDriver.cs : added example case.
877 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
879 * Collation-notes.txt : updated status. Eliminated descriptions on
880 "iterator" (I avoided it for performance concern). Fixed misc.
881 incorrect descriptions.
883 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
885 * Collator.cs : Now that SimpleCollator became feature complete, it is
888 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
890 * SimpleCollator.cs : implemented decent Compare() that immediately
891 stops at first primary difference.
893 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
895 * SimpleCollator.cs : indexers might return -1.
897 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
899 * SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
900 buggy (length check for source was missing).
902 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
904 * create-mscompat-collation-table.cs : Fixed tailoring table output
905 to be in correct and countable order. Now if tailoring alias was not
906 found, just stop the build.
907 * MSCompatUnicodeTable.cs : several build fixes. Now it works to read
909 * mono-tailoring-source.txt : commented out CJK aliases that miss
911 * Makefile : needed further filename fixes.
913 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
915 * MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
916 (now it is working as a standalone file).
917 * Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
918 (the generator now creates both binary resources and C# source).
920 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
922 * create-mscompat-collation-table.cs : Now it generates binary
923 resources (to parent directory).
924 * MSCompatUnicodeTable.template : added conditional code that fills
925 collation tables from manifest resources.
926 * Makefile : remove collation table binaries as well on "make clean".
927 Removed extraneous dependency.
929 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
931 * MSCompatUnicodeTable.template,
932 SimpleCollator.cs : removed extraneous GetExpansion().
934 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
936 * SimpleCollator.cs : IsSuffix() also supports contractions.
937 * TestDriver.cs : IsSuffix() example contraction cases.
939 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
941 * SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
942 what current IsPrefix() does). For expansion of target, IsPrefix()
943 should check the no-match case that expansion is longer than input.
944 Some refactoring of IsPrefix().
945 Added GetContractionTal() for IsSuffix() (not used yet).
947 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
949 * TestDriver.cs : added IsPrefix() expansion cases.
950 * SimpleCollator.cs : IsPrefix() now supports contractions (with much
951 complexity), and it now returns bool again.
952 IndexOf() for replacement should make use of IndexOfPrimitiveChar()
953 since expansions won't be expanded recursively.
955 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
957 * SimpleCollator.cs : commonized character comparison in IsPrefix()
958 and IsSuffix(). csc compile fix.
959 * CompareInfoImpl.cs : deleted.
961 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
963 * TestDriver.cs : added SimpleCollator.ctor() sanity check.
964 Added replacement contraction example.
965 * SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
966 contraction in source string. Extracted matching code to Matches().
967 Replacement contraction was including extraneous '\x0'.
969 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
971 * Collation-notes.txt : updated status.
972 * CollationDataStructures.txt : tiny fixes.
973 * SimpleCollator.cs :
974 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
975 namespace Util and csc borked).
976 GetContraction was incorrectly returning first item.
977 Private IsPrefix() now returns int (but it might not be in real use).
978 Extracted simple char comparison to CompareCharSimple().
979 IndexOf() and LastIndexOf() now fully handle contractions (both
980 binary key and string replacement) in "target" (for "s" not yet).
981 * TestDriver.cs : be more verbose.
982 * mono-tailoring-source.txt : added comment.
983 * MSCompatUnicodeTable.template :
984 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
986 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
988 * create-mscompat-collation-table.cs : compute COMBINING blah marks as
989 well as those characters WITH blah.
990 * TestDriver.cs : added combining sortkey cases.
992 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
994 * mono-tailoring-source.txt : fixed description on '*' in sortkeys.
995 * SimpleCollator.cs : Now it fully uses tailoring info. Fixed
996 contraction search that worked only when string is contraction.
997 Removed commented code. Minor refactoring.
998 * TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
1000 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1002 * create-mscompat-collation-table.cs,
1003 * mono-tailoring-source.txt : removed extraneous level 4 sortkey
1004 which cannot be supported.
1005 * SimpleCollator.cs : added GetContraction() and used in some places.
1006 Now CompareOptions is set only once. Reordered some code (e.g.
1007 ignorable check -> get compat char -> compare).
1009 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1011 * SimpleCollator.cs : sort tailoring tables before actual usage.
1012 Support diacritical remappings (it is customized collation rule
1013 which does not exist in UCA).
1015 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1017 * SimpleCollator.cs : build culture specific tailoring table from
1018 TailoringInfo and unified data array.
1019 * create-mscompat-collation-table.cs : Added null termination to
1020 sortkey map tailorings (mostly to save my eyes).
1021 * MSCompatUnicodeTable.template : added public TailoringValues.
1023 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1025 * SortKeyBuffer.cs : handle special weight (category 06) characters.
1026 * Collation-notes.txt : Updated description on special weight (it was
1028 * TestDriver.cs : added special weight cases.
1030 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1032 * MSCompatUnicodeTable.template : added GetTailoringInfo().
1033 * SimpleCollator.cs : Now tailoring information is acquired and used.
1034 (FrenchSort is supported but Compare() won't work as expected since
1035 the table is still incomplete for those diacritical marks).
1036 * SortKeyBuffer.cs : On reversing diacritical weights, it should
1037 ignore zeros. Reset() should reset frenchSorted flag.
1039 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1041 * create-mscompat-collation-table.cs : Further fixes on Jamo,
1042 diacritical weights by character name, and *Numbers primary weights.
1044 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1046 * create-mscompat-collation-table.cs : More fix on Devanagari,
1047 Gujarati, Oriya, Tamil and Lao sortkeys.
1049 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1051 * create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
1054 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1056 * create-mscompat-collation-table.cs : Fixed Thai character primary
1057 and secondary values. Fixed Thaana letters. Added more LAMESPEC
1058 CJK compat. Fixed some circled CJK secondary weight.
1059 Hacked some nonspacing mark sortkey value adjustment.
1061 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1063 * create-mscompat-collation-table.cs : CP932.TXT was not parsed as
1064 expected. JIS ordering was incorrect. The offset was incorrectly
1065 computed for OtherNumbers that represent values of 10 or more. Some
1066 Hangul compat characters have different offsets.
1068 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1070 * create-mscompat-collation-table.cs : Fixed 0x8 category characters.
1071 Added hack for need-to-be-fixed characters to fall into 0xA category.
1072 * create-collation-element-table.cs : previous checkin seems to have failed :(
1073 * README: updated a bit.
1075 2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
1077 * CodePointIndexer.cs :
1078 removed extraneous switch (I could use empty array for that need).
1079 * CollationElementTableUtil.cs : primary weight type became ushort.
1080 * create-collation-element-table.cs : several bugfixes.
1081 collElem should be int. It was skipping most of entries because of
1082 incorrect string tokenization.
1084 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1086 * create-mscompat-collation-table.cs : handle some Jamo NFKD.
1088 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1090 * SimpleCollator.cs : forgot to commit in the last checkin.
1091 * create-mscompat-collation-table.cs : fixed arabic shift weight chars.
1092 * TestDriver.cs : switch table dumper and collator testing.
1093 * SortKey.cs : for now comment out internal indexes (not in use).
1095 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1097 * MSCompatUnicodeTable.template,
1098 SimpleCollator.cs : support for culture dependent CJK table.
1100 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1102 * create-mscompat-collation-table.cs,
1103 MSCompatUnicodeTableUtil.cs : make CJK table more compact.
1105 2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
1107 * SimpleCollator.cs : Fixed stupid index search when start != 0.
1109 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1111 * SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
1112 now starts from "start" and proceeds backward by "length".
1113 * TestDriver.cs : fix warning.
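The "starts from start and proceeds backward by length" semantics can be illustrated with a plain ordinal search. This is a hedged Python sketch, not the collator's culture-sensitive code: the name `last_index_of` is hypothetical, a non-empty target is assumed, and the sketch assumes a match must lie entirely inside the window `s[start - length + 1 .. start]`.

```python
def last_index_of(s, target, start, length):
    """Ordinal backward search (illustrative only).

    The window covers `length` characters ending at index `start`,
    i.e. indexes start - length + 1 through start inclusive.
    Assumes `target` is non-empty.
    """
    lowest = start - length + 1
    if lowest < 0 or start >= len(s):
        raise ValueError("window falls outside the string")
    # The last possible match ends exactly at `start`.
    i = start - len(target) + 1
    while i >= lowest:
        if s[i:i + len(target)] == target:
            return i
        i -= 1
    return -1
```

For example, `last_index_of("abcabc", "bc", 4, 5)` looks at `"abcab"` and finds the match at index 1, because the later occurrence at index 4 extends past `start`.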
1115 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1117 * TestDriver.cs : more tests.
1118 * SimpleCollator.cs : LastIndexOf() is not setting search length
1119 on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
1121 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1123 * create-normalization-source.cs : output propValue as uint.
1125 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1127 * SortKey.cs : Now it is System.Globalization.SortKey.
1128 To replace existing implementation, it now requires lcid and
1129 CompareOptions. Added required members.
1130 * SortKeyBuffer.cs : thus .ctor() requires LCID.
1131 * SimpleCollator.cs : made required changes above.
1133 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1135 * CodePointIndexer.cs : added CompressArray(). Now it requires two more
1136 parameters for default index and codepoint.
1137 * CollationElementTableUtil.cs,
1138 NormalizationTableUtil.cs : required changes wrt above change.
1139 * MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
1140 * MSCompatUnicodeTable.template : Now it uses codepoint indexer.
1141 * create-mscompat-collation-table.cs : Now it outputs compressed array.
1142 * Makefile : now collation requires MSCompatUnicodeTableUtil.cs
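The idea of compressing a sparse per-codepoint table through a codepoint indexer can be sketched roughly as below. This is a Python illustration under stated assumptions, not the actual CodePointIndexer: the class layout, the method names `to_index` and `compress`, and the half-open, ascending, non-overlapping ranges are all choices made for this example. The default codepoint parameter mentioned above is left out because the entry does not say how it is used.

```python
class CodePointIndexer:
    """Maps codepoints in a few ranges onto one dense index space (sketch)."""

    def __init__(self, starts, ends, default_index):
        # Ranges are half-open [start, end) and assumed sorted ascending.
        self.ranges = []
        offset = 0
        for start, end in zip(starts, ends):
            self.ranges.append((start, end, offset))
            offset += end - start
        self.total = offset
        self.default_index = default_index

    def to_index(self, cp):
        """Return the dense index for cp, or default_index if uncovered."""
        for start, end, offset in self.ranges:
            if cp < start:
                return self.default_index
            if cp < end:
                return offset + cp - start
        return self.default_index

    def compress(self, source):
        """Keep only the entries of `source` that fall inside the ranges."""
        out = []
        for start, end, _ in self.ranges:
            out.extend(source[start:end])
        return out
```

With two ranges such as `[0x41, 0x44)` and `[0x100, 0x102)`, a table indexed by codepoint shrinks to five entries, and lookups outside the ranges fall back to the default index.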
1144 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1146 * SimpleCollator.cs :
1147 Implemented IsSuffix() and LastIndexOf().
1148 Several fixes on index > 0 cases.
1149 * TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
1151 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1153 * Collation-notes.txt : updated (status, impl. classes).
1154 * MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
1156 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1158 * SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
1159 and IsPrefix(). Tiny code refactoring.
1160 * TestDriver.cs : sample IsPrefix() and IndexOf() usage.
1161 * MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
1163 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1165 * SimpleCollator.cs :
1166 IndexOf(string, char, CompareOptions) implementation.
1167 * TestDriver.cs : sample IndexOf() usage.
1169 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1171 * create-mscompat-collation-table.cs : was missing most important
1172 kind of blocks - equivalent expansions (e.g. invariant mappings).
1173 More readable mappings.
1175 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1177 * mono-tailoring-source.txt : new file. It describes tailoring
1178 information. Basically examined under .NET 1.x.
1179 * create-mscompat-collation-table.cs : consume the file above.
1180 * MSCompatUnicodeTable.template : now tailorings is not a stub.
1181 * CollationDataStructures.txt : minor fixes.
1183 SimpleCollator.cs : added FrenchSort support.
1184 * Collation-notes.txt : added description on Latin primary weights.
1185 * ldml-limited.rng : added note.
1186 * create-tailorings.cs : added note. more serialization (but won't be
1189 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1191 * SortKeyBuffer.cs : non-primary character is added to previous
1193 * TestDriver.cs : added example case of above.
1195 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1197 * SimpleCollator.cs : IgnoreSymbols support.
1198 * TestDriver.cs : compilation fix. IgnoreSymbols example.
1199 * create-mscompat-collation-table.cs : more Hangul fixes.
1201 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1203 * create-mscompat-collation-table.cs : more Hangul fixes.
1204 * SortKey.cs : it will replace sys.globalization.SortKey. It has
1205 some internal members.
1206 * SortKeyBuffer.cs : now it uses SortKey instead of byte[].
1207 * SimpleCollator.cs : CompareOptions support. However I don't think
1208 it will be developed anymore since SortKey never enables IndexOf().
1209 * TestDriver.cs : a few CompareOptions cases.
1211 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1213 * SimpleCollator.cs : simple collator implementation that just will
1214 use GetSortKey() for all its basis.
1215 * TestDriver.cs : sample code that uses this collator set.
1216 * MSCompatUnicodeTable.template : removed test driver from here.
1218 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1220 * create-mscompat-collation-table.cs : Hangul fixes.
1221 Now fewer than 300 characters lack sortkey weights.
1222 * MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
1224 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1226 * create-mscompat-collation-table.cs : Added control picture mappings.
1227 Minor primary weight fixes.
1229 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1231 * create-mscompat-collation-table.cs : Added mappings for box
1232 drawings and blocks.
1234 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1236 * create-mscompat-collation-table.cs : Added mappings for arrows.
1238 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1240 * create-mscompat-collation-table.cs : added support for letterlike
1241 characters and squared CJK compatibility characters, ordered by
1242 character names (0x0E category).
1243 * Collation-notes.txt : added description on that.
1245 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1247 * MSCompatUnicodeTable.template : Now expansions are simulated.
1248 * create-mscompat-collation-table.cs : filled Korean number level2.
1249 Reordered some code blocks to fill correct diacritical differences.
1250 * Collation-notes.txt : some corrections and minor additions.
1252 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1254 * MSCompatUnicodeTable.template :
1255 Now dumper test driver uses SortKeyBuffer for dogfooding.
1256 * create-mscompat-collation-table.cs : some diacritical level fixes
1257 (with non-working extra latin check).
1258 * SortKeyBuffer.cs : several fixes to get working as a practical code.
1259 * Collator.cs : make it compilable, leaving things as NotImplemented.
1261 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1263 * create-mscompat-collation-table.cs : some fixes on primary category
1264 07 (miscellaneous symbols and punctuations).
1266 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1268 * create-mscompat-collation-table.cs : more mapping fix on numbers,
1269 letters, variable weight characters, circled Japanese and CJK.
1270 * MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
1271 inclusive. Simplified dumper code.
1273 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1275 * create-mscompat-collation-table.cs : finished Hangul (both Jamo
1276 and Syllables). The sortkey dumper diff shrank from 30000 lines to 8000.
1278 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1280 * create-mscompat-collation-table.cs : added some nonspacing marks in
1281 either correct or hacky way.
1283 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1285 * create-mscompat-collation-table.cs : several improvements. Japanese
1286 Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
1287 numeric characters, diacritically decorated latin alphabets. Fixed
1288 some diacritical weights detection.
1289 * MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
1290 marks' primary weight as empty.
1291 * Collation-notes.txt : some updates.
1293 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1295 * create-mscompat-collation-table.cs : don't process nonexact NFKD
1296 mapping as equivalent, however store CJK extensions into NFKD map
1297 even if one does not strictly match.
1298 Now am going to fill Hangul into tables (unlike UCA it does not look
1299 possible to calculate sortkey value).
1300 Fixed Cyrillic and Georgian UCA based orderings.
1301 * MSCompatUnicodeTable.template : added CJK extension sortkey
1304 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1306 * create-mscompat-collation-table.cs : Fixed latin alphabet support.
1307 Added latin with diacritical and CJK extension.
1308 * MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
1310 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1312 * create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
1313 now not used though). Filled CJK ideograph, still not perfect.
1314 Fixed number primary keys. NFKD numbers and CJK ideographs are now
1315 considered, including brackets elimination.
1316 * Makefile : now it downloads DerivedAge.txt.
1317 * MSCompatUnicodeTable.template : added dummy code dumper. It computes
1318 PrivateUse, Surrogate and Hangul Syllables.
1319 * Collation-notes.txt : Noted that Hangul Syllables need more love.
1321 2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
1323 * create-tailorings.cs : added configuration support. sort them.
1324 I wonder if it is really usable. Having own format might be better.
1325 * create-mscompat-collation-table.cs : fixing some sortkey numbers,
1326 making closer to windows. Now it handles NFKD in some places.
1327 * MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
1328 * CollationDataStructures.txt : added description on tailoring
1329 fields, though they are subject to change.
1331 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1333 * create-tailorings.cs, ldml-limited.rng : new file.
1334 * LdmlReader.cs : removed old file.
1336 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1338 * SortKeyBuffer.cs : split from Collator.cs. Now it considers
1339 practical use, reflecting updated sortkey constant design.
1340 Especially level 4 weight is split to 4 arrays that are merged in
1341 the last stage of GetSortKey().
1342 * Collator.cs : thus SortKeyBuffer is removed from here.
1343 Additionally, removed some extraneous bits in other classes.
1344 * Collation-notes.txt : Some editorial fixes. Added information on
1345 Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
1346 be stored in simple byte arrays).
1347 * CodePointIndexer.cs,
1348 create-collation-element-table.cs,
1349 CollationElementTable.template,
1350 NormalizationTableUtil.cs : short CodePointIndexer method names.
1351 * create-mscompat-collation-table.cs : Additional info on why some
1352 meaningful characters are ignored in Windows (Unicode version
1353 difference). Removed U+070F from special check (was extraneous).
1355 2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
1357 * MSCompatUnicodeTable.template:
1358 Moved body implementation to table creator and put those bool
1359 results into an array.
1360 * create-mscompat-collation-table.cs :
1361 So imported those methods. Modified array output to emit "0x"
1362 only for more than 9.
1363 * create-normalization-source.cs : ditto on "0x" output matter.
1364 * CollationDataStructures.txt : so now it holds ignorableFlags.
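The "0x" output rule in this entry amounts to printing small values as plain decimal digits, since they read the same in both bases. A minimal Python sketch follows: the function names are hypothetical, and the uppercase hex digits are an assumption (the entry does not say which case the generators emit).

```python
def format_element(value):
    """Emit values up to 9 as bare digits, larger ones as 0x-prefixed hex."""
    return str(value) if value <= 9 else "0x{:X}".format(value)


def format_array(values):
    """Join formatted elements the way a generated array initializer might."""
    return ", ".join(format_element(v) for v in values)
```

For instance, `format_array([0, 9, 10, 255])` gives `0, 9, 0xA, 0xFF`, which keeps the generated source shorter when most entries are small.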
1366 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1368 * Collation-notes.txt, CollationDataStructures.txt :
1369 separate document for data structure design.
1371 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1373 * create-mscompat-collation-table.cs : added culture-dependent CJK
1374 table creation. It uses CLDR as its basis. (Culture independent CJK
1376 * Makefile : added CLDR archive downloading support.
1377 * MSCompatUnicodeTable.template : tiny renamings.
1378 * Collation-notes.txt : additional CJK info.
1380 2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
1382 * Collation-notes.txt, create-mscompat-collation-table.cs :
1383 added secondary weight support for BlahNumber characters.
1385 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1387 * downloaded : added directory. All downloaded files are stored here.
1388 * Makefile : use "downloaded" directory.
1389 Added more auto-download stuff.
1390 * create-mscompat-collation-table.cs :
1391 Added Japanese square kana support.
1393 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1395 * Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
1396 * create-mscompat-collation-table.cs : added support for Arabic abjad,
1397 Estrangela and Thaana.
1398 * MSCompatUnicodeTable.template : removed BOM.
1400 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1402 * Collation-notes.txt : wrong comment cleanup and spelling fixes.
1403 * create-mscompat-collation-table.cs : added diacritic support for
1404 Latin letters (as long as covered in primary weight).
1406 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1408 * Makefile : minor fixes. Added warning lines to generated sources.
1410 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1412 * create-char-mapping-source.cs :
1413 Removed ToWidthInsensitive() generation.
1415 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1417 * create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
1418 ToWidthInsensitive() is implemented here, using an array (which is
1419 to be optimized using CodePointIndexer).
1420 * MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
1421 * MSCompatUnicodeTable.template : now it is used to generate
1422 MSCompatUnicodeTable.cs which got ready to be used.
1423 * Makefile : added MSCompatUnicodeTable.cs build support. Now it
1424 supports "make normalization" and "make collation".
1426 2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
1428 * Collation-notes.txt : Description of ICU was very incorrect. Now it
1429 is more rational and sane.
1430 * create-mscompat-collation-table.cs : fixed some indexes.
1431 * Makefile : added "mstablegen" target.
1432 * MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
1434 2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
1436 * Collation-notes.txt : more analysis on "letters".
1437 * create-mscompat-collation-table.cs : more proof of concepts.
1439 2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
1441 * Collation-notes.txt : more info. Started letter sortkey analysis
1442 (some of the other stuff is really not understandable right now.)
1443 * create-mscompat-collation-table.cs : table generator proof-of-
1444 concept source (not compilable).
1445 * MSCompatUnicodeTable.cs : moved some code to the new source.
1448 2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
1450 * Collation-notes.txt : started level 2 weight analysis.
1452 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1454 * Collation-notes.txt : Additional information on how to create
1456 * MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
1458 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1460 * Collation-notes.txt : More case weight (level 3) analysis. I'm
1461 likely to just write table generator.
1463 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1465 * MSCompatUnicodeTable.cs : part of level 4 weight implementation.
1467 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1469 * Collation-notes.txt :
1471 Revised comparison methods; backward iteration is possible.
1472 More on char-by-char comparison.
1473 Level 4 comparison is actually a bit more complex.
1475 * Collator.cs : some conceptual updates wrt above.
1477 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1479 * Collation-notes.txt : Japanese voice mark is level 2, and Hangul
1480 properties are level 3.
1482 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1484 * Collation-notes.txt : Make it more readable. More analysis on
1485 level 3 and 4 sortkey structures.
1486 * Collator.cs : some compilation fixes (not compilable yet).
1488 2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
1490 * Collation-notes.txt : Analysis on variable-weighting (level 5)
1492 * Collator.cs : updated corresponding part of level 5, and more.
1494 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1496 * Collation-notes.txt : more updates.
1497 * Collator.cs : rewrote from scratch. Some rough sketch for sortkey
1498 buffer, character iterator and collator methods. Not compiling.
1500 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1502 * Collator.cs : Am going to replace it with new one. No need for
1503 CompareOptions-dependent Comparer.
1505 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1507 * Collation-notes.txt : There seems to be a bit more complexity.
1509 2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
1511 * Collation-notes.txt : more updates, being close to write sortkey
1514 2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
1516 * CompareInfoImpl.cs, Collator.cs : conceptual update
1517 * Collation-notes.txt : some corrections and additions.
1518 * Makefile : added LDML input (but it won't be used at all).
1520 2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
1522 * Collation-notes.txt : more updates.
1524 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1526 * Collation-notes.txt : more updates.
1528 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1530 * Collation-notes.txt : some updates.
1531 * create-mapping-char-source.cs : superscripts and subscripts are also
1532 ignored in IgnoreWidth comparison.
1533 * Makefile : tiny touch fix.
1535 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1537 * CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
1539 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1541 * create-char-mapping-source.cs : Now it generates
1542 ToWidthInsensitive() from combining category <wide> and <narrow>.
1543 * MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
1544 ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
1546 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1548 * README, LdmlReader.cs, DataStructures.txt : new files.
1550 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1552 * CodePointIndexer.cs,
1553 Collation-notes.txt,
1554 CollationElementTable.template,
1555 CollationElementTableUtil.cs,
1556 create-char-mapping-source.cs,
1557 create-collation-element-table.cs,
1558 create-combining-class-source.cs,
1559 create-normalization-source.cs,
1561 MSCompatUnicodeTable.cs,
1562 Normalization.template,
1563 NormalizationTableUtil.cs : initial checkin (to private branch).