1 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
3 * MSCompatUnicodeTable.cs,
4 SimpleCollator.cs : Moved tailoring support classes out of
5 SimpleCollator and into MSCompatUnicodeTable.cs.
6 Now that cjk and tailoring support are filled inside
7 MSCompatUnicodeTable, no managed array is exposed.
9 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
11 * create-mscompat-collation-table.cs,
13 MSCompatUnicodeTable.cs : Now it's not exposing collation table
14 internals as managed arrays (to switch to unmanaged pointers).
16 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
18 * create-mscompat-collation-table.cs : tiny nonspacing mark fix.
20 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
22 * create-mscompat-collation-table.cs : Fixed most of Greek mappings.
23 * MSCompatUnicodeTable.cs : don't lock string.
25 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
27 * create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.
29 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
31 * create-mscompat-collation-table.cs : More Latin diacritical fixes.
33 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
35 * create-mscompat-collation-table.cs : There were still missing
36 math symbol mappings. Added several hacky diacritical weight for
39 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
41 * create-mscompat-collation-table.cs : fixed a few diacritical weights
42 on Cyrillic characters. Fixed ParseTailoringSource() to handle
43 non-heading escape sequence (\uXXXX) as expected.
45 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
47 * create-mscompat-collation-table.cs,
48 MSCompatUnicodeTableUtil.cs,
49 MSCompatUnicodeTable.cs : added more aggressive index limits to
50 reduce table data size, at the cost of speed.
52 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
54 * create-mscompat-collation-table.cs : fixed Arabic tertiary weight.
56 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
58 * create-mscompat-collation-table.cs : Mappings for hyphens and
59 punctuation are mostly finished. Rewrote batch mapping method to
60 collect all NFKD. Required modification on mapping is done.
62 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
64 * create-mscompat-collation-table.cs : minor mapping fixes on accent
65 marks and punctuations.
67 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
69 * create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
70 and Box drawing mapping.
72 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
74 * create-mscompat-collation-table.cs : Fixed almost all numbers.
76 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
78 * create-mscompat-collation-table.cs : Symbol mappings are almost done.
79 Removed hack that gave dummy mappings to blank symbols.
81 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
83 * create-mscompat-collation-table.cs : more fixes on arrows. Fixed box
84 drawings. Some code refactoring to eliminate hack.
86 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
88 * create-mscompat-collation-table.cs : Fixed some secondary weight
89 in Devanagari and arrows.
91 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
93 * create-mscompat-collation-table.cs : a set of tiny mapping fixes.
95 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
97 * create-mscompat-collation-table.cs : some diacritical fixes for
98 Latin. Added batch mapping method that considers computed
99 diacritical weight (for numbers).
101 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
103 * managed-collation.patch : forgot to add System.String patch.
105 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
107 * MSCompatUnicodeTable.cs : added resource existence check (required
108 for mscorlib transient time from the one without resources to the
111 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
113 * create-mscompat-collation-table.cs : fixed punctuations and hyphen
114 (shift) primary weight.
116 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
118 * create-mscompat-collation-table.cs : more nonspacing mark fixes.
119 Some non-basic Cyrillic diacritical weight fixes.
121 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
123 * create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
124 and level 3. Tiny Hangul weight fixes.
125 * MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
127 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
129 * create-mscompat-collation-table.cs : some normal characters that have
130 "narrow" NFKD mapping were regarded as "wide" and thus level 3 weight
131 values were different. Handle U+30FB as category A.
132 * MSCompatUnicodeTable.cs : U+30FB does not have special weight.
134 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
136 * create-mscompat-collation-table.cs : more diacritical weight fixes.
137 Removed some unused code.
139 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
141 * create-mscompat-collation-table.cs : Fixed some Thai and Arabic
144 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
146 * create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
148 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
150 * create-mscompat-collation-table.cs : Fixed nonspacing marks in
151 Malayalam, Thai and Lao. Removed extraneous hack.
153 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
155 * SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
156 Some refactoring on IndexOf() code. Removed unused Matches().
157 * Collation-notes.txt : some methods needed to be reimplemented, so
158 rewrote the description.
160 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
162 * SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
163 Thus supported extenders in IsSuffix().
165 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
167 * SimpleCollator.cs : more IsSuffix() simplification, but it will be
168 stopped here since it cannot handle extenders (implementing new
171 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
173 * SimpleCollator.cs : simplified IsSuffix() code.
175 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
177 * SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
178 entire replacement string if char target was an expansion.
179 IsSuffix() was using a method for IsPrefix() which was incorrect.
180 Removed old IsPrefix() code.
182 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
184 * SimpleCollator.cs : IndexOf() was incorrectly sharing the same
185 byte[] field in different areas of code. Now extenders in both
186 source and target really work in IndexOf().
188 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
190 * create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
191 * SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
193 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
195 * SimpleCollator.cs : Now FilterExtender() handles all extender
196 support. IndexOf() and LastIndexOf() now support extenders.
197 IndexOf() and LastIndexOf() did not advance by the contraction source
198 length as expected. Tiny refactoring on private IsPrefix() to take
201 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
203 * SimpleCollator.cs : when restoring from expansion, go back to the
204 top of the loop (to avoid index out of range).
205 Now IsPrefix() is implemented to reuse Compare() and thus it now
206 supports extender as well.
207 * Collation-notes.txt : status update. Deleted optimization part in
208 status section (it is duplicate).
210 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
212 * SimpleCollator.cs : some code reordering.
213 * create-mscompat-collation-table.cs : it was still missing U+3094.
215 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
217 * SimpleCollator.cs : Compare() now supports extender (e.g. U+30FC).
219 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
221 * SimpleCollator.cs : In GetSortKey(), don't update previousChar when
222 it is not primary (e.g. don't "extend" diacritical mark).
224 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
226 * managed-collation.patch : CompareInfo.Compare() should consider
227 the possibilities that non-empty string might be actually empty
228 in culture-sensitive context.
230 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
232 * SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
233 target is "empty" (in culture-sensitive context).
235 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
237 * SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
238 characters in target string.
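The two IndexOf() entries above (ignorable characters in the target, and a
target that is "empty" in culture-sensitive context) can be sketched as
follows. This is only an illustration, not the SimpleCollator code: the
IGNORABLE set and the index_of() helper are hypothetical, and the sketch
filters the target only, leaving the source string untouched.

```python
# Hypothetical set of characters treated as ignorable for this sketch.
# The real set comes from the collation table and is not reproduced here.
IGNORABLE = {"\u00ad", "\u200c", "\u200d"}


def index_of(source, target, start=0):
    """Ordinal search that first drops ignorable characters from target."""
    effective = "".join(ch for ch in target if ch not in IGNORABLE)
    if not effective:
        # The target is "empty" once ignorables are removed: report start.
        return start
    return source.find(effective, start)
```

With these assumptions, a target made up solely of ignorable characters
matches at the start index, and ignorables embedded in a target do not
prevent a match.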
240 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
242 * SimpleCollator.cs : When IgnoreWidth is specified, all Kana
243 characters are regarded as half-width.
244 Even though IgnoreWidth is specified, it should not ignore case.
245 For special weight comparison, the default values (E4) are bigger
246 than non-default values.
247 * SortKeyBuffer.cs : It should save LCID and original string.
248 * create-mscompat-collation-table.cs : For Japanese half-width kana,
249 it should not be counted in widthCompat map since IgnoreWidth does
250 not really ignore those differences.
252 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
254 * create-mscompat-collation-table.cs : Fixed missing Japanese bits.
256 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
258 * create-mscompat-collation-table.cs :
259 tiny diacritical weight fix for U+20D0-U+20E1.
261 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
263 * create-mscompat-collation-table.cs : ja CJK ideograph got completed.
265 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
267 * create-mscompat-collation-table.cs : Fixed CJK custom Japanese
268 mapping. It (maybe as well as other CJK tables) mixes NFKD. For
269 Japanese, modified NFKD table (because of Windows lame design).
271 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
273 * Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
274 * MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
275 invoked at any time it is required.
276 * SimpleCollator.cs : call FillCJK() above in .ctor().
277 * MSCompatUnicodeTableUtil.cs : CJK range was wider.
278 * create-mscompat-collation-table.cs : CJK binary was missing the
279 length. CJK remapping is being moved to ModifyUnidata().
280 For cjk-ja mapping, we have to consider compat characters to be
281 added to the map, besides the raw UCA table.
283 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
285 * SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
287 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
289 * SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
290 contraction as expected. Fixed Compare() to save s2's contraction
292 * TestDriver.cs : added LastIndexOf() tester w/ indexes.
294 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
296 * managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
297 incorrectly use Compare().
298 * TestDriver.cs : more moved to nunit tests.
300 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
302 * SimpleCollator.cs : several fixes on Compare().
303 - Ignorable characters are skipped at the top of the loop.
304 - IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
305 - When the s1 index was increased while an s2 contraction was being
306 replaced, s1 advanced inconsistently (bug).
307 - IsIgnorable() now also checks IgnoreNonSpace.
308 - Fixed FilterOptions() that does not work for IgnoreWidth at all.
309 * TestDriver.cs : now some are moved to nunit tests.
310 * Collation-notes.txt : minor todo update.
312 2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
314 * SimpleCollator.cs : Compare() was ignoring such case that both
315 entire strings have '-' to be compared.
316 * Collation-notes.txt : more status updates.
317 * TestDriver.cs : added '-' use cases.
319 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
321 * SimpleCollator.cs : to be same as other buggy part, it now handles
322 U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
324 Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
325 * create-mscompat-collation-table.cs : dummy values for extenders.
327 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
329 * SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
330 should be computed from ExtenderType, and voice mark weight should
332 * MSCompatUnicodeTable.cs : added tiny comment.
334 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
336 * SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
337 * SimpleCollator.cs : support for extender (U+309D etc.).
339 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
341 * create-mscompat-collation-table.cs : some punct/symbols fix.
342 * managed-collation.patch : new (and temporary) file to support
343 managed collation in mscorlib.
344 * README : described how to use managed collation.
346 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
348 * create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
349 U+482-4C8 (though needs diacritical fixes).
350 * MSCompatUnicodeTable.cs : tiny comment for alternative impl.
352 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
354 * create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
355 computation code, since it appears to work the same way as for Latin
356 letters. Thus removed all other approaches (UCA, by letter name).
358 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
360 * create-mscompat-collation-table.cs : diacritical fix for "double-
361 struck". Syriac nonspacing fixes.
363 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
365 * create-mscompat-collation-table.cs : more math symbol weight fixes.
367 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
369 * create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
371 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
373 * create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
374 implemented (no stub). Some other fixes on category 8-A.
376 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
378 * create-mscompat-collation-table.cs : some minor fixes on Arabic,
379 Korean and Japanese sortkey weights.
381 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
383 * create-mscompat-collation-table.cs : More diacritical fixes.
384 Georgian characters do not have level 2 weights but level 3.
386 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
388 * create-mscompat-collation-table.cs : Roman numeral characters
389 have diacritical weight. quick hack for control signs (U+2400..)
392 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
394 * create-mscompat-collation-table.cs : improving Latin mappings.
395 Setting non-ASCII Latin characters' primary weight between those
396 ASCII characters, and setting diacritical weight (hacky).
397 * MSCompatUnicodeTable.cs :
398 Kanatype check: fixed (voice marks) and improved (comparison order).
400 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
402 * create-mscompat-collation-table.cs : more diacritical fixes.
403 primary weight fixes on punctuations in category 07.
405 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
407 * create-mscompat-collation-table.cs : several diacritical fixes.
408 * TestDriver.cs : sortkey dumper should use StringSort.
410 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
412 * SimpleCollator.cs : fixed incorrect indexer setup. Optimized
413 GetContraction() call a bit.
415 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
417 * create-mscompat-collation-table.cs : fixed incorrect level 2
419 * MSCompatUnicodeTable.cs : remove debug line.
421 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
423 * MSCompatUnicodeTableUtil.cs,
424 MSCompatUnicodeTable.cs,
426 create-mscompat-collation-table.cs : made some members internal and
427 accessible from other classes. Many indexes could be 0 by default.
428 * SimpleCollator.cs : optimizations. avoid method call.
430 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
432 * Collation-notes.txt : more updates.
433 * SimpleCollator.cs : Added quick check for Ordinal comparison.
434 Fixed special weight comparison. It cannot be customizable in the
435 implementation (and it won't be harmful).
436 * mono-tailoring-source.txt : thus updated comment.
438 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
440 * SimpleCollator.cs : Compare() was missing French sort support.
441 * TestDriver.cs : added example case.
443 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
445 * Collation-notes.txt : updated status. Eliminated descriptions on
446 "iterator" (I avoided it for performance concern). Fixed misc.
447 incorrect descriptions.
449 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
451 * Collator.cs : Now that SimpleCollator became feature complete, it is
454 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
456 * SimpleCollator.cs : implemented decent Compare() that immediately
457 stops at first primary difference.
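A minimal sketch of the idea in the entry above: a primary difference
decides the result immediately, while lower-level differences are only
remembered. The WEIGHTS table and compare() below are hypothetical and far
simpler than the real collation tables (no contractions, expansions or
ignorable characters).

```python
# Hypothetical (primary, secondary, tertiary) weights, for illustration only.
WEIGHTS = {
    "a": (1, 0, 0),
    "A": (1, 0, 1),       # differs from "a" at the tertiary (case) level
    "\u00e1": (1, 1, 0),  # a with acute: differs at the secondary level
    "b": (2, 0, 0),
}


def sign(x, y):
    return -1 if x < y else (1 if x > y else 0)


def compare(s1, s2):
    secondary = 0
    tertiary = 0
    for c1, c2 in zip(s1, s2):
        p1, d1, t1 = WEIGHTS[c1]
        p2, d2, t2 = WEIGHTS[c2]
        if p1 != p2:
            # First primary difference: stop immediately.
            return sign(p1, p2)
        # Remember only the first difference found at each lower level.
        if secondary == 0:
            secondary = sign(d1, d2)
        if tertiary == 0:
            tertiary = sign(t1, t2)
    if len(s1) != len(s2):
        # A proper prefix sorts first in this simplified model.
        return sign(len(s1), len(s2))
    return secondary or tertiary
```

In this model a later primary difference outranks an earlier diacritical
one, and a diacritical difference outranks a case difference.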
459 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
461 * SimpleCollator.cs : indexers might return -1.
463 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
465 * SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
466 buggy (length check for source was missing).
468 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
470 * create-mscompat-collation-table.cs : Fixed tailoring table output
471 to be in correct and countable order. Now if tailoring alias was not
472 found, just stop the build.
473 * MSCompatUnicodeTable.cs : several build fixes. Now it works to read
475 * mono-tailoring-source.txt : commented out CJK aliases that miss
477 * Makefile : needed further filename fixes.
479 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
481 * MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
482 (now it is working as a standalone file).
483 * Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
484 (the generator now creates both binary resources and C# source).
486 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
488 * create-mscompat-collation-table.cs : Now it generates binary
489 resources (to parent directory).
490 * MSCompatUnicodeTable.template : added conditional code that fills
491 collation tables from manifest resources.
492 * Makefile : remove collation table binaries as well on "make clean".
493 Removed extraneous dependency.
495 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
497 * MSCompatUnicodeTable.template,
498 SimpleCollator.cs : removed extraneous GetExpansion().
500 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
502 * SimpleCollator.cs : IsSuffix() also supports contractions.
503 * TestDriver.cs : IsSuffix() example contraction cases.
505 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
507 * SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
508 what current IsPrefix() does). For expansion of target, IsPrefix()
509 should check the no-match case that expansion is longer than input.
510 Some refactoring on IsPrefix().
511 Added GetContractionTal() for IsSuffix() (not used yet).
513 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
515 * TestDriver.cs : added IsPrefix() expansion cases.
516 * SimpleCollator.cs : IsPrefix() now supports contractions (with much
517 of complexity), and it now returns bool again.
518 IndexOf() for replacement should make use of IndexOfPrimitiveChar()
519 since expansions won't be expanded recursively.
521 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
523 * SimpleCollator.cs : commonized character comparison in IsPrefix()
524 and IsSuffix(). csc compile fix.
525 * CompareInfoImpl.cs : deleted.
527 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
529 * TestDriver.cs : added SimpleCollator.ctor() sanity check.
530 Added replacement contraction example.
531 * SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
532 contraction in source string. Extracted matching code to Matches().
533 Replacement contraction was including extraneous '\x0'.
535 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
537 * Collation-notes.txt : updated status.
538 * CollationDataStructures.txt : tiny fixes.
539 * SimpleCollator.cs :
540 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
541 namespace Util and csc borked).
542 GetContraction was incorrectly returning first item.
543 Private IsPrefix() now returns int (but it might not be in real use).
544 Extracted simple char comparison to CompareCharSimple().
545 IndexOf() and LastIndexOf() now fully handle contractions (both
546 binary key and string replacement) in "target" (for "s" not yet).
547 * TestDriver.cs : be more verbose.
548 * mono-tailoring-source.txt : added comment.
549 * MSCompatUnicodeTable.template :
550 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
552 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
554 * create-mscompat-collation-table.cs : compute COMBINING blah marks as
555 well as those characters WITH blah.
556 * TestDriver.cs : added combining sortkey cases.
558 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
560 * mono-tailoring-source.txt : fixed description on '*' in sortkeys.
561 * SimpleCollator.cs : Now it fully uses tailoring info. Fixed
562 contraction search that worked only when string is contraction.
563 Removed commented code. Minor refactoring.
564 * TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
566 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
568 * create-mscompat-collation-table.cs,
569 * mono-tailoring-source.txt : removed extraneous level 4 sortkey
570 which cannot be supported.
571 * SimpleCollator.cs : added GetContraction() and used in some places.
572 Now CompareOptions is set only once. Reordered some code (e.g.
573 ignorable check -> get compat char -> compare).
575 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
577 * SimpleCollator.cs : sort tailoring tables before actual usage.
578 Support diacritical remappings (it is customized collation rule
579 which does not exist in UCA).
581 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
583 * SimpleCollator.cs : build culture specific tailoring table from
584 TailoringInfo and unified data array.
585 * create-mscompat-collation-table.cs : Added null termination to
586 sortkey map tailorings (mostly to save my eyes).
587 * MSCompatUnicodeTable.template : added public TailoringValues.
589 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
591 * SortKeyBuffer.cs : handle special weight (category 06) characters.
592 * Collation-notes.txt : Updated description on special weight (it was
594 * TestDriver.cs : added special weight cases.
596 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
598 * MSCompatUnicodeTable.template : added GetTailoringInfo().
599 * SimpleCollator.cs : Now tailoring information is acquired and used.
600 (FrenchSort is supported but Compare() won't work as expected since
601 the table is still incomplete for those diacritical marks).
602 * SortKeyBuffer.cs : On reversing diacritical weights, it should
603 ignore zeros. Reset() should reset frenchSorted flag.
605 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
607 * create-mscompat-collation-table.cs : Further fixes on Jamo,
608 diacritical weights by character name, and *Numbers primary weights.
610 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
612 * create-mscompat-collation-table.cs : More fixes on Devanagari,
613 Gujarati, Oriya, Tamil and Lao sortkeys.
615 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
617 * create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
620 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
622 * create-mscompat-collation-table.cs : Fixed Thai character primary
623 and secondary values. Fixed Thaana letters. Added more LAMESPEC
624 CJK compat. Fixed some circled CJK secondary weight.
625 Hacked some nonspacing mark sortkey value adjustment.
627 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
629 * create-mscompat-collation-table.cs : CP932.TXT was not parsed as
630 expected. JIS ordering was incorrect. The offset was computed
631 incorrectly for OtherNumbers that represent values of 10 or more. Some
632 Hangul compat characters have different offsets.
634 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
636 * create-mscompat-collation-table.cs : Fixed 0x8 category characters.
637 Added hack for need-to-be-fixed characters to fall into 0xA category.
638 * create-collation-element-table.cs : previous checkin seems to have failed :(
639 * README: updated a bit.
641 2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
643 * CodePointIndexer.cs :
644 removed extraneous switch (I could use empty array for that need).
645 * CollationElementTableUtil.cs : primary weight type became ushort.
646 * create-collation-element-table.cs : several bugfixes.
647 collElem should be int. It was skipping most of entries because of
648 incorrect string tokenization.
650 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
652 * create-mscompat-collation-table.cs : handle some Jamo NFKD.
654 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
656 * SimpleCollator.cs : forgot to commit in the last checkin.
657 * create-mscompat-collation-table.cs : fixed arabic shift weight chars.
658 * TestDriver.cs : switch table dumper and collator testing.
659 * SortKey.cs : for now comment out internal indexes (not in use).
661 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
663 * MSCompatUnicodeTable.template,
664 SimpleCollator.cs : support for culture dependent CJK table.
666 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
668 * create-mscompat-collation-table.cs,
669 MSCompatUnicodeTableUtil.cs : make CJK table more compact.
671 2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
673 * SimpleCollator.cs : Fixed stupid index search when start != 0.
675 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
677 * SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
678 now starts from "start" and proceeds backward by "length".
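The LastIndexOf() window described in the entry above can be sketched as
follows: the search begins at "start" and covers "length" characters going
backward. The last_index_of() helper is hypothetical, uses plain ordinal
matching rather than collation, and assumes a non-empty target.

```python
def last_index_of(source, target, start, length):
    """Search backward in source[start - length + 1 .. start] for target."""
    if not target:
        raise ValueError("this sketch assumes a non-empty target")
    lowest = start - length + 1
    if lowest < 0 or start >= len(source):
        raise ValueError("search window is out of range")
    n = len(target)
    # The first candidate is the one whose match would end exactly at start.
    for i in range(start - n + 1, lowest - 1, -1):
        if source[i:i + n] == target:
            return i
    return -1
```

For example, searching "abcabc" for "bc" with start 5 and length 6 finds the
later occurrence, while start 3 and length 4 restricts the window to "abca"
and finds the earlier one.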
679 * TestDriver.cs : fix warning.
681 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
683 * TestDriver.cs : more tests.
684 * SimpleCollator.cs : LastIndexOf() is not setting search length
685 on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
687 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
689 * create-normalization-source.cs : output propValue as uint.
691 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
693 * SortKey.cs : Now it is System.Globalization.SortKey.
694 To replace existing implementation, it now requires lcid and
695 CompareOptions. Added required members.
696 * SortKeyBuffer.cs : thus .ctor() requires LCID.
697 * SimpleCollator.cs : made required changes above.
699 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
701 * CodePointIndexer.cs : added CompressArray(). Now it requires two more
702 parameters for default index and codepoint.
703 * CollationElementTableUtil.cs,
704 NormalizationTableUtil.cs : required changes wrt above change.
705 * MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
706 * MSCompatUnicodeTable.template : Now it uses codepoint indexer.
707 * create-mscompat-collation-table.cs : Now it outputs compressed array.
708 * Makefile : now collation requires MSCompatUnicodeTableUtil.cs
710 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
712 * SimpleCollator.cs :
713 Implemented IsSuffix() and LastIndexOf().
714 Several fixes on index > 0 cases.
715 * TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
717 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
719 * Collation-notes.txt : updated (status, impl. classes).
720 * MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
722 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
724 * SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
725 and IsPrefix(). Tiny code refactoring.
726 * TestDriver.cs : sample IsPrefix() and IndexOf() usage.
727 * MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
729 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
731 * SimpleCollator.cs :
732 IndexOf(string, char, CompareOptions) implementation.
733 * TestDriver.cs : sample IndexOf() usage.
735 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
737 * create-mscompat-collation-table.cs : was missing most important
738 kind of blocks - equivalent expansions (e.g. invariant mappings).
739 More readable mappings.
741 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
743 * mono-tailoring-source.txt : new file. It describes tailoring
744 information. Basically examined under .NET 1.x.
745 * create-mscompat-collation-table.cs : consume the file above.
746 * MSCompatUnicodeTable.template : now tailorings is not a stub.
747 * CollationDataStructures.txt : minor fixes.
749 SimpleCollator.cs : added FrenchSort support.
750 * Collation-notes.txt : added description on Latin primary weights.
751 * ldml-limited.rng : added note.
752 * create-tailorings.cs : added note. more serialization (but won't be
755 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
757 * SortKeyBuffer.cs : non-primary character is added to previous
759 * TestDriver.cs : added example case of above.
761 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
763 * SimpleCollator.cs : IgnoreSymbols support.
764 * TestDriver.cs : compilation fix. IgnoreSymbols example.
765 * create-mscompat-collation-table.cs : more Hangul fixes.
767 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
769 * create-mscompat-collation-table.cs : more Hangul fixes.
770 * SortKey.cs : it will replace sys.globalization.SortKey. It has
771 some internal members.
772 * SortKeyBuffer.cs : now it uses SortKey instead of byte[].
773 * SimpleCollator.cs : CompareOptions support. However I don't think
774 it will be developed anymore since SortKey never enables IndexOf().
775 * TestDriver.cs : a few CompareOptions cases.
777 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
779 * SimpleCollator.cs : simple collator implementation that just will
780 use GetSortKey() for all its basis.
781 * TestDriver.cs : sample code that uses this collator set.
782 * MSCompatUnicodeTable.template : removed test driver from here.
784 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
786 * create-mscompat-collation-table.cs : Hangul fixes.
787 Now fewer than 300 characters lack sortkey weights.
788 * MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
790 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
792 * create-mscompat-collation-table.cs : Added control picture mappings.
793 Minor primary weight fixes.
795 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
797 * create-mscompat-collation-table.cs : Added mappings for box
800 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
802 * create-mscompat-collation-table.cs : Added mappings for arrows.
804 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
806 * create-mscompat-collation-table.cs : added support for letterlike
807 characters and squared CJK compatibility characters, ordered by
808 character names (0x0E category).
809 * Collation-notes.txt : added description on that.
811 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
813 * MSCompatUnicodeTable.template : Now expansions are simulated.
814 * create-mscompat-collation-table.cs : filled Korean number level2.
815 Reordered some code blocks to fill correct diacritical differences.
816 * Collation-notes.txt : some corrections and minor additions.
818 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
820 * MSCompatUnicodeTable.template :
821 Now dumper test driver uses SortKeyBuffer for dogfooding.
822 * create-mscompat-collation-table.cs : some diacritical level fixes
823 (with non-working extra latin check).
824 * SortKeyBuffer.cs : several fixes to get working as a practical code.
825 * Collator.cs : make it compilable, leaving things as NotImplemented.
827 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
829 * create-mscompat-collation-table.cs : some fixes on primary category
830 07 (miscellaneous symbols and punctuations).
832 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
834 * create-mscompat-collation-table.cs : more mapping fixes on numbers,
835 letters, variable weight characters, circled Japanese and CJK.
836 * MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
837 inclusive. Simplified dumper code.
839 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
841 * create-mscompat-collation-table.cs : finished Hangul (both Jamo
842 and Syllables). The sortkey dumper diff shrank from 30000 lines to 8000.
844 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
846 * create-mscompat-collation-table.cs : added some nonspacing marks in
847 either correct or hacky way.
849 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
851 * create-mscompat-collation-table.cs : several improvements. Japanese
852 Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
853 numeric characters, diacritically decorated latin alphabets. Fixed
854 some diacritical weights detection.
855 * MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
856 marks' primary weight as empty.
857 * Collation-notes.txt : some updates.
859 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
861 * create-mscompat-collation-table.cs : don't process nonexact NFKD
862 mapping as equivalent, however store CJK extensions into NFKD map
863 even if one does not strictly match.
864 Now I am going to fill Hangul into tables (unlike UCA, it does not look
865 possible to calculate sortkey values).
866 Fixed Cyrillic and Georgian UCA based orderings.
867 * MSCompatUnicodeTable.template : added CJK extension sortkey
870 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
872 * create-mscompat-collation-table.cs : Fixed latin alphabet support.
873 Added latin with diacritical and CJK extension.
874 * MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
876 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
878 * create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
879 now not used though). Filled CJK ideograph, still not perfect.
880 Fixed number primary keys. NFKD numbers and CJK ideographs are now
881 considered, including brackets elimination.
882 * Makefile : now it downloads DerivedAge.txt.
883 * MSCompatUnicodeTable.template : added dummy code dumper. It computes
884 PrivateUse, Surrogate and Hangul Syllables.
885 * Collation-notes.txt : Noted that Hangul Syllables need more love.
887 2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
889 * create-tailorings.cs : added configuration support. sort them.
890 I wonder if it is really usable. Having own format might be better.
891 * create-mscompat-collation-table.cs : fixing some sortkey numbers,
892 making them closer to Windows. Now it handles NFKD in some places.
893 * MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
894 * CollationDataStructures.txt : added description on tailoring
895 fields, though they are subject to change.
897 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
899 * create-tailorings.cs, ldml-limited.rng : new file.
900 * LdmlReader.cs : removed old file.
902 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
904 * SortKeyBuffer.cs : split from Collator.cs. Now it considers
905 practical use, reflecting updated sortkey constant design.
906 Especially level 4 weight is split to 4 arrays that are merged in
907 the last stage of GetSortKey().
908 * Collator.cs : thus SortKeyBuffer is removed from here.
909 Additionally, removed some extraneous bits in other classes.
910 * Collation-notes.txt : Some editorial fixes. Added information on
911 Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
912 be stored in simple byte arrays).
913 * CodePointIndexer.cs,
914 create-collation-element-table.cs,
915 CollationElementTable.template,
916 NormalizationTableUtil.cs : short CodePointIndexer method names.
917 * create-mscompat-collation-table.cs : Additional info on why some
918 meaningful characters are ignored in Windows (Unicode version
919 difference). Removed U+070F from special check (was extraneous).
921 2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
923 * MSCompatUnicodeTable.template:
924 Moved body implementation to table creator and put those bool
925 results into an array.
926 * create-mscompat-collation-table.cs :
927 So imported those methods. Modified array output to emit "0x"
928 only for values greater than 9.
929 * create-normalization-source.cs : ditto on "0x" output matter.
930 * CollationDataStructures.txt : so now it holds ignorableFlags.
932 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
934 * Collation-notes.txt, CollationDataStructures.txt :
935 separate document for data structure design.
937 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
939 * create-mscompat-collation-table.cs : added culture-dependent CJK
940 table creation. It uses CLDR as its basis. (Culture independent CJK
942 * Makefile : added CLDR archive downloading support.
943 * MSCompatUnicodeTable.template : tiny renamings.
944 * Collation-notes.txt : additional CJK info.
946 2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
948 * Collation-notes.txt, create-mscompat-collation-table.cs :
949 added secondary weight support for BlahNumber characters.
951 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
953 * downloaded : added directory. All downloaded files are stored here.
954 * Makefile : use "downloaded" directory.
955 Added more auto-download stuff.
956 * create-mscompat-collation-table.cs :
957 Added Japanese square kana support.
959 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
961 * Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
962 * create-mscompat-collation-table.cs : added support for Arabic abjad,
963 Estrangela and Thaana.
964 * MSCompatUnicodeTable.template : removed BOM.
966 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
968 * Collation-notes.txt : wrong comment cleanup and spelling fixes.
969 * create-mscompat-collation-table.cs : added diacritic support for
970 Latin letters (as long as covered in primary weight).
972 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
974 * Makefile : minor fixes. Added warning lines to generated sources.
976 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
978 * create-char-mapping-source.cs :
979 Removed ToWidthInsensitive() generation.
981 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
983 * create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
984 ToWidthInsensitive() is implemented here, using an array (which is
985 to be optimized using CodePointIndexer).
986 * MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
987 * MSCompatUnicodeTable.template : now it is used to generate
988 MSCompatUnicodeTable.cs which got ready to be used.
989 * Makefile : added MSCompatUnicodeTable.cs build support. Now it
990 supports "make normalization" and "make collation".
992 2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
994 * Collation-notes.txt : The description of ICU was very incorrect. Now it
995 has become more rational and sane.
996 * create-mscompat-collation-table.cs : fixed some indexes.
997 * Makefile : added "mstablegen" target.
998 * MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
1000 2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
1002 * Collation-notes.txt : more analysis on "letters".
1003 * create-mscompat-collation-table.cs : more proof of concepts.
1005 2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
1007 * Collation-notes.txt : more info. Started letter sortkey analysis
1008 (some of the other stuff is really not understandable right now).
1009 * create-mscompat-collation-table.cs : table generator proof-of-
1010 concept source (not compilable).
1011 * MSCompatUnicodeTable.cs : moved some code to the new source.
1014 2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
1016 * Collation-notes.txt : started level 2 weight analysis.
1018 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1020 * Collation-notes.txt : Additional information on how to create
1022 * MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
1024 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1026 * Collation-notes.txt : More case weight (level 3) analysis. I'm
1027 likely to just write table generator.
1029 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1031 * MSCompatUnicodeTable.cs : part of level 4 weight implementation.
1033 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1035 * Collation-notes.txt :
1037 Revised comparison methods; backward iteration is possible.
1038 More on char-by-char comparison.
1039 Level 4 comparison is actually a bit more complex.
1041 * Collator.cs : some conceptual updates wrt above.
1043 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1045 * Collation-notes.txt : Japanese voice mark is level 2, and Hangul
1046 properties are level 3.
1048 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1050 * Collation-notes.txt : Make it more readable. More analysis on
1051 level 3 and 4 sortkey structures.
1052 * Collator.cs : some compilation fixes (not compilable yet).
1054 2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
1056 * Collation-notes.txt : Analysis on variable-weighting (level 5)
1058 * Collator.cs : updated corresponding part of level 5, and more.
1060 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1062 * Collation-notes.txt : more updates.
1063 * Collator.cs : rewrote from scratch. Some rough sketch for sortkey
1064 buffer, character iterator and collator methods. Not compiling.
1066 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1068 * Collator.cs : I am going to replace it with a new one. No need for
1069 CompareOptions-dependent Comparer.
1071 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1073 * Collation-notes.txt : There seems to be a bit more complexity.
1075 2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
1077 * Collation-notes.txt : more updates, being close to write sortkey
1080 2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
1082 * CompareInfoImpl.cs, Collator.cs : conceptual update
1083 * Collation-notes.txt : some corrections and additions.
1084 * Makefile : added LDML input (but it won't be used at all).
1086 2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
1088 * Collation-notes.txt : more updates.
1090 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1092 * Collation-notes.txt : more updates.
1094 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1096 * Collation-notes.txt : some updates.
1097 * create-char-mapping-source.cs : superscripts and subscripts are also
1098 ignored in IgnoreWidth comparison.
1099 * Makefile : tiny touch fix.
1101 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1103 * CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
1105 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1107 * create-char-mapping-source.cs : Now it generates
1108 ToWidthInsensitive() from decomposition types <wide> and <narrow>.
1109 * MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
1110 ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
1112 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1114 * README, LdmlReader.cs, DataStructures.txt : new files.
1116 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1118 * CodePointIndexer.cs,
1119 Collation-notes.txt,
1120 CollationElementTable.template,
1121 CollationElementTableUtil.cs,
1122 create-char-mapping-source.cs,
1123 create-collation-element-table.cs,
1124 create-combining-class-source.cs,
1125 create-normalization-source.cs,
1127 MSCompatUnicodeTable.cs,
1128 Normalization.template,
1129 NormalizationTableUtil.cs : initial checkin (to private branch).