1 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
3 * create-mscompat-collation-table.cs : a set of tiny mapping fixes.
5 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
7 * create-mscompat-collation-table.cs : some diacritical fixes for
8 Latin. Added batch mapping method that considers computed
9 diacritical weight (for numbers).
11 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
13 * managed-collation.patch : forgot to add System.String patch.
15 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
17 * MSCompatUnicodeTable.cs : added resource existence check (required
18 for mscorlib transient time from the one without resources to the
21 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
23 * create-mscompat-collation-table.cs : fixed punctuation and hyphen
24 (shift) primary weight.
26 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
28 * create-mscompat-collation-table.cs : more nonspacing mark fixes.
29 Some non-basic Cyrillic diacritical weight fixes.
31 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
33 * create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
34 and level 3. Tiny Hangul weight fixes.
35 * MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
37 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
39 * create-mscompat-collation-table.cs : some normal characters that have
40 a "narrow" NFKD mapping were regarded as "wide", so their level 3 weight
41 values were different. Handle U+30FB as category A.
42 * MSCompatUnicodeTable.cs : U+30FB does not have special weight.
44 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
46 * create-mscompat-collation-table.cs : more diacritical weight fixes.
47 Removed some unused code.
49 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
51 * create-mscompat-collation-table.cs : Fixed some Thai and Arabic
54 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
56 * create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
58 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
60 * create-mscompat-collation-table.cs : Fixed nonspacing marks in
61 Malayalam, Thai and Lao. Removed extraneous hack.
63 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
65 * SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
66 Some refactoring on IndexOf() code. Removed unused Matches().
67 * Collation-notes.txt : some methods needed to be reimplemented, so
68 rewrote the description.
70 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
72 * SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
73 Thus supported extenders in IsSuffix().
75 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
77 * SimpleCollator.cs : more IsSuffix() simplification, but it will be
78 stopped here since it cannot handle extenders (implementing new
81 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
83 * SimpleCollator.cs : simplified IsSuffix() code.
85 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
87 * SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
88 entire replacement string if char target was an expansion.
89 IsSuffix() was using a method for IsPrefix() which was incorrect.
90 Removed old IsPrefix() code.
92 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
94 * SimpleCollator.cs : IndexOf() was incorrectly sharing the same
95 byte[] field in different areas of code. Now extenders in both
96 source and target really work in IndexOf().
98 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
100 * create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
101 * SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
103 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
105 * SimpleCollator.cs : Now FilterExtender() handles all extender
106 support. IndexOf() and LastIndexOf() now support extenders.
107 IndexOf() and LastIndexOf() did not advance by the contraction source
108 length as expected. Tiny refactoring on private IsPrefix() to take
111 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
113 * SimpleCollator.cs : when restoring from expansion, go back to the
114 top of the loop (to avoid index out of range).
115 Now IsPrefix() is implemented to reuse Compare() and thus it now
116 supports extender as well.
117 * Collation-notes.txt : status update. Deleted optimization part in
118 status section (it is duplicate).
120 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
122 * SimpleCollator.cs : some code reordering.
123 * create-mscompat-collation-table.cs : it was still missing U+3094.
125 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
127 * SimpleCollator.cs : Compare() now supports extender (e.g. U+30FC).
129 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
131 * SimpleCollator.cs : In GetSortKey(), don't update previousChar when
132 it is not primary (e.g. don't "extend" diacritical mark).
134 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
136 * managed-collation.patch : CompareInfo.Compare() should consider
137 the possibility that a non-empty string might actually be empty
138 in a culture-sensitive context.
140 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
142 * SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
143 target is "empty" (in culture-sensitive context).
145 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
147 * SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
148 characters in target string.
150 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
152 * SimpleCollator.cs : When IgnoreWidth is specified, all Kana
153 characters are regarded as half-width.
154 Even though IgnoreWidth is specified, it should not ignore case.
155 For special weight comparison, the default values (E4) are bigger
156 than non-default values.
157 * SortKeyBuffer.cs : It should save LCID and original string.
158 * create-mscompat-collation-table.cs : For Japanese half-width kana,
159 it should not be counted in widthCompat map since IgnoreWidth does
160 not really ignore those differences.
162 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
164 * create-mscompat-collation-table.cs : Fixed missing Japanese bits.
166 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
168 * create-mscompat-collation-table.cs :
169 tiny diacritical weight fix for U+20D0-U+20E1.
171 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
173 * create-mscompat-collation-table.cs : ja CJK ideograph got completed.
175 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
177 * create-mscompat-collation-table.cs : Fixed CJK custom Japanese
178 mapping. It (maybe as well as other CJK tables) mixes NFKD. For
179 Japanese, modified NFKD table (because of Windows lame design).
181 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
183 * Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
184 * MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
185 invoked at any time it is required.
186 * SimpleCollator.cs : call FillCJK() above in .ctor().
187 * MSCompatUnicodeTableUtil.cs : CJK range was wider.
188 * create-mscompat-collation-table.cs : CJK binary was missing the
189 length. CJK remapping is being moved to ModifyUnidata().
190 For cjk-ja mapping, we have to consider compat characters to be
191 added to the map, besides the raw UCA table.
193 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
195 * SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
197 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
199 * SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
200 contraction as expected. Fixed Compare() to save s2's contraction
202 * TestDriver.cs : added LastIndexOf() tester w/ indexes.
204 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
206 * managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
207 incorrectly use Compare().
208 * TestDriver.cs : more moved to nunit tests.
210 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
212 * SimpleCollator.cs : several fixes on Compare().
213 - Ignorable characters are skipped at the top of the loop.
214 - IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
215 - When the s1 index was increased while an s2 contraction was being
216 replaced, s1 was advanced inconsistently (bug).
217 - IsIgnorable() now also checks IgnoreNonSpace.
218 - Fixed FilterOptions() that does not work for IgnoreWidth at all.
219 * TestDriver.cs : now some are moved to nunit tests.
220 * Collation-notes.txt : minor todo update.
222 2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
224 * SimpleCollator.cs : Compare() was ignoring the case where both
225 entire strings have '-' to be compared.
226 * Collation-notes.txt : more status updates.
227 * TestDriver.cs : added '-' use cases.
229 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
231 * SimpleCollator.cs : to be same as other buggy part, it now handles
232 U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
234 Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
235 * create-mscompat-collation-table.cs : dummy values for extenders.
237 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
239 * SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
240 should be computed from ExtenderType, and voice mark weight should
242 * MSCompatUnicodeTable.cs : added tiny comment.
244 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
246 * SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
247 * SimpleCollator.cs : support for extender (U+309D etc.).
249 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
251 * create-mscompat-collation-table.cs : some punct/symbols fix.
252 * managed-collation.patch : new (and temporary) file to support
253 managed collation in mscorlib.
254 * README : described how to use managed collation.
256 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
258 * create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
259 U+482-4C8 (though needs diacritical fixes).
260 * MSCompatUnicodeTable.cs : tiny comment for alternative impl.
262 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
264 * create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
265 computation code, since it appears to work the same way as for Latin
266 letters. Thus removed all other approaches (UCA, by letter name).
268 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
270 * create-mscompat-collation-table.cs : diacritical fix for "double-
271 struck". Syriac nonspacing fixes.
273 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
275 * create-mscompat-collation-table.cs : more math symbol weight fixes.
277 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
279 * create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
281 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
283 * create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
284 implemented (no stub). Some other fixes on category 8-A.
286 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
288 * create-mscompat-collation-table.cs : some minor fixes on Arabic,
289 Korean and Japanese sortkey weights.
291 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
293 * create-mscompat-collation-table.cs : More diacritical fixes.
294 Georgian characters do not have level 2 weights but level 3.
296 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
298 * create-mscompat-collation-table.cs : Roman numeral characters
299 have diacritical weight. quick hack for control signs (U+2400..)
302 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
304 * create-mscompat-collation-table.cs : improving Latin mappings.
305 Setting non-ASCII Latin characters' primary weight between those
306 ASCII characters, and setting diacritical weight (hacky).
307 * MSCompatUnicodeTable.cs :
308 Kanatype check: fixed (voice marks) and improved (comparison order).
310 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
312 * create-mscompat-collation-table.cs : more diacritical fixes.
313 primary weight fixes on punctuations in category 07.
315 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
317 * create-mscompat-collation-table.cs : several diacritical fixes.
318 * TestDriver.cs : sortkey dumper should use StringSort.
320 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
322 * SimpleCollator.cs : fixed incorrect indexer setup. Optimized
323 GetContraction() call a bit.
325 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
327 * create-mscompat-collation-table.cs : fixed incorrect level 2
329 * MSCompatUnicodeTable.cs : remove debug line.
331 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
333 * MSCompatUnicodeTableUtil.cs,
334 MSCompatUnicodeTable.cs,
336 create-mscompat-collation-table.cs : made some members internal and
337 accessible from other classes. Many indexes could be 0 by default.
338 * SimpleCollator.cs : optimizations. avoid method call.
340 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
342 * Collation-notes.txt : more updates.
343 * SimpleCollator.cs : Added quick check for Ordinal comparison.
344 Fixed special weight comparison. It cannot be customizable in the
345 implementation (and it won't be harmful).
346 * mono-tailoring-source.txt : thus updated comment.
348 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
350 * SimpleCollator.cs : Compare() was missing French sort support.
351 * TestDriver.cs : added example case.
353 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
355 * Collation-notes.txt : updated status. Eliminated descriptions on
356 "iterator" (I avoided it for performance concern). Fixed misc.
357 incorrect descriptions.
359 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
361 * Collator.cs : Now that SimpleCollator became feature complete, it is
364 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
366 * SimpleCollator.cs : implemented decent Compare() that immediately
367 stops at first primary difference.
369 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
371 * SimpleCollator.cs : indexers might return -1.
373 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
375 * SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
376 buggy (length check for source was missing).
378 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
380 * create-mscompat-collation-table.cs : Fixed tailoring table output
381 to be in correct and countable order. Now if tailoring alias was not
382 found, just stop the build.
383 * MSCompatUnicodeTable.cs : several build fixes. Now it works to read
385 * mono-tailoring-source.txt : commented out CJK aliases that miss
387 * Makefile : needed further filename fixes.
389 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
391 * MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
392 (now it is working as a standalone file).
393 * Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
394 (the generator now creates both binary resources and C# source).
396 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
398 * create-mscompat-collation-table.cs : Now it generates binary
399 resources (to parent directory).
400 * MSCompatUnicodeTable.template : added conditional code that fills
401 collation tables from manifest resources.
402 * Makefile : remove collation table binaries as well on "make clean".
403 Removed extraneous dependency.
405 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
407 * MSCompatUnicodeTable.template,
408 SimpleCollator.cs : removed extraneous GetExpansion().
410 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
412 * SimpleCollator.cs : IsSuffix() also supports contractions.
413 * TestDriver.cs : IsSuffix() example contraction cases.
415 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
417 * SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
418 what current IsPrefix() does). For expansion of target, IsPrefix()
419 should check the no-match case that expansion is longer than input.
420 Some refactoring on IsPrefix().
421 Added GetContractionTal() for IsSuffix() (not used yet).
423 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
425 * TestDriver.cs : added IsPrefix() expansion cases.
426 * SimpleCollator.cs : IsPrefix() now supports contractions (with much
427 of complexity), and it now returns bool again.
428 IndexOf() for replacement should make use of IndexOfPrimitiveChar()
429 since expansions won't be expanded recursively.
431 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
433 * SimpleCollator.cs : commonized character comparison in IsPrefix()
434 and IsSuffix(). csc compile fix.
435 * CompareInfoImpl.cs : deleted.
437 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
439 * TestDriver.cs : added SimpleCollator.ctor() sanity check.
440 Added replacement contraction example.
441 * SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
442 contraction in source string. Extracted matching code to Matches().
443 Replacement contraction was including extraneous '\x0'.
445 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
447 * Collation-notes.txt : updated status.
448 * CollationDataStructures.txt : tiny fixes.
449 * SimpleCollator.cs :
450 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
451 namespace Util and csc borked).
452 GetContraction was incorrectly returning first item.
453 Private IsPrefix() now returns int (but it might not be in real use).
454 Extracted simple char comparison to CompareCharSimple().
455 IndexOf() and LastIndexOf() now fully handle contractions (both
456 binary key and string replacement) in "target" (for "s" not yet).
457 * TestDriver.cs : be more verbose.
458 * mono-tailoring-source.txt : added comment.
459 * MSCompatUnicodeTable.template :
460 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
462 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
464 * create-mscompat-collation-table.cs : compute COMBINING blah marks as
465 well as those characters WITH blah.
466 * TestDriver.cs : added combining sortkey cases.
468 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
470 * mono-tailoring-source.txt : fixed description on '*' in sortkeys.
471 * SimpleCollator.cs : Now it fully uses tailoring info. Fixed
472 contraction search that worked only when string is contraction.
473 Removed commented code. Minor refactoring.
474 * TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
476 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
478 * create-mscompat-collation-table.cs,
479 * mono-tailoring-source.txt : removed extraneous level 4 sortkey
480 which cannot be supported.
481 * SimpleCollator.cs : added GetContraction() and used in some places.
482 Now CompareOptions is set only once. Reordered some code (e.g.
483 ignorable check -> get compat char -> compare).
485 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
487 * SimpleCollator.cs : sort tailoring tables before actual usage.
488 Support diacritical remappings (it is customized collation rule
489 which does not exist in UCA).
491 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
493 * SimpleCollator.cs : build culture specific tailoring table from
494 TailoringInfo and unified data array.
495 * create-mscompat-collation-table.cs : Added null termination to
496 sortkey map tailorings (mostly to save my eyes).
497 * MSCompatUnicodeTable.template : added public TailoringValues.
499 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
501 * SortKeyBuffer.cs : handle special weight (category 06) characters.
502 * Collation-notes.txt : Updated description on special weight (it was
504 * TestDriver.cs : added special weight cases.
506 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
508 * MSCompatUnicodeTable.template : added GetTailoringInfo().
509 * SimpleCollator.cs : Now tailoring information is acquired and used.
510 (FrenchSort is supported but Compare() won't work expectedly since
511 the table is still incomplete for those diacritical marks).
512 * SortKeyBuffer.cs : On reversing diacritical weights, it should
513 ignore zeros. Reset() should reset frenchSorted flag.
515 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
517 * create-mscompat-collation-table.cs : Further fixes on Jamo,
518 diacritical weights by character name, and *Numbers primary weights.
520 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
522 * create-mscompat-collation-table.cs : More fixes on Devanagari,
523 Gujarati, Oriya, Tamil and Lao sortkeys.
525 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
527 * create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
530 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
532 * create-mscompat-collation-table.cs : Fixed Thai character primary
533 and secondary values. Fixed Thaana letters. Added more LAMESPEC
534 CJK compat. Fixed some circled CJK secondary weight.
535 Hacked some nonspacing mark sortkey value adjustment.
537 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
539 * create-mscompat-collation-table.cs : CP932.TXT was not parsed as
540 expected. JIS ordering was incorrect. Offsets for OtherNumbers that
541 represent values of 10 or more were computed incorrectly. Some Hangul
542 compat characters have different offsets.
544 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
546 * create-mscompat-collation-table.cs : Fixed 0x8 category characters.
547 Added hack for need-to-be-fixed characters to fall into 0xA category.
548 * create-collation-element-table.cs : previous checkin seems to have failed :(
549 * README: updated a bit.
551 2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
553 * CodePointIndexer.cs :
554 removed extraneous switch (I could use empty array for that need).
555 * CollationElementTableUtil.cs : primary weight type became ushort.
556 * create-collation-element-table.cs : several bugfixes.
557 collElem should be int. It was skipping most of entries because of
558 incorrect string tokenization.
560 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
562 * create-mscompat-collation-table.cs : handle some Jamo NFKD.
564 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
566 * SimpleCollator.cs : forgot to commit in the last checkin.
567 * create-mscompat-collation-table.cs : fixed arabic shift weight chars.
568 * TestDriver.cs : switch table dumper and collator testing.
569 * SortKey.cs : for now comment out internal indexes (not in use).
571 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
573 * MSCompatUnicodeTable.template,
574 SimpleCollator.cs : support for culture dependent CJK table.
576 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
578 * create-mscompat-collation-table.cs,
579 MSCompatUnicodeTableUtil.cs : make CJK table more compact.
581 2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
583 * SimpleCollator.cs : Fixed stupid index search when start != 0.
585 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
587 * SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
588 now starts from "start" and proceeds backward by "length".
589 * TestDriver.cs : fix warning.
591 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
593 * TestDriver.cs : more tests.
594 * SimpleCollator.cs : LastIndexOf() is not setting search length
595 on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
597 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
599 * create-normalization-source.cs : output propValue as uint.
601 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
603 * SortKey.cs : Now it is System.Globalization.SortKey.
604 To replace existing implementation, it now requires lcid and
605 CompareOptions. Added required members.
606 * SortKeyBuffer.cs : thus .ctor() requires LCID.
607 * SimpleCollator.cs : made required changes above.
609 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
611 * CodePointIndexer.cs : added CompressArray(). Now it requires two more
612 parameters for default index and codepoint.
613 * CollationElementTableUtil.cs,
614 NormalizationTableUtil.cs : required changes wrt above change.
615 * MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
616 * MSCompatUnicodeTable.template : Now it uses codepoint indexer.
617 * create-mscompat-collation-table.cs : Now it outputs compressed array.
618 * Makefile : now collation requires MSCompatUnicodeTableUtil.cs
620 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
622 * SimpleCollator.cs :
623 Implemented IsSuffix() and LastIndexOf().
624 Several fixes on index > 0 cases.
625 * TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
627 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
629 * Collation-notes.txt : updated (status, impl. classes).
630 * MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
632 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
634 * SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
635 and IsPrefix(). Tiny code refactoring.
636 * TestDriver.cs : sample IsPrefix() and IndexOf() usage.
637 * MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
639 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
641 * SimpleCollator.cs :
642 IndexOf(string, char, CompareOptions) implementation.
643 * TestDriver.cs : sample IndexOf() usage.
645 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
647 * create-mscompat-collation-table.cs : was missing most important
648 kind of blocks - equivalent expansions (e.g. invariant mappings).
649 More readable mappings.
651 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
653 * mono-tailoring-source.txt : new file. It describes tailoring
654 information. Basically examined under .NET 1.x.
655 * create-mscompat-collation-table.cs : consume the file above.
656 * MSCompatUnicodeTable.template : now tailorings is not a stub.
657 * CollationDataStructures.txt : minor fixes.
659 SimpleCollator.cs : added FrenchSort support.
660 * Collation-notes.txt : added description on Latin primary weights.
661 * ldml-limited.rng : added note.
662 * create-tailorings.cs : added note. more serialization (but won't be
665 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
667 * SortKeyBuffer.cs : non-primary character is added to previous
669 * TestDriver.cs : added example case of above.
671 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
673 * SimpleCollator.cs : IgnoreSymbols support.
674 * TestDriver.cs : compilation fix. IgnoreSymbols example.
675 * create-mscompat-collation-table.cs : more Hangul fixes.
677 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
679 * create-mscompat-collation-table.cs : more Hangul fixes.
680 * SortKey.cs : it will replace sys.globalization.SortKey. It has
681 some internal members.
682 * SortKeyBuffer.cs : now it uses SortKey instead of byte[].
683 * SimpleCollator.cs : CompareOptions support. However I don't think
684 it will be developed anymore since SortKey never enables IndexOf().
685 * TestDriver.cs : a few CompareOptions cases.
687 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
689 * SimpleCollator.cs : simple collator implementation that will just
690 use GetSortKey() as its basis.
691 * TestDriver.cs : sample code that uses this collator set.
692 * MSCompatUnicodeTable.template : removed test driver from here.
694 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
696 * create-mscompat-collation-table.cs : Hangul fixes.
697 Now fewer than 300 characters lack sortkey weights.
698 * MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
700 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
702 * create-mscompat-collation-table.cs : Added control picture mappings.
703 Minor primary weight fixes.
705 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
707 * create-mscompat-collation-table.cs : Added mappings for box
710 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
712 * create-mscompat-collation-table.cs : Added mappings for arrows.
714 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
716 * create-mscompat-collation-table.cs : added support for letterlike
717 characters and squared CJK compatibility characters, ordered by
718 character names (0x0E category).
719 * Collation-notes.txt : added description on that.
721 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
723 * MSCompatUnicodeTable.template : Now expansions are simulated.
724 * create-mscompat-collation-table.cs : filled Korean number level2.
725 Reordered some code blocks to fill correct diacritical differences.
726 * Collation-notes.txt : some corrections and minor additions.
728 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
730 * MSCompatUnicodeTable.template :
731 Now dumper test driver uses SortKeyBuffer for dogfooding.
732 * create-mscompat-collation-table.cs : some diacritical level fixes
733 (with non-working extra latin check).
734 * SortKeyBuffer.cs : several fixes to get working as a practical code.
735 * Collator.cs : make it compilable, leaving things as NotImplemented.
737 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
739 * create-mscompat-collation-table.cs : some fixes on primary category
740 07 (miscellaneous symbols and punctuations).
742 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
744 * create-mscompat-collation-table.cs : more mapping fix on numbers,
745 letters, variable weight characters, circled Japanese and CJK.
746 * MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
747 inclusive. Simplified dumper code.
749 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
751 * create-mscompat-collation-table.cs : finished Hangul (both Jamo
752 and Syllables). sortkey dumper diff lines dropped from 30000 to 8000.
754 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
756 * create-mscompat-collation-table.cs : added some nonspacing marks in
757 either correct or hacky way.
759 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
761 * create-mscompat-collation-table.cs : several improvements. Japanese
762 Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
763 numeric characters, diacritically decorated latin alphabets. Fixed
764 some diacritical weights detection.
765 * MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
766 marks' primary weight as empty.
767 * Collation-notes.txt : some updates.
769 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
771 * create-mscompat-collation-table.cs : don't process nonexact NFKD
772 mapping as equivalent, however store CJK extensions into NFKD map
773 even if one does not strictly match.
774 Now am going to fill Hangul into tables (unlike UCA it does not look
775 possible to calculate sortkey value).
776 Fixed Cyrillic and Georgian UCA based orderings.
777 * MSCompatUnicodeTable.template : added CJK extension sortkey
780 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
782 * create-mscompat-collation-table.cs : Fixed latin alphabet support.
783 Added latin with diacritical and CJK extension.
784 * MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
786 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
788 * create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
789 now not used though). Filled CJK ideograph, still not perfect.
790 Fixed number primary keys. NFKD numbers and CJK ideographs are now
791 considered, including brackets elimination.
792 * Makefile : now it downloads DerivedAge.txt.
793 * MSCompatUnicodeTable.template : added dummy code dumper. It computes
794 PrivateUse, Surrogate and Hangul Syllables.
795 * Collation-notes.txt : Noted that Hangul Syllables need more love.
797 2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
799 * create-tailorings.cs : added configuration support. sort them.
800 I wonder if it is really usable. Having own format might be better.
801 * create-mscompat-collation-table.cs : fixing some sortkey numbers,
802 making closer to windows. Now it handles NFKD in some places.
803 * MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
804 * CollationDataStructures.txt : added description on tailoring
805 fields, though they are subject to change.
807 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
809 * create-tailorings.cs, ldml-limited.rng : new files.
810 * LdmlReader.cs : removed old file.
812 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
814 * SortKeyBuffer.cs : split from Collator.cs. Now it considers
815 practical use, reflecting updated sortkey constant design.
816 Especially level 4 weight is split to 4 arrays that are merged in
817 the last stage of GetSortKey().
818 * Collator.cs : thus SortKeyBuffer is removed from here.
819 Additionally, removed some extraneous bits in other classes.
820 * Collation-notes.txt : Some editorial fixes. Added information on
821 Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
822 be stored in simple byte arrays).
823 * CodePointIndexer.cs,
824 create-collation-element-table.cs,
825 CollationElementTable.template,
826 NormalizationTableUtil.cs : shortened CodePointIndexer method names.
827 * create-mscompat-collation-table.cs : Additional info on why some
828 meaningful characters are ignored in Windows (Unicode version
829 difference). Removed U+070F from special check (was extraneous).
831 2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
833 * MSCompatUnicodeTable.template:
834 Moved body implementation to table creator and put those bool
835 results into an array.
836 * create-mscompat-collation-table.cs :
837 So imported those methods. Modified array output to emit "0x"
838 only for more than 9.
839 * create-normalization-source.cs : ditto on "0x" output matter.
840 * CollationDataStructures.txt : so now it holds ignorableFlags.
842 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
844 * Collation-notes.txt, CollationDataStructures.txt :
845 separate document for data structure design.
847 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
849 * create-mscompat-collation-table.cs : added culture-dependent CJK
850 table creation. It uses CLDR as its basis. (Culture independent CJK
852 * Makefile : added CLDR archive downloading support.
853 * MSCompatUnicodeTable.template : tiny renamings.
854 * Collation-notes.txt : additional CJK info.
856 2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
858 * Collation-notes.txt, create-mscompat-collation-table.cs :
859 added secondary weight support for BlahNumber characters.
861 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
863 * downloaded : added directory. All downloaded files are stored here.
864 * Makefile : use "downloaded" directory.
865 Added more auto-download stuff.
866 * create-mscompat-collation-table.cs :
867 Added Japanese square kana support.
869 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
871 * Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
872 * create-mscompat-collation-table.cs : added support for Arabic abjad,
873 Estrangela and Thaana.
874 * MSCompatUnicodeTable.template : removed BOM.
876 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
878 * Collation-notes.txt : wrong comment cleanup and spelling fixes.
879 * create-mscompat-collation-table.cs : added diacritic support for
880 Latin letters (as long as covered in primary weight).
882 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
884 * Makefile : minor fixes. Added warning lines to generated sources.
886 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
888 * create-char-mapping-source.cs :
889 Removed ToWidthInsensitive() generation.
891 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
893 * create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
894 ToWidthInsensitive() is implemented here, using an array (which is
895 to be optimized using CodePointIndexer).
896 * MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
897 * MSCompatUnicodeTable.template : now it is used to generate
898 MSCompatUnicodeTable.cs which got ready to be used.
899 * Makefile : added MSCompatUnicodeTable.cs build support. Now it
900 supports "make normalization" and "make collation".
902 2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
904 * Collation-notes.txt : Description of ICU was very incorrect. Now it
905 became more rational and sane.
906 * create-mscompat-collation-table.cs : fixed some indexes.
907 * Makefile : added "mstablegen" target.
908 * MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
910 2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
912 * Collation-notes.txt : more analysis on "letters".
913 * create-mscompat-collation-table.cs : more proof of concepts.
915 2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
917 * Collation-notes.txt : more info. Started letter sortkey analysis
918 (some of the other stuff is really not understandable right now.)
919 * create-mscompat-collation-table.cs : table generator proof-of-
920 concept source (not compilable).
921 * MSCompatUnicodeTable.cs : moved some code to the new source.
924 2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
926 * Collation-notes.txt : started level 2 weight analysis.
928 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
930 * Collation-notes.txt : Additional information on how to create
932 * MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
934 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
936 * Collation-notes.txt : More case weight (level 3) analysis. I'm
937 likely to just write table generator.
939 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
941 * MSCompatUnicodeTable.cs : part of level 4 weight implementation.
943 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
945 * Collation-notes.txt :
947 Revised comparison methods; backward iteration is possible.
948 More on char-by-char comparison.
949 Level 4 comparison is actually a bit more complex.
951 * Collator.cs : some conceptual updates wrt above.
953 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
955 * Collation-notes.txt : Japanese voice mark is level 2, and Hangul
956 properties are level 3.
958 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
960 * Collation-notes.txt : Make it more readable. More analysis on
961 level 3 and 4 sortkey structures.
962 * Collator.cs : some compilation fixes (not compilable yet).
964 2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
966 * Collation-notes.txt : Analysis on variable-weighting (level 5)
968 * Collator.cs : updated corresponding part of level 5, and more.
970 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
972 * Collation-notes.txt : more updates.
973 * Collator.cs : rewrote from scratch. Some rough sketch for sortkey
974 buffer, character iterator and collator methods. Not compiling.
976 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
978 * Collator.cs : Am going to replace it with new one. No need for
979 CompareOptions-dependent Comparer.
981 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
983 * Collation-notes.txt : There seems to be a bit more complexity.
985 2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
987 * Collation-notes.txt : more updates, being close to write sortkey
990 2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
992 * CompareInfoImpl.cs, Collator.cs : conceptual update
993 * Collation-notes.txt : some corrections and additions.
994 * Makefile : added LDML input (but it won't be used at all).
996 2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
998 * Collation-notes.txt : more updates.
1000 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1002 * Collation-notes.txt : more updates.
1004 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1006 * Collation-notes.txt : some updates.
1007 * create-mapping-char-source.cs : superscripts and subscripts are also
1008 ignored in IgnoreWidth comparison.
1009 * Makefile : tiny touch fix.
1011 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1013 * CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
1015 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1017 * create-char-mapping-source.cs : Now it generates
1018 ToWidthInsensitive() from combining category <wide> and <narrow>.
1019 * MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
1020 ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
1022 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1024 * README, LdmlReader.cs, DataStructures.txt : new files.
1026 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1028 * CodePointIndexer.cs,
1029 Collation-notes.txt,
1030 CollationElementTable.template,
1031 CollationElementTableUtil.cs,
1032 create-char-mapping-source.cs,
1033 create-collation-element-table.cs,
1034 create-combining-class-source.cs,
1035 create-normalization-source.cs,
1037 MSCompatUnicodeTable.cs,
1038 Normalization.template,
1039 NormalizationTableUtil.cs : initial checkin (to private branch).