!!!NOTE!!! The implementation had been improved without updating this document, so some part of it is obsolete. * CharUnicodeInfo internal design ** Normalization information structures From DerivedNormalizationProps.txt NormalizationCorrections.txt UnicodeData.txt To Normalization.cs 1) byte [] NormalizationPropIndexes (length = 65536) 2) int [] NormalizationPropValues (length = property value count) A value consists of those flags: bit semantics 0 NFD_QC 1 NFKD_QC 2 NFC_QC : NO 3 NFC_QC : Maybe 4 NFKC_QC : NO 5 NFKC_QC : Maybe 6 Expands_On_NFD 7 Expands_On_NFC 8 Expands_On_NKFD 9 Expands_On_NKFC 10 Full_Composition_Exclusion 11-12 index to NormalizationCorrections 16-31 FC_NFKC : short[] 0 : no replacement; plus - index to SingleNormalization minus - 0xF0000000 and index to MultiNormalization 3) char [] SingleNormalization FIXME: it should consider NormalizationCorrections.txt 4) char [] MultiNormalization (index for each 4 bytes, zero padded) 5) int [] canonicalEq 6) int [] canonEquiv : index to canonicalEq ** Default Collation Element Table From allkeys.txt from UCA To CollationElementTable.cs struct SortKeyValue { bool Alt; int Primary; int Secondary; int Thirtiary; int Quartenary; } SortKeyValue [] keyValues; of length count of defined values struct CollationElementEntry { short Index; short Count; } int [] collElem; of length char.MaxValue bits: 0-15 : index of top entry to keyValues 16-31 : count or 0 (count = 1) Indexes are computed for compression (eliminating vast zero-arrays). See CollationElementTableUtil.cs for the implementation. ** Combining class information From DerivedCombiningClass.txt To CombiningClass.cs It generates one simple GetCombiningClass () method. ** Width information From EastAsianWidth.txt 1) WidthType : enum ( : short) values are: N/A/H/W/F/Na (FIXME: descriptive names!) ** Case folding information FromCaseFolding.txt (SpecialCasing.txt is not considered. We could use TextInfo.) Common case folding is handled in Char.ToLower(InvariantCulture). 1) CaseFoldingIndexes : short index to CaseFolding 2) CaseFoldingDetails : array of CaseFolding structure defined below: struct CaseFolding { char Simple; char Full1; char Full2; char Full3; } ** Properties which do not consist of tables - Hex_Digit, ASCII_Hex_Digit : too few to create tables. - Logical_Order_Exception : too few to create table.