International Components for Unicode

Input Text:

bad Bad Bat bat bäd Bäd bät Bät côté coté côte cote black-bird blackbird black-birds blackbirds

Original

00: bad

01: Bad

02: Bat

03: bat

04: bäd

05: Bäd

06: bät

07: Bät

08: côté

09: coté

10: côte

11: cote

12: black-bird

13: blackbird

14: black-birds

15: blackbirds

Collated

00: bad

01: Bad

04: bäd

05: Bäd

03: bat

02: Bat

06: bät

07: Bät

12: black-bird

14: black-birds

13: blackbird

15: blackbirds

11: cote

09: coté

10: côte

08: côté

Custom Rules (Click 'Fetch rules for locale' above, to edit rules)

Instructions:

Type in the lines of text you want to sort under Input Text.
Select the Options you want, and hit Sort.
The two output columns will show the original order and the sorted order, each numbered according to the original line. Any lines in the same box (with the same color) are sorted identically, according to the options you provide.
If you want to try changing the sorting rules, hit Edit Rules. It inserts the rules for the current locale, which you can then alter and try sorting with. You will need to know the format of the rules: see Collation in the ICU User Guide for more information.
- Note: if you hit Edit Rules again, it will replace whatever you have altered!

Options:

ICU implements the Unicode Collation Algorithm, which is a multi-level sort.
1. If there are any differences in base letters, that determines the result.
2. Otherwise, if there are any differences in accents, that determines the result.
3. Otherwise, if there are any differences in case, that determines the result.
4. Otherwise, if there are any differences in punctuation, that determines the result.
The Level option determines which of the above levels to take into account when sorting.
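
The level scheme above can be illustrated with a deliberately simplified sketch in Python. It is not ICU's implementation and does not use real UCA weights. The name `toy_sort_key` and the weighting (code point order for base letters, combining marks as accents, lowercase before uppercase, punctuation treated like a base letter) are assumptions made for illustration only.

```python
import unicodedata

def toy_sort_key(text, level=3):
    """Build a simplified multi-level sort key (a toy, not the real UCA)."""
    base, accents, case = [], [], []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            # Level 2: attach a combining mark to the preceding base letter.
            if accents:
                accents[-1] += (ord(ch),)
            continue
        base.append(ch.lower())        # Level 1: base letter, case removed
        accents.append(())             # Level 2: no accent seen yet
        case.append(ch != ch.lower())  # Level 3: lowercase sorts before uppercase
    levels = (tuple(base), tuple(accents), tuple(case))
    # Keep only the first `level` levels; later levels are ignored.
    return levels[:level]

WORDS = [
    "bad", "Bad", "Bat", "bat", "bäd", "Bäd", "bät", "Bät",
    "côté", "coté", "côte", "cote",
    "black-bird", "blackbird", "black-birds", "blackbirds",
]

# Tuples compare element by element, so a later level only matters
# when all earlier levels are equal.
collated = sorted(WORDS, key=toy_sort_key)
```

For this particular input the toy key happens to reproduce the Collated column. With `level=1`, "bad", "Bad", "bäd" and "Bäd" all receive the same key, which corresponds to lines that sort identically.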
With Force Case, the normal case order (a < A vs. A < a) can be changed.
If Punctuation = Base, punctuation is treated like base letters. If Punctuation = Shifted, it is ignored except at L4.
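
As a rough illustration of the difference between the two punctuation settings, the following Python sketch builds a two-part key. It is a toy under stated assumptions, not ICU's algorithm: `punctuation_key`, the `HIGH` sentinel, and the use of the Unicode general category to detect punctuation are all hypothetical choices for this example.

```python
import unicodedata

HIGH = 0x110000  # larger than any code point; stands in for "not punctuation"

def punctuation_key(text, shifted=False):
    """Toy key: (primary letters, fourth-level punctuation weights)."""
    primary, fourth = [], []
    for ch in text.lower():
        is_punct = unicodedata.category(ch).startswith("P")
        if shifted and is_punct:
            # Shifted: skipped at the first level, kept for the last one.
            fourth.append(ord(ch))
        else:
            # Base: punctuation competes with letters at the first level.
            primary.append(ch)
            fourth.append(HIGH)
    return tuple(primary), tuple(fourth)

words = ["black-bird", "blackbird", "black-birds", "blackbirds"]

as_base = sorted(words, key=punctuation_key)
as_shifted = sorted(words, key=lambda w: punctuation_key(w, shifted=True))
```

In this sketch, `as_base` groups the hyphenated forms together ahead of the unhyphenated ones, while `as_shifted` places each hyphenated form next to its unhyphenated counterpart, because the hyphen only breaks ties.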
A Case level adds a separate level for case differences, so case can still be taken into account when the strength is L1 or L2.
A Hiragana level adds a special level for JIS compatibility. It is only used if the level is L4 .. L5.
With French accents, accent differences are considered backwards, from the end of the string toward the start.
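
The effect of comparing accents backwards can be sketched in Python using the côté/coté/côte/cote lines from the sample input. This is a simplified model rather than ICU's implementation: `accent_key` and its `french` flag are hypothetical names, and accents are represented by the code points of their combining marks.

```python
import unicodedata

def accent_key(text, french=False):
    """Toy key: (base letters, per-letter accents), accents optionally reversed."""
    base, accents = [], []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if accents:
                accents[-1] += (ord(ch),)  # attach mark to the preceding letter
        else:
            base.append(ch.lower())
            accents.append(())
    if french:
        # Compare accents starting from the end of the string.
        accents.reverse()
    return tuple(base), tuple(accents)

words = ["côté", "coté", "côte", "cote"]

forwards = sorted(words, key=accent_key)
backwards = sorted(words, key=lambda w: accent_key(w, french=True))
```

With forward comparison the first accent difference decides, so "coté" precedes "côte". With backward comparison the last accent difference decides, so "côte" precedes "coté".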
With Full Normalization, all strings are normalized before they are compared, so canonically equivalent strings sort identically.
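
Assuming Full Normalization refers to Unicode normalization of the input before comparison, the following Python sketch shows why it matters. It uses only the standard `unicodedata` module, not ICU, and the helper name `normalized` is hypothetical.

```python
import unicodedata

composed = "b\u00e4d"     # 'ä' as a single precomposed code point
decomposed = "ba\u0308d"  # 'a' followed by U+0308 COMBINING DIAERESIS

def normalized(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)

# The two spellings look the same but differ as raw code point sequences.
raw_equal = composed == decomposed

# After normalization they become the same sequence.
norm_equal = normalized(composed) == normalized(decomposed)
```

Here `raw_equal` is false and `norm_equal` is true, which is the kind of mismatch that normalizing before comparison is meant to remove.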

For more information, see the ICU User Guide.

Powered by ICU 3.9.3
Sunday, May 11, 2008 11:42:34 AM PT