<i18n dev> Possible error in tr18?
Tom Christiansen
tchrist at perl.com
Wed Jan 26 12:10:20 PST 2011
Under the RL2.2 link of tr18, there appears to be an error:
    C2. An implementation claiming conformance to Level 2 of this
        specification shall satisfy C1, and meet the requirements
        described in the following sections:
            RL2.1 Canonical Equivalents
            RL2.2 Extended Grapheme Clusters
            RL2.3 Default Word Boundaries
            RL2.4 Default Loose Matches
            RL2.5 Name Properties
            RL2.6 Wildcards in Property Values
Following the RL2.2 link, you find this:
    2.2 Extended Grapheme Clusters
    One or more Unicode characters may make up what the user thinks of as a
    character. To avoid ambiguity with the computer use of the term character,
    this is called a grapheme cluster. For example, "G" + acute-accent is a
    grapheme cluster: it is thought of as a single character by users, yet is
    actually represented by two Unicode characters. The Unicode Standard
    defines extended grapheme clusters that keep Hangul syllables together and
    do not break between base characters and combining marks. The precise
    definition is in UTR #29: Text Boundaries [UAX29]. These extended grapheme
    clusters are not the same as tailored grapheme clusters, which are covered
    in Level 3, Tailored Grapheme Clusters.
    RL3.12 Extended Grapheme Clusters
        To meet this requirement, an implementation shall provide a mechanism
        for matching against an arbitrary extended grapheme cluster, a literal
        cluster, and matching extended grapheme cluster boundaries.
Do you guys imagine that that should be "RL2.2" there instead of "RL3.12"?
Why should RL2.2 -> 2.2 -> RL3.12? Or is that actually talking about
tailored grapheme clusters? I can't tell.
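For anyone who wants to see the "G" + acute-accent example from the quoted
text concretely, here is a minimal sketch in Python. It is only an
approximation: the helper name simple_clusters is made up for illustration,
and it just glues combining marks onto the preceding base character. It does
not implement the real extended grapheme cluster rules from UAX #29 (no
Hangul syllable handling, for instance), and it says nothing about the
tailored clusters of Level 3.

```python
import unicodedata


def simple_clusters(text):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper, simplified approximation only: this is NOT the
    full UAX #29 extended grapheme cluster algorithm (no Hangul syllable
    handling, no CR LF rule, and so on).
    """
    clusters = []
    for ch in text:
        # General categories Mn, Mc and Me are the combining marks.
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch      # attach the mark to the current cluster
        else:
            clusters.append(ch)     # start a new cluster at a base character
    return clusters


g_acute = "G\u0301"  # "G" followed by U+0301 COMBINING ACUTE ACCENT

print(len(g_acute))                   # code points: 2
print(len(simple_clusters(g_acute)))  # approximate clusters: 1
```

So a regex engine that only matches code point by code point sees two things
there, where the user sees one, which is the gap the Level 2 requirement is
meant to close.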
--tom