<i18n dev> RL1.2 Properties (part 1 of 2)
Tom Christiansen
tchrist at perl.com
Sun Jan 23 00:18:56 PST 2011
Sherman wrote:
> The Unicode/java version of lowercase, uppercase, whitespace
> and letter character classes are provided via \p{javaXYZ},
I'm afraid that is *not* true; please see part 2.
> and \p{Lower/Upper/Alpha/Space} are specified and implemented
> as the POSIX versions, which is clearly documented in the API
> documentation. I would not use "worst" for this. I don't think
> "conformance" requires the implementation to use exactly the
> names specified in the standard.
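
To make the POSIX-versus-java distinction in the quoted text concrete, here is a minimal sketch. It assumes the default Pattern flags; the class name PosixVsJavaClasses and the helper matches() are made up for illustration. It only contrasts ASCII with non-ASCII input and says nothing about whether \p{javaLowerCase} agrees with the Unicode Lowercase property.

```java
import java.util.regex.Pattern;

public class PosixVsJavaClasses {
    // Hypothetical helper: does the whole string match the regex?
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        String ascii = "e";
        String accented = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE

        // With default flags, \p{Lower} is the POSIX class [a-z].
        System.out.println(matches("\\p{Lower}", ascii));            // true
        System.out.println(matches("\\p{Lower}", accented));         // false

        // \p{javaLowerCase} defers to Character.isLowerCase().
        System.out.println(matches("\\p{javaLowerCase}", accented)); // true
    }
}
```
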
> The following classes/properties are actually
> supported/implemented. Only \p{javaLowerCase},
> \p{javaUpperCase}, \p{javaWhitespace} and \p{javaMirrored} are
> explicitly documented in the Pattern API; the rest are covered
> by the note "Categories that behave like the java.lang.Character
> boolean ismethodname methods are available through the same
> \p{prop} syntax..."
> \p{javaLowerCase}
> \p{javaUpperCase}
> \p{javaTitleCase}
> \p{javaDigit}
> \p{javaDefined}
> \p{javaLetter}
> \p{javaLetterOrDigit}
> \p{javaJavaIdentifierStart}
> \p{javaJavaIdentifierPart}
> \p{javaUnicodeIdentifierStart}
> \p{javaUnicodeIdentifierPart}
> \p{javaIdentifierIgnorable}
> \p{javaSpaceChar}
> \p{javaWhitespace}
> \p{javaISOControl}
> \p{javaMirrored}
Last I checked there was also a \p{javaJavaIdentifierPart}, which is
pretty silly, I think.
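
As a rough illustration of how the \p{javaXYZ} names in the list track the java.lang.Character isXYZ methods, the sketch below compares a few of them on sample code points. The class name JavaPropertyDemo and the helper matchesProperty() are hypothetical. U+00A0 NO-BREAK SPACE is a useful probe: it has the Unicode White_Space property, yet Character.isWhitespace() excludes it.

```java
import java.util.regex.Pattern;

public class JavaPropertyDemo {
    // Hypothetical helper: true if the single code point cp matches \p{name}.
    static boolean matchesProperty(String name, int cp) {
        String s = new String(Character.toChars(cp));
        return Pattern.compile("\\p{" + name + "}").matcher(s).matches();
    }

    public static void main(String[] args) {
        int nbsp = 0x00A0;        // NO-BREAK SPACE
        int titleDz = 0x01C5;     // LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
        int arabicZero = 0x0660;  // ARABIC-INDIC DIGIT ZERO

        // Each pair prints the regex result, then the Character method result.
        System.out.println(matchesProperty("javaWhitespace", nbsp)); // false
        System.out.println(Character.isWhitespace(nbsp));            // false

        System.out.println(matchesProperty("javaSpaceChar", nbsp));  // true
        System.out.println(Character.isSpaceChar(nbsp));             // true

        System.out.println(matchesProperty("javaTitleCase", titleDz)); // true
        System.out.println(Character.isTitleCase(titleDz));            // true

        System.out.println(matchesProperty("javaDigit", arabicZero)); // true
        System.out.println(Character.isDigit(arabicZero));            // true
    }
}
```
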
> It appears that "noncharacter_cp" and "default_ignorable_cp"
> are missing from the list. I will take a look later, but I guess
> these two are really not that "significant".
They are two of the eleven properties which must be supported to
meet RL1.2 compliance, and therefore Level 1 compliance.
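
For illustration only, and assuming "noncharacter_cp" means the Unicode Noncharacter_Code_Point property: that set is small and fixed (U+FDD0..U+FDEF plus the last two code points of each of the 17 planes), so it can be tested by hand when the regex engine does not expose it. The class NoncharacterCheck and its methods are hypothetical, and a helper like this is not a substitute for property support in the pattern syntax. Default_Ignorable_Code_Point is a larger derived set and is not sketched here.

```java
public class NoncharacterCheck {
    // Hypothetical helper, not part of java.util.regex or java.lang.Character.
    static boolean isNoncharacter(int cp) {
        if (cp < 0 || cp > Character.MAX_CODE_POINT) {
            return false; // outside the Unicode code space
        }
        // U+FDD0..U+FDEF, or a code point ending in FFFE or FFFF in any plane.
        return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
    }

    // Counts noncharacters across the whole code space.
    static int countNoncharacters() {
        int count = 0;
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            if (isNoncharacter(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isNoncharacter(0xFFFD));   // false (REPLACEMENT CHARACTER)
        System.out.println(isNoncharacter(0xFFFE));   // true
        System.out.println(isNoncharacter(0x10FFFF)); // true
        System.out.println(countNoncharacters());     // 66
    }
}
```
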
Having access to the real Unicode properties is more important
than having these java versions, which don't work right.
See part 2, please.
--tom