JLS bug (unicode escapes)?

Thu Jan 7 07:15:09 PST 2010

On Thu, Jan 7, 2010 at 6:48 AM, Reinier Zwitserloot
<reinier at zwitserloot.com> wrote:
> Am I reading this:
>
> http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.3
>
> correctly?
>
> A UnicodeMarker seems to be defined as, in regexp terms: "u+" instead of the
> expected "u". So, that would mean:
>
> \uuuuuuuuuuuuuuuuuuuuuuu0041  will still turn into "A" just like \u0041
> would. What on earth is the thinking behind this?
>
> Amazingly, I tested this in javac and it actually works:
> System.out.println("\uuuuuuuu0041"); will print 'A' to stdout. At the very
> least the descriptive text in chapter 3.3 should highlight this oddity.

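Section 3.3 does explain the thinking behind this, in the two
paragraphs starting with, "The Java programming language specifies a
standard way of transforming a program written in Unicode into
ASCII...". In short, a tool converting a program to ASCII adds an
extra u to every escape already present in the source, so that the
conversion can be reversed later without confusing the two kinds.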
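
For illustration, here is a minimal sketch of the behaviour and of the
kind of conversion that section has in mind. The class and method
names (UnicodeEscapeDemo, single, many, toAscii) are made up for this
example and do not come from the JLS or the JDK, and toAscii is only
my rough reading of the transformation, not a complete implementation
(it does not treat an escaped backslash specially, for instance):

```java
public class UnicodeEscapeDemo {
    // One u after the backslash: the ordinary escape for 'A'.
    static String single() { return "\u0041"; }

    // Several u's: translated to the same character before tokenization.
    static String many() { return "\uuuuuuuu0041"; }

    // Rough sketch of a Unicode-to-ASCII conversion (hypothetical helper):
    // existing escapes gain one extra u, non-ASCII characters become
    // single-u escapes, so the two can be told apart when converting back.
    static String toAscii(String src) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0; // contiguous backslashes seen so far
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '\\') {
                backslashes++;
                out.append(c);
                // Only an odd-numbered backslash can begin an escape.
                boolean eligible = backslashes % 2 == 1;
                if (eligible && i + 1 < src.length() && src.charAt(i + 1) == 'u') {
                    out.append('u'); // mark the pre-existing escape
                }
            } else {
                backslashes = 0;
                if (c > 127) {
                    out.append(String.format("\\u%04x", (int) c));
                } else {
                    out.append(c);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(single().equals(many()));
        // The argument is the text of a source line that already
        // contains an escape, followed by a real non-ASCII character.
        System.out.println(toAscii("\\u0041 \u00e9"));
    }
}
```

Converting back would go the other way: an escape with more than one
u loses one u, and an escape with exactly one u turns back into the
character it names.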

-- Peter