JDK-8268622 - Performance issues in javac `Name` class
Archie Cobbs
archie.cobbs at gmail.com
Sun Mar 5 23:12:38 UTC 2023
Hi Jon,
Thanks for taking a look at the patch.
On Fri, Mar 3, 2023 at 5:07 PM Jonathan Gibbons <jonathan.gibbons at oracle.com>
wrote:
> I would give you inline code comments, except that it's not a PR yet. I
> note that I generally distrust the `getMessage` for any exception for which
> the message is not formally specified in some way ... in other words, don't
> assume that `e.getMessage()` by itself is interesting.
>
That makes sense, and is easy to fix - thanks for the suggestion.
> Is it possible to write a test for the bug fix in PoolReader? What is an
> example of a name encoded in two different ways?
>
In any multi-byte UTF-8 sequence, the bytes after the first are supposed to
all look like 10xxxxxx in binary. But the code is not checking that, so e.g., you
could have 11xxxxxx or 00xxxxxx instead and it would decode to the same character but
not match byte-for-byte. For example, è = c3 a8, but Convert.java would
also accept c3 e8 or c3 28 for "è".
Because the Name hash tables store UTF-8 byte sequences, if the same Name
were encoded two different ways, it would get added to the hash table twice.
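To make that concrete, here is a rough sketch. It is not the actual Convert.java or Name table code; the class and method names are made up for illustration, and a HashSet keyed on the raw bytes stands in for the real hash table. It decodes a 2-byte sequence by masking only, the way a decoder that skips the continuation-byte check would:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LenientUtf8Demo {

    // Hypothetical lenient decoder: masks the second byte and never
    // verifies that its top two bits are 10.
    static char lenientDecode2(int b1, int b2) {
        return (char) (((b1 & 0x1f) << 6) | (b2 & 0x3f));
    }

    // The missing check: a continuation byte must match 10xxxxxx.
    static boolean isContinuation(int b) {
        return (b & 0xc0) == 0x80;
    }

    public static void main(String[] args) {
        int[][] encodings = { { 0xc3, 0xa8 }, { 0xc3, 0xe8 }, { 0xc3, 0x28 } };
        Set<List<Integer>> byteKeys = new HashSet<>();  // stand-in for a table keyed on raw bytes
        Set<Character> decoded = new HashSet<>();
        for (int[] e : encodings) {
            byteKeys.add(List.of(e[0], e[1]));
            decoded.add(lenientDecode2(e[0], e[1]));
            System.out.printf("%02x %02x -> U+%04X, valid continuation: %b%n",
                    e[0], e[1], (int) lenientDecode2(e[0], e[1]), isContinuation(e[1]));
        }
        System.out.println(decoded.size() + " distinct character(s), "
                + byteKeys.size() + " distinct byte key(s)");
    }
}
```

All three byte pairs decode to U+00E8, but they are three different keys as far as a byte-keyed table is concerned. Only c3 a8 passes the continuation-byte check.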
Another way this can happen is e.g. encoding a character as a 3-byte
sequence when the character is actually small enough to fit in a 2-byte
sequence. For example, e0 84 80 encodes character 0x0100, but it should
really be encoded as c4 80.
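A minimal sketch of how such an overlong sequence could be detected, again with made-up names rather than anything from the patch: decode by masking, then compare the number of bytes used against the shortest length allowed for that character. The length table assumes the class file "modified UTF-8" rules, where U+0000 is written as two bytes.

```java
public class OverlongDemo {

    // Mask-only decode of a 2-byte sequence (no validation); hypothetical helper.
    static int decode2(int b1, int b2) {
        return ((b1 & 0x1f) << 6) | (b2 & 0x3f);
    }

    // Mask-only decode of a 3-byte sequence (no validation); hypothetical helper.
    static int decode3(int b1, int b2, int b3) {
        return ((b1 & 0x0f) << 12) | ((b2 & 0x3f) << 6) | (b3 & 0x3f);
    }

    // Shortest encoding length for a char value, assuming class file
    // modified UTF-8: U+0001..U+007F take 1 byte, U+0000 and
    // U+0080..U+07FF take 2 bytes, U+0800..U+FFFF take 3 bytes.
    static int shortestLength(int c) {
        if (c != 0 && c < 0x80) return 1;
        if (c < 0x800) return 2;
        return 3;
    }

    // A sequence is overlong if it used more bytes than necessary.
    static boolean isOverlong(int c, int bytesUsed) {
        return bytesUsed > shortestLength(c);
    }

    public static void main(String[] args) {
        int viaThree = decode3(0xe0, 0x84, 0x80);
        int viaTwo = decode2(0xc4, 0x80);
        System.out.printf("e0 84 80 -> U+%04X, overlong: %b%n", viaThree, isOverlong(viaThree, 3));
        System.out.printf("c4 80    -> U+%04X, overlong: %b%n", viaTwo, isOverlong(viaTwo, 2));
    }
}
```

Both sequences decode to 0x0100, and only the 3-byte form is flagged as overlong.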
Thinking more about this, I think I should create a separate bug and patch
for this particular problem. So, expect a digression on that next...
> Although conceptually simple, this is a significant change for a very low
> level data type. It would be worth doing more testing than just the usual
> langtools tests. For example, if you build JDK before and after this
> change, are the generated class files the same?
>
Definitely a test worth doing.
-Archie
--
Archie L. Cobbs