RFR: 8218675: Reduce verification overhead in ClassFileParser

David Holmes david.holmes at oracle.com
Mon Feb 11 11:21:51 UTC 2019


On 11/02/2019 7:07 pm, Claes Redestad wrote:
> On 2019-02-11 09:44, David Holmes wrote:
>>
>>> All of the modified UTF8-encoded bytes UTF8::next would skip over will
>>> have values >= 128 (assuming legal UTF-8, which UTF8::next already
>>> assumes), so they can't match the special characters we're looking
>>> for anyway, so the switch will never match any of them. Thus the new
>>> method is semantically equivalent, but measurably faster.
>>
>> New code compares p1, p2, p3 ... to each of the special characters to 
>> see that it doesn't match. Or are you saying the switch somehow 
>> generates a set of range checks rather than comparisons??
> 
> No, I'm saying that any byte value that would have been skipped over by
> UTF8::next will be skipped over by the switch, since they are all 128
> and higher. It's impossible that, say, a ';' hides in one of the bytes
> of a codepoint.

Understood, but my query is about how many comparisons are needed before
deciding that "this is not the byte you're looking for". The old code
first compares against 128, so a high byte is skipped after one
comparison. The new code appears to check all 6 values before
determining that.
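
To make the question concrete, here is a rough sketch of the two loop
shapes with a naive comparison counter. The function names and the set of
six special characters are assumptions for illustration, not the actual
ClassFileParser code, and the old shape is simplified to one test per byte
rather than one UTF8::next call per code point. It also models the switch
as six sequential comparisons, which a compiler is free to lower
differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the two loop shapes under discussion.
// Assumed set of six special characters: . ; [ / < >
static bool is_special(unsigned char c, std::size_t& comparisons) {
  static const unsigned char kSpecial[] = {'.', ';', '[', '/', '<', '>'};
  for (unsigned char s : kSpecial) {
    ++comparisons;  // one equality test per candidate character
    if (c == s) return true;
  }
  return false;
}

// Old shape (simplified): test against 128 first, so a high byte
// costs exactly one comparison.
static bool old_style_has_special(const unsigned char* p, std::size_t len,
                                  std::size_t& comparisons) {
  for (std::size_t i = 0; i < len; ++i) {
    ++comparisons;  // the "< 128" test
    if (p[i] < 128 && is_special(p[i], comparisons)) return true;
  }
  return false;
}

// New shape (naive model): every byte is compared against all six
// special characters, with no high-bit test up front.
static bool new_style_has_special(const unsigned char* p, std::size_t len,
                                  std::size_t& comparisons) {
  for (std::size_t i = 0; i < len; ++i) {
    if (is_special(p[i], comparisons)) return true;
  }
  return false;
}
```

Under this model, the two bytes 0xC3 0xA9 cost 2 comparisons in the old
shape and 12 in the new one. Whether generated code actually behaves that
way depends on how the compiler lowers the switch.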

What am I missing?

David

> UTF8 was designed to allow scanning for ASCII characters byte-by-byte
> like this, and modified UTF-8 is no different except for '\0' (which we
> don't care about here).
> 
> /Claes
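
The property Claes describes can be checked with a small sketch. The
helper below is hypothetical (it is not UTF8::next or any HotSpot routine)
and covers only code points 0x80 through 0xFFFF, where the 2-byte and
3-byte forms have the same layout in standard and modified UTF-8. It
confirms that no byte of a multi-byte sequence falls below 0x80, so none
can be mistaken for an ASCII character such as ';'.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: encodes a code point in [0x80, 0xFFFF] as a
// 2-byte or 3-byte sequence and returns the number of bytes written.
static std::size_t encode_bmp(unsigned cp, unsigned char out[3]) {
  if (cp < 0x800) {
    out[0] = static_cast<unsigned char>(0xC0 | (cp >> 6));    // 110xxxxx
    out[1] = static_cast<unsigned char>(0x80 | (cp & 0x3F));  // 10xxxxxx
    return 2;
  }
  out[0] = static_cast<unsigned char>(0xE0 | (cp >> 12));          // 1110xxxx
  out[1] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));  // 10xxxxxx
  out[2] = static_cast<unsigned char>(0x80 | (cp & 0x3F));         // 10xxxxxx
  return 3;
}

// Returns true if any byte of any multi-byte encoding in the covered
// range is below 0x80, i.e. could collide with an ASCII character.
static bool any_ascii_byte_in_multibyte() {
  for (unsigned cp = 0x80; cp <= 0xFFFF; ++cp) {
    unsigned char buf[3];
    const std::size_t n = encode_bmp(cp, buf);
    for (std::size_t i = 0; i < n; ++i) {
      if (buf[i] < 0x80) return true;
    }
  }
  return false;
}
```

Modified UTF-8 additionally encodes '\0' as the two bytes 0xC0 0x80, which
are also both >= 0x80.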

