RFR: 8218675: Reduce verification overhead in ClassFileParser

Mon Feb 11 08:44:49 UTC 2019

On 11/02/2019 6:37 pm, Claes Redestad wrote:
> On 2019-02-11 02:50, David Holmes wrote:
>>>
>>> - Symbol names are routinely copied into stack- or heap-allocated
>>> buffers.
>>
>> These changes seem okay. I think you will find that with your changes
>> as_utf8_flexible_buffer and its helper as_C_string_flexible_buffer
>> are now dead code.
> 
> Yes, I have a follow-up patch to remove these and a few other things in
> the neighborhood.
> 
>>
>>> - In verify_unqualified_name we unnecessarily test to detect non-ASCII
>>> characters: each byte in multi-byte characters will be > 127, so
>>> it's faster to keep the loop simple. Rewriting as a switch improves it
>>> further (this code is exercised by some internal calls independent of
>>> -Xverify mode).
>>
>> I don't see how this can be true. With the switch we will generate a
>> test for every character against all 6 special characters. With the
>> existing code we will bail out for non-ASCII characters after one
>> comparison.
> 
> No, the old non-ASCII path only skips over the codepoint and starts
> over, i.e., UTF8::next advances p by 1 to 3 bytes. There's no bail out
> happening.

But for p1 we do one comparison and then jump to p3; p2 is never tested.

> All of the bytes of a modified UTF-8 sequence that UTF8::next would
> skip over have values >= 128 (assuming legal UTF-8, which UTF8::next
> already assumes), so they can't match the special characters we're
> looking for and the switch will never match any of them. Thus the new
> method is semantically equivalent, but measurably faster.

New code compares p1, p2, p3 ... to each of the special characters to
see that it doesn't match. Or are you saying the switch somehow
generates a set of range checks rather than comparisons?
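
To make the two loop shapes under discussion concrete, here is a minimal
sketch, not the actual ClassFileParser code. The function names and the
sequence_length helper (a stand-in for what UTF8::next does) are
hypothetical, and the type-dependent handling of '/', '<' and '>' is left
out: all six special characters are simply rejected. How a compiler lowers
the switch (comparison chain, range checks, jump table) is not shown here
and will vary by compiler.

```cpp
#include <cstddef>

// Hypothetical stand-in for UTF8::next: length in bytes of the modified
// UTF-8 sequence starting with lead byte b (assumes legal input).
static std::size_t sequence_length(unsigned char b) {
  if (b < 0x80) return 1;            // ASCII
  if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx lead byte
  return 3;                          // 1110xxxx lead byte
}

static bool is_special(char c) {
  return c == '.' || c == ';' || c == '[' ||
         c == '/' || c == '<' || c == '>';
}

// Old shape: test each lead byte for ASCII, skip whole multi-byte sequences.
bool has_no_special_skipping(const char* name, std::size_t length) {
  const char* end = name + length;
  for (const char* p = name; p < end; ) {
    unsigned char b = static_cast<unsigned char>(*p);
    if (b < 0x80) {
      if (is_special(*p)) return false;
      p++;
    } else {
      // Skip the sequence, clamped so truncated input cannot overrun.
      std::size_t n = sequence_length(b);
      std::size_t remaining = static_cast<std::size_t>(end - p);
      p += (n < remaining) ? n : remaining;
    }
  }
  return true;
}

// New shape: visit every byte. Bytes >= 128 can never equal an ASCII
// special character, so they fall through to the default case.
bool has_no_special_switch(const char* name, std::size_t length) {
  for (const char* p = name; p != name + length; p++) {
    switch (*p) {
      case '.': case ';': case '[':
      case '/': case '<': case '>':
        return false;
      default:
        break;
    }
  }
  return true;
}
```

Both functions return the same answer for legal input; the difference is
only in how many bytes are inspected and how each inspection is done.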

David

> /Claes