RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination [v4]

John R Rose jrose at openjdk.java.net
Sat Jan 15 04:32:23 UTC 2022


On Fri, 14 Jan 2022 16:26:12 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> In the micro benchmark for this bug, the bytecode has a single loop
>> head with 2 backedges. One branch is taken a lot more frequently than
>> the other (for instance, profiling reports 3820938 for the most
>> common, 134068 for the least common). When ciTypeFlow builds its loop
>> tree, the least common loop ends up as the inner loop (ciTypeFlow uses
>> the tail block number to order loops with a shared header). Then
>> ciTypeFlow clones the inner loop head.
>> 
>> The results I get then is:
>> 
>> StringBenchmark.safeDecoding    avgt    5  141.965 ± 53.511  ns/op
>> StringBenchmark.unsafeDecoding  avgt    5  123.051 ± 77.855  ns/op
>> 
>> The unsafe version uses unsafe instead of standard array accesses. The
>> reason for the unsafe version is to demonstrate that a missed range
>> check elimination in the common path hurts performance (the range
>> check is not performed on the loop iv but another phi so can't be
>> eliminated).
>> 
>> The patch I propose takes back branch count (from profile data) into
>> account when building the ciTypeFlow tree. As a consequence, the inner
>> loop (the one for which the loop head is cloned), is the most most
>> frequent one. The range check in the common path is also hoisted in
>> that case (it's now performed on the iv). Running the micro benchmark
>> again, I get:
>> 
>> StringBenchmark.safeDecoding    avgt    5  53.368 ± 2.232  ns/op
>> StringBenchmark.unsafeDecoding  avgt    5  55.565 ± 5.124  ns/op
>> 
>> Both are improved (2-3x) and the unsafe version doesn't perform better.
>
> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review

Thank you; that's much clearer.

Reviewed!

A couple of minor style issues I noticed which you might address (or not):

- The name insertion_point should be probably be named is_insertion_point or at_insertion_point  since it's a predicate
- In `assert((succs->at(ciTypeFlow::IF_NOT_TAKEN) == loop->head()) == (tail->limit() == loop->head()->start()), "bytecode and CFG not consistent")` the top-level `==` should be reversed so that the assert looks more like the previous assert.  It's easier to tell what they are doing if their structures line up with each other.

I wish we could handle profile counts on `switch` arms.  But that's a matter for another day.

-------------

Marked as reviewed by jrose (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7034


More information about the hotspot-compiler-dev mailing list