RFR: 8257837: Performance regression in heap byte buffer views
Claes Redestad
redestad at openjdk.java.net
Thu Dec 10 16:01:41 UTC 2020
On Thu, 10 Dec 2020 13:15:41 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
> As a result of the recent integration of the foreign memory access API, some of the buffer implementations now use `ScopedMemoryAccess` instead of `Unsafe`. While this works generally well, there are situations where profile pollution arises, which result in a considerable slowdown. The profile pollution occurs because the same ScopedMemoryAccess method (e.g. `getIntUnaligned`) is called with two different buffer kinds (e.g. an off heap buffer where base == null, and an on-heap buffer where base == byte[]). Because of that, unsafe access cannot be optimized, since C2 can't guess what the unsafe base access is.
>
> In reality, this problem was already known (and solved) elsewhere: the sun.misc.Unsafe wrapper does basically the same thing that ScopedMemoryAccess does. To make sure that profile pollution does not occur in those cases, argument profiling is enabled for sun.misc.Unsafe as well. This patch adds yet another case for ScopedMemoryAccess.
>
> Here are the benchmark results:
>
> Before:
>
> Benchmark Mode Cnt Score Error Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float avgt 30 0.612 ? 0.005 ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int avgt 30 2.740 ? 0.039 ms/op
> LoopOverPollutedBuffer.unsafe_get_float avgt 30 0.504 ? 0.020 ms/op
>
> After
>
> Benchmark Mode Cnt Score Error Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float avgt 30 0.613 ? 0.007 ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int avgt 30 0.304 ? 0.002 ms/op
> LoopOverPollutedBuffer.unsafe_get_float avgt 30 0.491 ? 0.004 ms/op
src/hotspot/share/oops/methodData.cpp line 1593:
> 1591: ResourceMark rm;
> 1592: char* name = inv.name()->as_C_string();
> 1593: if (!strncmp(name, "get", 3) || !strncmp(name, "put", 3)) {
Pre-existing, but `!strncmp(name, "get", 3)` seem a very circumspect way of writing `inv->name()->starts_with("get")` - which shouldn't need a `ResourceMark` either. Another observation is that `inv.klass()` isn't inlined (defined in bytecode.cpp), so introducing a local for `inv.klass()` avoids multiple calls. How about this:
if (inv.is_invokevirtual()) {
Symbol* klass = inv.klass();
if (klass == vmSymbols::jdk_internal_misc_Unsafe() ||
klass == vmSymbols::sun_misc_Unsafe() ||
klass == vmSymbols::jdk_internal_misc_ScopedMemoryAccess()) {
Symbol* name = inv.name();
if (name->starts_with("get") || name->starts_with("put")) {
return true;
}
}
}
-------------
PR: https://git.openjdk.java.net/jdk/pull/1733
More information about the core-libs-dev
mailing list