RFR: 8257837: Performance regression in heap byte buffer views

Claes Redestad redestad at openjdk.java.net
Thu Dec 10 16:01:41 UTC 2020


On Thu, 10 Dec 2020 13:15:41 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> As a result of the recent integration of the foreign memory access API, some of the buffer implementations now use `ScopedMemoryAccess` instead of `Unsafe`. While this works generally well, there are situations where profile pollution arises, which result in a considerable slowdown. The profile pollution occurs because the same ScopedMemoryAccess method (e.g. `getIntUnaligned`) is called with two different buffer kinds (e.g. an off heap buffer where base == null, and an on-heap buffer where base == byte[]). Because of that, unsafe access cannot be optimized, since C2 can't guess what the unsafe base access is.
> 
> In reality, this problem was already known (and solved) elsewhere: the sun.misc.Unsafe wrapper does basically the same thing that ScopedMemoryAccess does. To make sure that profile pollution does not occur in those cases, argument profiling is enabled for sun.misc.Unsafe as well. This patch adds yet another case for ScopedMemoryAccess.
> 
> Here are the benchmark results:
> 
> Before:
> 
> Benchmark                                            Mode  Cnt  Score   Error  Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float  avgt   30  0.612 ? 0.005  ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int      avgt   30  2.740 ? 0.039  ms/op
> LoopOverPollutedBuffer.unsafe_get_float              avgt   30  0.504 ? 0.020  ms/op
> 
> After
> 
> Benchmark                                            Mode  Cnt  Score   Error  Units
> LoopOverPollutedBuffer.direct_byte_buffer_get_float  avgt   30  0.613 ? 0.007  ms/op
> LoopOverPollutedBuffer.heap_byte_buffer_get_int      avgt   30  0.304 ? 0.002  ms/op
> LoopOverPollutedBuffer.unsafe_get_float              avgt   30  0.491 ? 0.004  ms/op

src/hotspot/share/oops/methodData.cpp line 1593:

> 1591:       ResourceMark rm;
> 1592:       char* name = inv.name()->as_C_string();
> 1593:       if (!strncmp(name, "get", 3) || !strncmp(name, "put", 3)) {

Pre-existing, but `!strncmp(name, "get", 3)` seem a very circumspect way of writing `inv->name()->starts_with("get")` - which shouldn't need a `ResourceMark` either. Another observation is that `inv.klass()` isn't inlined (defined in bytecode.cpp), so introducing a local for `inv.klass()` avoids multiple calls. How about this:

  if (inv.is_invokevirtual()) {
    Symbol* klass = inv.klass();
    if (klass == vmSymbols::jdk_internal_misc_Unsafe() ||
        klass == vmSymbols::sun_misc_Unsafe() ||
        klass == vmSymbols::jdk_internal_misc_ScopedMemoryAccess()) {
      Symbol* name = inv.name();
      if (name->starts_with("get") || name->starts_with("put")) {
        return true;
      }
    }
  }

-------------

PR: https://git.openjdk.java.net/jdk/pull/1733


More information about the core-libs-dev mailing list