[vectorIntrinsics] RFR: Optimize mem barriers for ByteBuffer cases [v4]

Radoslaw Smogura github.com+7535718+rsmogura at openjdk.java.net
Mon Aug 2 21:02:24 UTC 2021


> # Description
> This change tries to remove memory barriers (mem bars) for ByteBuffer cases.
> 
> Previously, mem bars were inserted almost unconditionally whenever an attempt to access native memory was detected. This patch follows the approach of inline_unsafe_access and inserts a barrier only if it cannot be determined whether the access is heap or off-heap (type mismatch cases are not ported).
> 
> # Testing
> Memory tests should include rolling back the JDK changes and leaving only the HotSpot changes, as the intrinsics should be well guarded.
> 
> # Notes
> Polluted cases to be addressed later
> 
> # Benchmarks
> 
> Benchmark                                (size)  Mode  Cnt    Score   Error  Units
> ByteBufferVectorAccess.arrays              1024  avgt   10   12.585 ± 0.409  ns/op
> ByteBufferVectorAccess.directBuffers       1024  avgt   10   19.962 ± 0.080  ns/op
> ByteBufferVectorAccess.heapBuffers         1024  avgt   10   15.878 ± 0.187  ns/op
> ByteBufferVectorAccess.pollutedBuffers2    1024  avgt   10  123.702 ± 0.723  ns/op
> ByteBufferVectorAccess.pollutedBuffers3    1024  avgt   10  223.928 ± 1.906  ns/op
> 
> Before
> 
> Benchmark                                (size)  Mode  Cnt    Score   Error  Units
> ByteBufferVectorAccess.arrays              1024  avgt   10   14.730 ± 0.061  ns/op
> ByteBufferVectorAccess.directBuffers       1024  avgt   10   77.707 ± 4.867  ns/op
> ByteBufferVectorAccess.heapBuffers         1024  avgt   10   76.530 ± 1.076  ns/op
> ByteBufferVectorAccess.pollutedBuffers2    1024  avgt   10  143.331 ± 1.096  ns/op
> ByteBufferVectorAccess.pollutedBuffers3    1024  avgt   10  286.645 ± 3.444  ns/op

Radoslaw Smogura has updated the pull request incrementally with one additional commit since the last revision:

  Support polluted cases.
  
  Factor loads and stores to support polluted cases.
  
  Use more immutable memory and instance fields, to avoid
  virtual calls.
  
  Use immutable memory to help unswitching loops.
  
  This code works suspiciously well (I see the loop get unswitched 4 times).
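
For readers unfamiliar with loop unswitching, the following is a minimal hand-written sketch of the idea, not the code from this patch. It assumes a loop-invariant test (here `hasArray()`, chosen only for illustration) and shows the same loop before and after the test is hoisted out, which is conceptually what the JIT does when it unswitches. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

class UnswitchSketch {
    // Before: the loop-invariant test is evaluated inside the loop body.
    static long sumBranchInLoop(ByteBuffer bb) {
        final boolean onHeap = bb.hasArray(); // does not change while looping
        final int n = bb.limit();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (onHeap) {
                sum += bb.array()[bb.arrayOffset() + i];
            } else {
                sum += bb.get(i); // absolute get, position is untouched
            }
        }
        return sum;
    }

    // After: the test is hoisted, leaving one specialized loop per outcome.
    static long sumUnswitched(ByteBuffer bb) {
        final int n = bb.limit();
        long sum = 0;
        if (bb.hasArray()) {
            final byte[] a = bb.array();
            final int off = bb.arrayOffset();
            for (int i = 0; i < n; i++) {
                sum += a[off + i];
            }
        } else {
            for (int i = 0; i < n; i++) {
                sum += bb.get(i);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(8);
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        for (int i = 0; i < 8; i++) {
            heap.put(i, (byte) i);
            direct.put(i, (byte) i);
        }
        // Both forms must agree for both buffer kinds.
        System.out.println(sumBranchInLoop(heap) == sumUnswitched(heap));
        System.out.println(sumBranchInLoop(direct) == sumUnswitched(direct));
    }
}
```

Each additional invariant condition that gets unswitched doubles the number of specialized loop copies, so several unswitchings of one loop means the compiler has produced a specialized copy for each combination of conditions.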
  
  ```
  Benchmark                                (size)  Mode  Cnt   Score   Error  Units
  ByteBufferVectorAccess.arrayCopy           1024  avgt   10  14.524 ± 0.356  ns/op
  ByteBufferVectorAccess.directBuffers       1024  avgt   10  19.633 ± 0.137  ns/op
  ByteBufferVectorAccess.heapBuffers         1024  avgt   10  19.148 ± 0.505  ns/op
  ByteBufferVectorAccess.pollutedBuffers2    1024  avgt   10  31.682 ± 0.762  ns/op
  ByteBufferVectorAccess.pollutedBuffers3    1024  avgt   10  74.878 ± 1.127  ns/op
  ByteBufferVectorAccess.pollutedBuffers4    1024  avgt   10  71.133 ± 1.822  ns/op
  ByteBufferVectorAccess.pollutedBuffers5    1024  avgt   10  66.990 ± 1.323  ns/op
  ```
  
  With loop unrolling
  ```
  Benchmark                                (size)  Mode  Cnt     Score    Error  Units
  ByteBufferVectorAccess.arrayCopy           1024  avgt   10    14.517 ±  0.103  ns/op
  ByteBufferVectorAccess.directBuffers       1024  avgt   10    12.140 ±  0.134  ns/op
  ByteBufferVectorAccess.pollutedBuffers2    1024  avgt   10    34.582 ±  0.250  ns/op
  ByteBufferVectorAccess.pollutedBuffers3    1024  avgt   10    69.405 ±  0.845  ns/op
  ByteBufferVectorAccess.pollutedBuffers4    1024  avgt   10    58.719 ±  0.491  ns/op
  ByteBufferVectorAccess.pollutedBuffers5    1024  avgt   10    60.044 ±  0.338  ns/op
  ```
  plus the heap buffer case, which sometimes executes slower...
  ```
  ByteBufferVectorAccess.heapBuffers    1024  avgt   10  15.878 ± 0.423  ns/op
  ```
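
As background for the `pollutedBuffers*` rows, here is a rough sketch of what a "polluted" access pattern could look like. It assumes "polluted" means that one shared call site sees several kinds of ByteBuffer (heap, direct, read-only), so the compiler cannot specialize for a single kind. The benchmark source is not shown in this message, so the class name, method name, and buffer mix below are illustrative assumptions only, and the sketch uses plain `get`/`put` rather than the Vector API.

```java
import java.nio.ByteBuffer;

class PollutedSketch {
    // One shared routine: every caller funnels through the same
    // get/put call sites, whatever kind of buffer it passes in.
    static void copy(ByteBuffer src, ByteBuffer dst) {
        final int n = Math.min(src.limit(), dst.limit());
        for (int i = 0; i < n; i++) {
            dst.put(i, src.get(i)); // absolute access, positions untouched
        }
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);
        for (int i = 0; i < heap.limit(); i++) {
            heap.put(i, (byte) i);
        }
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        ByteBuffer heapCopy = ByteBuffer.allocate(1024);

        // The same call site now observes heap, direct, and read-only sources.
        copy(heap, direct);
        copy(direct, heapCopy);
        copy(heap.asReadOnlyBuffer(), direct);

        System.out.println("round trip intact: " + heap.equals(heapCopy));
    }
}
```

In a setup like this, a call site that only ever sees one buffer kind can be compiled for that kind alone, while the mixed case needs a check per access path, which is the situation the polluted benchmarks appear to measure.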

-------------

Changes:
  - all: https://git.openjdk.java.net/panama-vector/pull/104/files
  - new: https://git.openjdk.java.net/panama-vector/pull/104/files/ed6c744d..4852ea23

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=panama-vector&pr=104&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=panama-vector&pr=104&range=02-03

  Stats: 403 lines in 11 files changed: 287 ins; 0 del; 116 mod
  Patch: https://git.openjdk.java.net/panama-vector/pull/104.diff
  Fetch: git fetch https://git.openjdk.java.net/panama-vector pull/104/head:pull/104

PR: https://git.openjdk.java.net/panama-vector/pull/104

