[foreign-memaccess+abi] RFR: Add benchmark on ResourceScope::close
Maurizio Cimadamore
mcimadamore at openjdk.java.net
Fri Apr 16 15:26:33 UTC 2021
This patch adds a new benchmark for ResourceScope::close. I think it's interesting to measure the tradeoffs provided by the various configurations; we test 4 different configurations:
* confined scope
* shared scope
* implicit scope
* implicit scope, with periodic calls to System::gc
For each configuration, we create a scope, allocate a segment with it (of fixed size) and then close the scope.
The benchmark supports a number of stress modes:
* NONE - i.e. no extra work; only the benchmark itself is run
* MEMORY - this puts additional memory pressure by creating lots of small byte arrays at the start of the benchmark; designed to disrupt implicit scopes
* THREADS - this creates extra threads which are spinning in a busy loop; designed to disrupt explicit shared scopes
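As a rough illustration of the MEMORY and THREADS stress modes described above, here is a minimal sketch in plain Java. It is not the code from the patch: the class name `StressSketch`, the helper names `fillHeap` and `startSpinners`, and the array and thread counts are all hypothetical, chosen only to show the shape of the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class StressSketch {
    /** MEMORY-style stress (hypothetical helper): retain many small byte arrays. */
    public static List<byte[]> fillHeap(int count, int size) {
        List<byte[]> arrays = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            arrays.add(new byte[size]);
        }
        return arrays; // the caller keeps this reference so the arrays stay live
    }

    /** THREADS-style stress (hypothetical helper): daemon threads spinning until the flag is cleared. */
    public static List<Thread> startSpinners(int count, AtomicBoolean running) {
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                while (running.get()) {
                    Thread.onSpinWait(); // busy loop
                }
            });
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    public static void main(String[] args) throws InterruptedException {
        // Counts below are assumptions for illustration, not the values used by the benchmark.
        List<byte[]> ballast = fillHeap(10_000, 64);
        AtomicBoolean running = new AtomicBoolean(true);
        List<Thread> spinners = startSpinners(2, running);
        // ... the measured work would run here ...
        running.set(false);
        for (Thread t : spinners) {
            t.join();
        }
        System.out.println(ballast.size() + " arrays, " + spinners.size() + " threads");
    }
}
```

In a JMH benchmark, setup of this kind would typically live in a setup method and the spinner flag would be cleared in the matching teardown; that placement is also an assumption here.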
Results are as follows:
Benchmark                                             (mode)  Mode  Cnt      Score    Error  Units
ResourceScopeClose.confined_close                       NONE  avgt   30      0.107 ±  0.002  us/op
ResourceScopeClose.confined_close:·gc.time              NONE  avgt   30     39.000              ms
ResourceScopeClose.confined_close                     MEMORY  avgt   30      0.114 ±  0.006  us/op
ResourceScopeClose.confined_close:·gc.time            MEMORY  avgt   30     12.000              ms
ResourceScopeClose.confined_close                    THREADS  avgt   30      0.115 ±  0.002  us/op
ResourceScopeClose.confined_close:·gc.time           THREADS  avgt   30     30.000              ms
ResourceScopeClose.implicit_close                       NONE  avgt   30      5.654 ±  9.617  us/op
ResourceScopeClose.implicit_close:·gc.time              NONE  avgt   30  15540.000              ms
ResourceScopeClose.implicit_close                     MEMORY  avgt   30      4.085 ±  4.375  us/op
ResourceScopeClose.implicit_close:·gc.time            MEMORY  avgt   30  23126.000              ms
ResourceScopeClose.implicit_close                    THREADS  avgt   30      2.380 ±  2.585  us/op
ResourceScopeClose.implicit_close:·gc.time           THREADS  avgt   30  15940.000              ms
ResourceScopeClose.implicit_close_systemgc              NONE  avgt   30     31.301 ±  0.667  us/op
ResourceScopeClose.implicit_close_systemgc:·gc.time     NONE  avgt   30  14520.000              ms
ResourceScopeClose.implicit_close_systemgc            MEMORY  avgt   30   1502.083 ± 28.460  us/op
ResourceScopeClose.implicit_close_systemgc:·gc.time   MEMORY  avgt   30  23087.000              ms
ResourceScopeClose.implicit_close_systemgc           THREADS  avgt   30     30.733 ±  0.704  us/op
ResourceScopeClose.implicit_close_systemgc:·gc.time  THREADS  avgt   30  14551.000              ms
ResourceScopeClose.shared_close                         NONE  avgt   30      8.850 ±  0.936  us/op
ResourceScopeClose.shared_close:·gc.time                NONE  avgt   30      6.000              ms
ResourceScopeClose.shared_close                       MEMORY  avgt   30      8.401 ±  0.506  us/op
ResourceScopeClose.shared_close                      THREADS  avgt   30     10.966 ±  0.349  us/op
ResourceScopeClose.shared_close:·gc.time             THREADS  avgt   30      4.000              ms
Of course the confined case comes out on top; very little GC activity there, very good perf, and very low variance.
Shared scopes are second best - performance is not quite as good as in the confined case (going by the table, closing takes roughly 8-11 us versus about 0.1 us, i.e. nearly two orders of magnitude slower) - but with low variance and low GC activity.
Then there are implicit scopes. In the version without System::gc calls, the scores alone might trick us into thinking that the results are good. In reality, if you look at GC activity, you see that a huge amount of time (~15s !!) is spent on GC. What's worse, and what cannot be appreciated here (as I didn't find a way to have JMH report the data), is that, observing the benchmark process with `top` in the implicit case w/o calls to `System::gc`, we see peaks of resident memory up to 10g (!!).
The version which periodically calls `System::gc` helps with keeping resident memory under control (never exceeds 1g that way) - but as you can see, if we go for explicit calls to System::gc, the cost gets higher the more heap is used (look at the MEMORY stress mode).
Overall, confined and shared segments are very good, compared to what is possible to achieve using cleaners; closing a shared scope is more expensive, yes (while we don't think we can improve these numbers much, we'll keep looking for opportunities), but there's none of the system thrashing that occurs when the benchmark relies only on the GC. Also, this benchmark is rather unrealistic since it basically does nothing with the segment created from the scope; as soon as some real code is added there, the additional cost for closing the segment will likely be washed away in many cases.
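To make the contrast between deterministic close and cleaner-based release more concrete, here is a minimal sketch using only `java.lang.ref.Cleaner` from the standard library. It does not use the ResourceScope API and is not taken from the benchmark: `CleanupSketch`, `Resource`, `explicitClose`, `implicitRelease` and the iteration counts are hypothetical names and values, and the "resource" is just a counter increment standing in for freeing native memory.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanupSketch {
    public static final Cleaner CLEANER = Cleaner.create();
    // Counts how many (hypothetical) resources have been released so far.
    public static final AtomicInteger released = new AtomicInteger();

    /** Hypothetical stand-in for something that owns native memory. */
    public static final class Resource implements AutoCloseable {
        private final Cleaner.Cleanable cleanable;

        public Resource() {
            // The cleanup action must not capture `this`, or the object would never become unreachable.
            this.cleanable = CLEANER.register(this, released::incrementAndGet);
        }

        @Override
        public void close() {
            cleanable.clean(); // the action runs at most once, even if close() is called again
        }
    }

    /** Explicit style: each resource is released deterministically when the try block exits. */
    public static int explicitClose(int n) {
        int before = released.get();
        for (int i = 0; i < n; i++) {
            try (Resource r = new Resource()) {
                // work with the resource would go here
            }
        }
        return released.get() - before;
    }

    /** Implicit style: resources are dropped and released only after the GC notices them. */
    public static void implicitRelease(int n, int gcEvery) {
        for (int i = 0; i < n; i++) {
            new Resource(); // unreachable immediately, but not yet released
            if (gcEvery > 0 && (i + 1) % gcEvery == 0) {
                System.gc(); // periodic hint, analogous to the systemgc variant
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("released by explicit close: " + explicitClose(1_000));
        int before = released.get();
        implicitRelease(100_000, 10_000);
        // How many have been released at this point depends on GC timing, so no fixed value is expected.
        System.out.println("released implicitly so far: " + (released.get() - before));
    }
}
```

With the explicit path the release count is known as soon as the loop ends; with the implicit path it depends on when the collector runs, which is the behaviour the GC time and resident memory observations above reflect.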
-------------
Commit messages:
- Add benchmark on ResourceScope::close
Changes: https://git.openjdk.java.net/panama-foreign/pull/508/files
Webrev: https://webrevs.openjdk.java.net/?repo=panama-foreign&pr=508&range=00
Stats: 132 lines in 1 file changed: 132 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/panama-foreign/pull/508.diff
Fetch: git fetch https://git.openjdk.java.net/panama-foreign pull/508/head:pull/508
PR: https://git.openjdk.java.net/panama-foreign/pull/508