RFR(L): 8186027: C2: loop strip mining

Nils Eliasson nils.eliasson at oracle.com
Thu Nov 23 21:59:48 UTC 2017


Hi,


On 2017-11-23 15:18, Roland Westrelin wrote:
> Hi Vladimir,
>
>> I am running the tests again. But if this repeats, that and the presence of
>> this Sparse.small regression suggest to me that maybe we should keep this
>> optimization off by default - keep UseCountedLoopSafepoints false.
>>
>> We may switch it on later with additional changes which address regressions.
>>
>> What do you think?
> If the inner loop runs for a small number of iterations and the compiler
> can't statically prove it, I don't see a way to remove the overhead of
> loop strip mining entirely. So I'm not optimistic the regression can be
> fixed.
Agreed. In other words: loop strip mining adds a guarantee that
time-to-safepoint won't be too long, and that guarantee has a small cost.
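
As a rough illustration of where that cost comes from, here is a hand-written
Java-level sketch of the shape of the transformation. It is not what C2
actually emits: the class and method names, the STRIP constant (standing in
for LoopStripMiningIter), and the polls counter that marks where a safepoint
poll would sit are all made up for this example.

```java
import java.util.Arrays;

class StripMiningSketch {
    // Hypothetical strip length; plays the role of LoopStripMiningIter.
    static final int STRIP = 1000;

    // Counts how often the outer loop reaches the spot where a poll would sit.
    static long polls = 0;

    // Plain counted loop: without a poll in the body, a thread running it
    // cannot reach a safepoint until the whole loop has finished.
    static long sumPlain(int[] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // Strip-mined shape: the inner loop stays poll-free and runs at most
    // STRIP iterations; the poll moves to the outer loop.
    static long sumStripMined(int[] a) {
        long s = 0;
        int start = 0;
        while (start < a.length) {
            // Compute the strip end in long arithmetic to avoid int overflow.
            int end = (int) Math.min((long) start + STRIP, a.length);
            for (int i = start; i < end; i++) {
                s += a[i];
            }
            polls++; // a safepoint poll would sit here, once per strip
            start = end;
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = new int[2500];
        Arrays.fill(a, 1);
        System.out.println(sumPlain(a) + " " + sumStripMined(a) + " polls=" + polls);
    }
}
```

Note that even for a very short array the outer loop and its bookkeeping are
still executed once, which is the overhead Roland describes for inner loops
with a small trip count that the compiler cannot prove statically.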

The current situation is that we get some extra performance with
UseCountedLoopSafepoints off by default, but some users have a bad
experience when they run into long time-to-safepoint pauses or failures
(https://bugs.openjdk.java.net/browse/JDK-5014723). I would rather turn the
tables: have loop strip mining on, and let power users experiment
with turning it off for an uncertain performance boost.

> If loop strip mining defaults to false, would there be any regular
> testing on your side?
We would have to add some.
>
> It seems to me that it would make sense to enable loop strip mining
> depending on what GC is used: it makes little sense for parallel gc but
> we'll want it enabled for Shenandoah for instance. Where does G1 fit? I
> can't really say and I don't have a strong opinion. But as I understand,
> G1 was made default under the assumption that users would be ok trading
> throughput for better latency. Maybe, that same reasoning applies to
> loop strip mining?

Scimark.sparse.small shows a regression, but a long time-to-safepoint
has a throughput cost in some settings, as in the companion benchmark
scimark.sparse.large. Numbers using G1:

-XX:-UseCountedLoopSafepoints (default) ~86 ops/m
-XX:+UseCountedLoopSafepoints ~106 ops/m
-XX:+UseCountedLoopSafepoints -XX:LoopStripMiningIter=1000 ~111 ops/m

I would prefer having it on by default, at least with G1. Let's ask the G1
GC team for their opinion.

// Nils

>
> Roland.



More information about the hotspot-compiler-dev mailing list