Intrinsic methods and time to safepoint

Fri Sep 17 08:52:13 UTC 2021

On 9/16/21 1:28 PM, Roland Westrelin wrote:
> 
>> I believe we should have a policy to cover how long an intrinsic can
>> delay without responding to a safepoint, and that it should be in the
>> millisecond range. It would make almost no difference to the
>> performance of encryption if chunks handled by a fast intrinsic were,
>> say, about a megabyte. The difference in performance is so small as to
>> be immeasurable, and the improvement in the performance of other threads
>> is vast.
> 
> I agree with you (seems like a no-brainer) but I have a couple of
> comments about implementation details.
> 
> Those intrinsics usually call some stub. It's not possible, AFAICT, to
> have the safepoint in the stub itself. So we need some loop that
> repeatedly calls the stub. That loop can be added either 1) by the JIT,
> as IR when the intrinsic is expanded, or 2) in Java code, that is, the
> Java library code needs to be refactored.
> 
> 2) would seem much easier to implement and would work for both C1 and C2
> (if some of these intrinsics end up implemented by C1).

OK. I guess the problem is that the call to the stub doesn't have an oop
map.

The tricky cases seem to be in the crypto code, which is already rather
fiddly. It's usually simple enough for intrinsics to return the amount of
work left to do, I guess.
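
Something along these lines is what I have in mind for option 2. This is
only a minimal sketch: processChunk is a hypothetical stand-in for an
intrinsified method (the name, the signature, the XOR "cipher" and the
1 MB cap are all assumptions for illustration, not anything in the JDK).
It handles at most one chunk per call and returns the amount of work
left, so the loop that calls it stays in ordinary Java code.

```java
final class ChunkedStub {
    // Assumed cap on the work done per stub call (about a megabyte).
    static final int CHUNK_BYTES = 1 << 20;

    // Hypothetical stand-in for an intrinsified method. It processes at
    // most CHUNK_BYTES starting at off and returns the bytes still to do.
    static int processChunk(byte[] src, int off, int len, byte[] dst, byte key) {
        int n = Math.min(len, CHUNK_BYTES);
        for (int i = 0; i < n; i++) {
            dst[off + i] = (byte) (src[off + i] ^ key);
        }
        return len - n;
    }

    // The Java-side loop. Each call returns to compiled Java code, so the
    // thread is no longer stuck inside one long-running stub call.
    static void processAll(byte[] src, byte[] dst, byte key) {
        int off = 0;
        int remaining = src.length;
        while (remaining > 0) {
            int left = processChunk(src, off, remaining, dst, key);
            off += remaining - left;   // advance by the bytes just handled
            remaining = left;
        }
    }
}
```

The point of the shape is that the stub never runs for longer than one
chunk, and the amount of work left is the only state the caller has to
carry between calls.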

> Also a note of caution about loop strip mining: it doesn't have a model
> for what the loop body costs. So it blindly assumes all loop bodies can
> be run for 1000 iterations (by default) between safepoints. Unless I'm
> missing something, with a stub running for 1ms, delays between
> safepoints could still be 1s.

It'd be interesting to do the experiment. I might try that, just for grins.
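
For reference, the worst-case arithmetic is just the number of iterations
per strip times the cost of one stub call. A tiny sketch, where the class
and method names are made up for illustration and the figures are the
ones quoted above (1000 iterations, 1ms per stub call):

```java
import java.time.Duration;

final class StripMiningBound {
    // Worst-case gap between safepoint polls if the inner strip runs
    // iterationsPerStrip times and each iteration calls a stub that
    // takes stubTime.
    static Duration worstCaseGap(long iterationsPerStrip, Duration stubTime) {
        return stubTime.multipliedBy(iterationsPerStrip);
    }

    public static void main(String[] args) {
        // Figures from the discussion: 1000 iterations, 1 ms per call.
        Duration gap = worstCaseGap(1000, Duration.ofMillis(1));
        System.out.println(gap.toMillis() + " ms");   // prints "1000 ms"
    }
}
```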

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671