RFR(XS): JDK-8199780: SetMemory0 and CopyMemory0 in unsafe.cpp need to resolve their operands

Tue Mar 20 16:25:26 UTC 2018

Hi Roman,

On 2018-03-20 16:40, Roman Kennke wrote:
> Am 20.03.2018 um 12:00 schrieb Erik Österlund:
>> Hi Roman,
>>
>> On 2018-03-20 11:36, Roman Kennke wrote:
>>> Same reason as splitting resolve -> resolve_for_read/resolve_for_write
>>> in other routines: being able to distinguish read and write access.
>>> Also, I'd rather be careful to put this stuff in central places that
>>> might over-cover it.
>> It sounds like the motivation for this in my opinion more fragile call
>> site chasing code is optimization.
>> What is the performance difference? Has this showed up in any profiles?
>> Whenever robustness is traded for performance, it would be great to have
>> some understanding about how much performance was lost.
> I don't have numbers. But it is not hard to see that copying potentially
> large arrays twice has some impact. It may only really matter in
> interpreter and C1, because C2 would most likely intrinsify anything
> that would show up in profiles, but this would still amount to startup
> time penalty I would think. I don't really intend to trade robustness
> for performance: my goal is to make a robust API that also allows GCs to
> be efficient.

Conversely, I would be surprised if there was a considerable difference 
to startup due to hitting an unnecessary write barrier for an arraycopy 
during startup, happening precisely while concurrent relocation is going 
on and the object has been previously unmodified since before relocation 
started. I think that if you want to change the API to something in my 
opinion more fragile purely for optimization purposes, I think it would 
be appropriate to at least measure if it makes a difference or not so 
that we get a good understanding about why we are doing this.

>>> I've missed Unsafe_CopySwapMemory0, good find (has this been added
>>> recently?). I'll add Access calls there, and meditate a little bit how
>>> to put this into a more central place to avoid having to fix this for
>>> every code change in unsafe.cpp ;-)
>> No, this has been around for at least 2 years. In this file, address
>> resolution is consistently done with index_oop_from_field_offset_long,
>> so I think if you want to play it safe (and I know I would), then I
>> would put the Access<>::resolve in there, and leave the rest of the call
>> sites the way they are.
> It seems that index_oop_from_field_offset_long() is otherwise only used
> for the non-heap paths (i.e. when p == NULL). Well I guess it's ok to
> put the resolve() there for now. Are those methods also meant to do
> native memory copies/fills when getting src/dst == NULL ? If so, then
> this new revision fixes the resolve() for those cases as well.

Indeed.

> Diff:
> http://cr.openjdk.java.net/~rkennke/JDK-8199780/webrev.01.diff/
> Full:
> http://cr.openjdk.java.net/~rkennke/JDK-8199780/webrev.01/
>
> Better now?

Yes, much better. It looks good now. Thank you.

> Assuming we can reach an agreement about JDK-8199801: Finer grained
> primitive arrays bulk access barriers, I'd probably also split
> index_oop_from_field_offset_long() into versions for read and write.
> Might that be acceptable?

As I said earlier, the motivator for introducing this seems to be to 
optimize. I would feel more comfortable to introduce these news 
concepts, if the performance benefits are better understood.

Thanks,
/Erik

> Roman
>