[foreign-memaccess] RFR 8231402: layout API implementation is not constant enough

Wed Sep 25 13:28:03 UTC 2019

On 25/09/2019 14:09, Maurizio Cimadamore wrote:
>
> On 25/09/2019 12:53, Jorn Vernee wrote:
>> Hi,
>>
>> Patch looks good!
>>
>> Some minor comments:
>>
>> MemorySegment.java
>>   - Minor typo @58: "using the one of the provided factory methods"
>>   - You removed the SecurityManager check from ofByteBuffer. Isn't it 
>> needed?
>
> my feeling/reasoning on the security check - the buffer has already 
> been created - how creating another view is more 'insecure' ?

Ok. I was just wondering if this was something that got removed during 
testing and then forgotten to be added back :)

Jorn

>
> Maurizio
>
>>
>> Cheers,
>> Jorn
>>
>> On 24/09/2019 16:06, Maurizio Cimadamore wrote:
>>> Hi,
>>> as the subject says, the implementation classes of the layout API do 
>>> not always store their properties into final fields, and they resort 
>>> to lazy computation, etc. This negatively impacts C2 scrutability of 
>>> same data structures.
>>>
>>> This patch fixes this situation, by changing size/alignment to be 
>>> final fields in AbstractLayout - so that they will have to be 
>>> provided before hand. I've added, for clarity, and extra 'default' 
>>> constructor to all layout implementation classes which allows to 
>>> create a layout with standard alignment and empty name.
>>>
>>> There are also few minor changes:
>>>
>>> * I've tweaked VM to also trust final fields in the layout package
>>>
>>> * I've rearranged some some scope classes so that their creation is 
>>> less straightforward, more transparent and requires less reflective 
>>> checks. This is particularly evident in HeapScope and BufferScope. 
>>> Note that I also changed the public API of MemorySegment::ofArray 
>>> and replaced that with multiple overloads (one per primitive array). 
>>> This is good because it makes the code more 'static' and also 
>>> because it removes the possibility for the user to pass in a wrong 
>>> array type.
>>>
>>> * I've re-ordered the way in which scope vs. segment is created - 
>>> that is, instead of this
>>>
>>> new XYZSegment(.., ..., ..., new XYZScope(...))
>>>
>>> We now do this:
>>>
>>> XYZScope scope = new XYZScope(...);
>>> new XYZSegment(.., ..., ..., scope)
>>>
>>> As this makes a difference for C2 (Vlad pointed this out).
>>>
>>> Webrev:
>>>
>>> http://cr.openjdk.java.net/~mcimadamore/panama/8231402_v2/
>>>
>>>
>>> With this patch, the level of performances of the memory access API 
>>> is virtually on par with Unsafe in our suite of synthetic 
>>> benchmarks, at least when using the Graal compiler (*). With C2 
>>> there are still issues which have to do mainly with (i) escape 
>>> analysis not being aggressive enough (a VM patch is required) and 
>>> (ii) inlining not working well in relatively 'cold' code (e.g. 
>>> segment creation/closure), so that some manual sprinkling of 
>>> @ForceInline annotations is required. I will pursue these in a 
>>> follow up patch.
>>>
>>> Maurizio
>>>
>>> (*) the only exception to this is a test which performs indexed 
>>> access, in which Graal compiler is not able to vectorize the loop 
>>> when using the memory access API (because of the presence of address 
>>> operation on longs); that said, performance of code compiled by the 
>>> Graal compiler with the memory access API is still superior than 
>>> that of C2 using unsafe (I'm also following up with the Graal 
>>> compiler team on this issue). There also seems to be an issue with 
>>> the liveness check which is never hoisted out of hot loops - leading 
>>> to slightly slower performances; in C2 we have a similar issue and 
>>> it's caused by the fact that the VM puts a memory barrier after 
>>> Unsafe memory access calls (since the Unsafe call could touch the 
>>> loop invariant itself!). We obviously need to relax some of these 
>>> checks if the Unsafe call occurs from within the memory access API 
>>> (which does not allow arbitrary read/write of Java fields).
>>>
>>>
>>>