[foreign-memaccess] RFR 8231402: layout API implementation is not constant enough
Jorn Vernee
jorn.vernee at oracle.com
Wed Sep 25 11:53:21 UTC 2019
Hi,
Patch looks good!
Some minor comments:
MemorySegment.java
- Minor typo @58: "using the one of the provided factory methods"
- You removed the SecurityManager check from ofByteBuffer. Isn't it
needed?
Cheers,
Jorn
On 24/09/2019 16:06, Maurizio Cimadamore wrote:
> Hi,
> as the subject says, the implementation classes of the layout API do
> not always store their properties into final fields, and they resort
> to lazy computation, etc. This negatively impacts C2 scrutability of
> same data structures.
>
> This patch fixes this situation, by changing size/alignment to be
> final fields in AbstractLayout - so that they will have to be provided
> before hand. I've added, for clarity, and extra 'default' constructor
> to all layout implementation classes which allows to create a layout
> with standard alignment and empty name.
>
> There are also few minor changes:
>
> * I've tweaked VM to also trust final fields in the layout package
>
> * I've rearranged some some scope classes so that their creation is
> less straightforward, more transparent and requires less reflective
> checks. This is particularly evident in HeapScope and BufferScope.
> Note that I also changed the public API of MemorySegment::ofArray and
> replaced that with multiple overloads (one per primitive array). This
> is good because it makes the code more 'static' and also because it
> removes the possibility for the user to pass in a wrong array type.
>
> * I've re-ordered the way in which scope vs. segment is created - that
> is, instead of this
>
> new XYZSegment(.., ..., ..., new XYZScope(...))
>
> We now do this:
>
> XYZScope scope = new XYZScope(...);
> new XYZSegment(.., ..., ..., scope)
>
> As this makes a difference for C2 (Vlad pointed this out).
>
> Webrev:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/8231402_v2/
>
>
> With this patch, the level of performances of the memory access API is
> virtually on par with Unsafe in our suite of synthetic benchmarks, at
> least when using the Graal compiler (*). With C2 there are still
> issues which have to do mainly with (i) escape analysis not being
> aggressive enough (a VM patch is required) and (ii) inlining not
> working well in relatively 'cold' code (e.g. segment
> creation/closure), so that some manual sprinkling of @ForceInline
> annotations is required. I will pursue these in a follow up patch.
>
> Maurizio
>
> (*) the only exception to this is a test which performs indexed
> access, in which Graal compiler is not able to vectorize the loop when
> using the memory access API (because of the presence of address
> operation on longs); that said, performance of code compiled by the
> Graal compiler with the memory access API is still superior than that
> of C2 using unsafe (I'm also following up with the Graal compiler team
> on this issue). There also seems to be an issue with the liveness
> check which is never hoisted out of hot loops - leading to slightly
> slower performances; in C2 we have a similar issue and it's caused by
> the fact that the VM puts a memory barrier after Unsafe memory access
> calls (since the Unsafe call could touch the loop invariant itself!).
> We obviously need to relax some of these checks if the Unsafe call
> occurs from within the memory access API (which does not allow
> arbitrary read/write of Java fields).
>
>
>
More information about the panama-dev
mailing list