[foreign] RFR 8222765: Implement foreign memory access through VarHandle

Fri Apr 19 15:46:15 UTC 2019

- In GensrcVarHandles.gmk you seem to have a mistake in the generation 
command:

  274 # List the types to generate source for, with capitalized first 
letter
  275 VARHANDLES_MEMORY_ADDRESS_TYPES := Byte Short Char Int Long Float 
Double
  276 $(foreach t, $(VARHANDLES_MEMORY_ADDRESS_TYPES), \
  277   $(eval $(call 
GenerateVarHandleMemoryAddress,VAR_HANDLE_BYTE_ARRAY_$t,$t)))

You still have to change the name VAR_HANDLE_BYTE_ARRAY_ to something 
else (probably forgotten when copy pasting?)

- I agree that MemoryAddress should be split, but I don't think we can 
"just" add a length() method to the MemoryRegion class, since not all 
memory regions have a length. E.g. when a pointer is returned by a 
native call, we don't know any memory bounds. I think the same problem 
also exists when deserializing a pointer. Is it okay if a length() 
method would throw UOE in those cases?

- Also in MemoryAddress, I think the name 'limit' instead of 'narrow' to 
shrink a region is a little more colloquial. (e.g. looking at 
Buffer/Stream)

- I'd also like to advocate for using an enum to represent the 
MemoryScope characteristics, instead of a long. The advantage is that 
you know exactly which constants are acceptable to pass there, as well 
as IDE autocomplete being able to help you out better. If the parameter 
type is changed to Set<Characteristics>, this can easily be passed 
ad-hoc using Set.of, or EnumSet.of.

- Should the generated template classes be added to the patch as well? 
(I've seen that done in some of the vector API patches).

* Also, since you mention performance and liveliness checks not being 
hoist-able. Just out of curiosity; I'm wondering why the change to 
ciField doesn't take care of that already?

Jorn

Maurizio Cimadamore schreef op 2019-04-19 16:52:
> On 19/04/2019 15:25, Brian Goetz wrote:
>> # java.foreign.memory
>> 
>> ## MemoryAddress
>> 
>> It’s a little hard to tell exactly what this abstraction is.  The 
>> class doc says “temporal and spatial bounds”; presumably the temporal 
>> bounds are determined by the scope.  The methods `narrow()` and 
>> `offset()` hint at a spatial bound, but there’s no way to query the 
>> bound directly (no `size()` or `limit()` method), nor does it say what 
>> happens when offset/narrow violates the spatial bounds.
>> 
>> Is the model here that there is a hidden limit?  And one can narrow it 
>> so long as the new limit is less than the current limit, and offset so 
>> long as we stay under the limit?  And presumably, one can bind a 
>> layout to it, so long as there is room between the pointer and the 
>> limit for the layout?
>> 
>> I basically cannot tell whether this is the abstraction for “pointer” 
>> (subject to some bounds constraint) or “region” (describing a block of 
>> memory.), but it doesn’t quite feel like “address”.  Maybe a less 
>> loaded name, like “MemoryRef”?
> 
> I think the main issue here is that we're using the same abstraction
> to model both regions and addresses. This is covered in the document:
> 
> file:///home/maurizio/Documents/text/2019/memaccess.html
> 
> (see section region vs. address duality).
> 
> I think splitting should clarify the API quite a bit, and we can add
> the bound methods where they really belong. Changing name can help a
> bit, but I think the overlapping between the two nature of the
> MemoryAddress/MemoryRef/whatever we call it will remain confusing.
> 
>> 
>> ## MemoryScope
>> 
>> This feels mostly like an abstraction that does one thing, except for 
>> the method `confinementThread()`; this one definitely sticks out as 
>> “what’s this doing here.”  I think I get how you got here — you don’t 
>> want to have a complex hierarchy of Scope types — and this was the one 
>> bit of extra information you needed to complete the story. But 
>> something feels a little “crammed” here.
> Right, we can also create a ConfinedScope, no problem there.
>> 
>> Scope needs a method to expose its closed-state; presumably allocate() 
>> will fail if the scope is closed, but how do I check if it is closed 
>> before allocations?
> Right we have an internal 'checkAlive' method, but I have not added it
> to the public API. Will do.
>> 
>> There must be some rules for merging characteristics; presumably if 
>> the parent scope is confined, so must the child scopes.
>> 
>> # java.foreign.layout
>> 
>> General: the names for most of these classes are just too general — 
>> Address, Value, Sequence — users will curse us for polluting the 
>> namespace with things that look like what they want, but are not.  
>> (Just like java.awt.List.). There’s an easy fix: append Layout to all 
>> the concrete subtypes of Layout (SequenceLayout, GroupLayout, etc.). 
>> Similarly, Descriptor should be something like NativeDescriptor.
> 
> RIght, appending Layout is what my not so secret plan has been all 
> along.
> 
> Also, I think in this patch we can probably remove:
> 
> Descriptor
> Function
> Unresolved
> 
> Layout::isPartial
> Address:function
> Address::ofFunction
> 
> Since this stuff is mostly needed for foreign function calls interop,
> which will come later.
> 
>> 
>> I get the sense all the types in this package are intended to be 
>> immutable / value-based.  Should say that.
> Yep.
>> 
>> ## Descriptor
>> 
>> Is there a better name than “annotations”?
> Uhmm will think. It's true that this is a problem, as it always
> creates ambiguities with Java annotations.
>> 
>> `stripAnnotations` sounds like a mutative method, but I think you 
>> intend implementations to be immutable?  Maybe `withoutAnnotations`.
>> 
>> ## LayoutPath
>> 
>> I think what this abstraction is, is a path from the root of a Layout 
>> to a particular sub-layout?  And the enclosing() method provides the 
>> trail of breadcrumbs back to the root?
>> 
>> I wonder if there’s an easier way to express this.  I assume what I’m 
>> trying to do is navigate to foo.bar[n].baz.  But I don’t see how 
>> things like sequence indexes play into the story.
> 
> Not sure I get what you mean here; the general idea is that, once you
> have a path to a specific layout element, you have all the info you
> need to generate a VarHandle which can be used to navigate an address
> from a root to the desired layout element.
> 
> Any sequence layout that is traversed (like bar[n]), will add an extra
> 'long' index dimension to the resulting VarHandle.
> 
>> 
>> ## Layout
>> 
>> Having some trouble inferring the model from the API.  Can we document 
>> the model separately?
> 
> The layout model is documented in great details here:
> 
> http://cr.openjdk.java.net/~mcimadamore/panama/panama-binder-v3.html
> 
> The main idea is that we have 'value layouts' (for leaves) and 'group
> layouts' (for composite stuff). Sequence is a particular kind of a
> Group which is basically a repetition of the same layout element over
> and over (up to a number of times). Address is a special Value that is
> used to encode... well, addresses. Since we want to be friendly to
> protocol schemas, an address can be encoded with whatever size,
> endianness, etc. - in other words, it's just a special kind of value,
> that, in addition tells you the Layout that it's pointing into.
> 
> Maurizio
> 
>> 
>> 
>> 
>>> On Apr 19, 2019, at 9:32 AM, Maurizio Cimadamore 
>>> <maurizio.cimadamore at oracle.com> wrote:
>>> 
>>> And here's a javadoc link in case somebody wants to do a pass over 
>>> the API (30% feedback please :-))
>>> 
>>> http://cr.openjdk.java.net/~mcimadamore/panama/memaccess_javadoc_v1/
>>> 
>>> Maurizio
>>> 
>>> On 19/04/2019 14:25, Maurizio Cimadamore wrote:
>>>> Hi,
>>>> this patch implements the MemoryAddress/MemoryScope abstractions 
>>>> described in the document I've sent recently [1]:
>>>> 
>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8222765/
>>>> 
>>>> This patch will go on new branch of Panama, called 
>>>> "foreign-memaccess", so the patch is relative to the 'default' 
>>>> branch.
>>>> 
>>>> Most of the classes added by this patch are the Layout classes we 
>>>> already have in the existing 'foreign' branch.
>>>> 
>>>> There are however a bunch of new classes: LayoutPath, MemoryAddress 
>>>> and MemoryScope (and their respective implementations).
>>>> 
>>>> In order to generate VarHandle with the right number of dimensions, 
>>>> we use a template (so that we can automatically generate all 
>>>> variants of VarHandle for all carriers). Then, we also need some 
>>>> bytecode spinning, since a concrete VarHandle might have a number of 
>>>> components indices which depend on the layout being accessed (more 
>>>> specifically, for each traversed array, there's an extra long 
>>>> component).
>>>> 
>>>> The spinning is done in the new AddressVarHandleGenerator class.
>>>> 
>>>> I've added a smoke test for the various memory mode accesses; we 
>>>> should of course add more tests e.g. to check that all scope options 
>>>> are properly enforced (such as confinement).
>>>> 
>>>> With this patch, performances of VH access is approx 15% slower than 
>>>> a plain Unsafe call. The main issue is that the scope liveness check 
>>>> is not hoisted outside of hot loops from hotspot (since the scope 
>>>> state is a mutable field which could be changed by a different 
>>>> thread). Vlad is working on the JIT optimization story for this; the 
>>>> idea is that when access is confined, JIT will be able to trust the 
>>>> fact that it's seeing all accesses to the scope it needs to and, 
>>>> provided the scope never escapes the inlined method, the liveness 
>>>> check can be hoisted, effectively bringing us on par with Unsafe.
>>>> 
>>>> Cheers
>>>> Maurizio
>>>> 
>>>> [1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html
>>>>