[foreign] RFR 8222765: Implement foreign memory access through VarHandle

Fri Apr 19 16:42:38 UTC 2019

On 19/04/2019 16:46, Jorn Vernee wrote:
> - In GensrcVarHandles.gmk you seem to have a mistake in the generation 
> command:
>
>  274 # List the types to generate source for, with capitalized first 
> letter
>  275 VARHANDLES_MEMORY_ADDRESS_TYPES := Byte Short Char Int Long Float 
> Double
>  276 $(foreach t, $(VARHANDLES_MEMORY_ADDRESS_TYPES), \
>  277   $(eval $(call 
> GenerateVarHandleMemoryAddress,VAR_HANDLE_BYTE_ARRAY_$t,$t)))
>
> You still have to change the name VAR_HANDLE_BYTE_ARRAY_ to something 
> else (probably forgotten when copy pasting?)
Good point, I wonder why it  still works??? I'll take another look. The 
right classes are definitively generated.
>
> - I agree that MemoryAddress should be split, but I don't think we can 
> "just" add a length() method to the MemoryRegion class, since not all 
> memory regions have a length. E.g. when a pointer is returned by a 
> native call, we don't know any memory bounds. I think the same problem 
> also exists when deserializing a pointer. Is it okay if a length() 
> method would throw UOE in those cases?
Or OptionalLong
>
> - Also in MemoryAddress, I think the name 'limit' instead of 'narrow' 
> to shrink a region is a little more colloquial. (e.g. looking at 
> Buffer/Stream)
>
> - I'd also like to advocate for using an enum to represent the 
> MemoryScope characteristics, instead of a long. The advantage is that 
> you know exactly which constants are acceptable to pass there, as well 
> as IDE autocomplete being able to help you out better. If the 
> parameter type is changed to Set<Characteristics>, this can easily be 
> passed ad-hoc using Set.of, or EnumSet.of.
This is more for API review kind of thing. In the JDK we are generally 
thorn between use of enums and use of longs, both have pros and cons. 
Since this is a low level thing, I opted for longs (same decision made 
for e.g. spliterators) we can of course change later on, if needed.
>
> - Should the generated template classes be added to the patch as well? 
> (I've seen that done in some of the vector API patches).
No, that's the same approach as byte buffer handles, they are generated 
by the build and put in the gensrc folder, but never added explicitly.
>
> * Also, since you mention performance and liveliness checks not being 
> hoist-able. Just out of curiosity; I'm wondering why the change to 
> ciField doesn't take care of that already?

That's just to 'trust' final fields; w/o that not even the bound check 
(which only handles immutable value) would work :-)

Liveness check is a different can of worms...

Maurizio

>
> Jorn
>
> Maurizio Cimadamore schreef op 2019-04-19 16:52:
>> On 19/04/2019 15:25, Brian Goetz wrote:
>>> # java.foreign.memory
>>>
>>> ## MemoryAddress
>>>
>>> It’s a little hard to tell exactly what this abstraction is. The 
>>> class doc says “temporal and spatial bounds”; presumably the 
>>> temporal bounds are determined by the scope.  The methods `narrow()` 
>>> and `offset()` hint at a spatial bound, but there’s no way to query 
>>> the bound directly (no `size()` or `limit()` method), nor does it 
>>> say what happens when offset/narrow violates the spatial bounds.
>>>
>>> Is the model here that there is a hidden limit?  And one can narrow 
>>> it so long as the new limit is less than the current limit, and 
>>> offset so long as we stay under the limit?  And presumably, one can 
>>> bind a layout to it, so long as there is room between the pointer 
>>> and the limit for the layout?
>>>
>>> I basically cannot tell whether this is the abstraction for 
>>> “pointer” (subject to some bounds constraint) or “region” 
>>> (describing a block of memory.), but it doesn’t quite feel like 
>>> “address”.  Maybe a less loaded name, like “MemoryRef”?
>>
>> I think the main issue here is that we're using the same abstraction
>> to model both regions and addresses. This is covered in the document:
>>
>> file:///home/maurizio/Documents/text/2019/memaccess.html
>>
>> (see section region vs. address duality).
>>
>> I think splitting should clarify the API quite a bit, and we can add
>> the bound methods where they really belong. Changing name can help a
>> bit, but I think the overlapping between the two nature of the
>> MemoryAddress/MemoryRef/whatever we call it will remain confusing.
>>
>>>
>>> ## MemoryScope
>>>
>>> This feels mostly like an abstraction that does one thing, except 
>>> for the method `confinementThread()`; this one definitely sticks out 
>>> as “what’s this doing here.”  I think I get how you got here — you 
>>> don’t want to have a complex hierarchy of Scope types — and this was 
>>> the one bit of extra information you needed to complete the story. 
>>> But something feels a little “crammed” here.
>> Right, we can also create a ConfinedScope, no problem there.
>>>
>>> Scope needs a method to expose its closed-state; presumably 
>>> allocate() will fail if the scope is closed, but how do I check if 
>>> it is closed before allocations?
>> Right we have an internal 'checkAlive' method, but I have not added it
>> to the public API. Will do.
>>>
>>> There must be some rules for merging characteristics; presumably if 
>>> the parent scope is confined, so must the child scopes.
>>>
>>> # java.foreign.layout
>>>
>>> General: the names for most of these classes are just too general — 
>>> Address, Value, Sequence — users will curse us for polluting the 
>>> namespace with things that look like what they want, but are not.  
>>> (Just like java.awt.List.). There’s an easy fix: append Layout to 
>>> all the concrete subtypes of Layout (SequenceLayout, GroupLayout, 
>>> etc.). Similarly, Descriptor should be something like NativeDescriptor.
>>
>> RIght, appending Layout is what my not so secret plan has been all 
>> along.
>>
>> Also, I think in this patch we can probably remove:
>>
>> Descriptor
>> Function
>> Unresolved
>>
>> Layout::isPartial
>> Address:function
>> Address::ofFunction
>>
>> Since this stuff is mostly needed for foreign function calls interop,
>> which will come later.
>>
>>>
>>> I get the sense all the types in this package are intended to be 
>>> immutable / value-based.  Should say that.
>> Yep.
>>>
>>> ## Descriptor
>>>
>>> Is there a better name than “annotations”?
>> Uhmm will think. It's true that this is a problem, as it always
>> creates ambiguities with Java annotations.
>>>
>>> `stripAnnotations` sounds like a mutative method, but I think you 
>>> intend implementations to be immutable?  Maybe `withoutAnnotations`.
>>>
>>> ## LayoutPath
>>>
>>> I think what this abstraction is, is a path from the root of a 
>>> Layout to a particular sub-layout?  And the enclosing() method 
>>> provides the trail of breadcrumbs back to the root?
>>>
>>> I wonder if there’s an easier way to express this.  I assume what 
>>> I’m trying to do is navigate to foo.bar[n].baz.  But I don’t see how 
>>> things like sequence indexes play into the story.
>>
>> Not sure I get what you mean here; the general idea is that, once you
>> have a path to a specific layout element, you have all the info you
>> need to generate a VarHandle which can be used to navigate an address
>> from a root to the desired layout element.
>>
>> Any sequence layout that is traversed (like bar[n]), will add an extra
>> 'long' index dimension to the resulting VarHandle.
>>
>>>
>>> ## Layout
>>>
>>> Having some trouble inferring the model from the API.  Can we 
>>> document the model separately?
>>
>> The layout model is documented in great details here:
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/panama-binder-v3.html
>>
>> The main idea is that we have 'value layouts' (for leaves) and 'group
>> layouts' (for composite stuff). Sequence is a particular kind of a
>> Group which is basically a repetition of the same layout element over
>> and over (up to a number of times). Address is a special Value that is
>> used to encode... well, addresses. Since we want to be friendly to
>> protocol schemas, an address can be encoded with whatever size,
>> endianness, etc. - in other words, it's just a special kind of value,
>> that, in addition tells you the Layout that it's pointing into.
>>
>> Maurizio
>>
>>>
>>>
>>>
>>>> On Apr 19, 2019, at 9:32 AM, Maurizio Cimadamore 
>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>
>>>> And here's a javadoc link in case somebody wants to do a pass over 
>>>> the API (30% feedback please :-))
>>>>
>>>> http://cr.openjdk.java.net/~mcimadamore/panama/memaccess_javadoc_v1/
>>>>
>>>> Maurizio
>>>>
>>>> On 19/04/2019 14:25, Maurizio Cimadamore wrote:
>>>>> Hi,
>>>>> this patch implements the MemoryAddress/MemoryScope abstractions 
>>>>> described in the document I've sent recently [1]:
>>>>>
>>>>> http://cr.openjdk.java.net/~mcimadamore/panama/8222765/
>>>>>
>>>>> This patch will go on new branch of Panama, called 
>>>>> "foreign-memaccess", so the patch is relative to the 'default' 
>>>>> branch.
>>>>>
>>>>> Most of the classes added by this patch are the Layout classes we 
>>>>> already have in the existing 'foreign' branch.
>>>>>
>>>>> There are however a bunch of new classes: LayoutPath, 
>>>>> MemoryAddress and MemoryScope (and their respective implementations).
>>>>>
>>>>> In order to generate VarHandle with the right number of 
>>>>> dimensions, we use a template (so that we can automatically 
>>>>> generate all variants of VarHandle for all carriers). Then, we 
>>>>> also need some bytecode spinning, since a concrete VarHandle might 
>>>>> have a number of components indices which depend on the layout 
>>>>> being accessed (more specifically, for each traversed array, 
>>>>> there's an extra long component).
>>>>>
>>>>> The spinning is done in the new AddressVarHandleGenerator class.
>>>>>
>>>>> I've added a smoke test for the various memory mode accesses; we 
>>>>> should of course add more tests e.g. to check that all scope 
>>>>> options are properly enforced (such as confinement).
>>>>>
>>>>> With this patch, performances of VH access is approx 15% slower 
>>>>> than a plain Unsafe call. The main issue is that the scope 
>>>>> liveness check is not hoisted outside of hot loops from hotspot 
>>>>> (since the scope state is a mutable field which could be changed 
>>>>> by a different thread). Vlad is working on the JIT optimization 
>>>>> story for this; the idea is that when access is confined, JIT will 
>>>>> be able to trust the fact that it's seeing all accesses to the 
>>>>> scope it needs to and, provided the scope never escapes the 
>>>>> inlined method, the liveness check can be hoisted, effectively 
>>>>> bringing us on par with Unsafe.
>>>>>
>>>>> Cheers
>>>>> Maurizio
>>>>>
>>>>> [1] - http://cr.openjdk.java.net/~mcimadamore/panama/memaccess.html
>>>>>