RFC: Refactoring SystemABI

Mon Dec 10 11:16:24 UTC 2018

This is looking better thanks.

I think JavaSymbol is still doing more than it should, and a sign of 
that is the fact that now you need an hash table to keep track of the 
allocated stubs in order to find them. This is a new requirement that 
was not present in the old code, and is introduced by the fact that, I 
think, we are trying to look 'too hard' for whether a pointer 
corresponds to an existing JavaSymbol.

That is, when the binder extracts a function pointer from native code, 
it needs to call CallbackImpl::new passing just a pointer, and at that 
point we call JavaSymbol::findSymbol to see if there was a stub already 
at that address.

To me this is a proof that the distinction between JavaSymbol and 
SimpleSymbol is a bit artificial. Sure, we could keep exact track in the 
binder of all Java pointers, but that will not scale (we need an 
hashmap, lookups, what about concurrency?).

The idea I tried to push forward was that we simply passed symbols down 
to the SystemABI. This simplifies all the binder code which doesn't have 
to do heroics to guess whether a symbol is Java or not. Then, when 
SystemABI sees a call to a symbol whose pointer is a VM CodeBlob (and 
there's a way to check that), we can decide whether to go downcall or to 
remain above the water.

I think all this exploration around JavaSymbols has, in some way , been 
triggered by the fact that we did not want to lookup for a VM stub on 
every downcall. To this I say:

1) creating the downcall handle doesn't have to be uber-fast. Invoking 
the resulting method handle has!

2) There are in general three categories of function pointers that need 
to be deal with:

a) definitively native - those come from Library::lookup; these will 
always require to go native
b) definitively Java - those come from SystemABI::upcallHandle and can 
always be optimized away
c) unchecked - these come from interacting with native code; there's not 
enough trail of crumbs which tells us which is which; for this you 
either need to keep track of all symbols (using a map... but I'd prefer 
not) or check if it's a VM blob (preferred)

My hope is that (i) as I said, performance of downcall creation is not 
all that important and (ii) even if it is, 90% cases can be optimized 
since they fall in either (a) or (b) which the ABI can special case.

So, how about having a SINGLE Symbol implementation (e.g. SymbolImpl) 
which goes like this:

class SymbolImpl implements Symbol {
    enum Kind {
        NATIVE,
        JAVA,
        UNCHECKED;
   }

   public Kind kind() { ... }

   @Override
   public String getName() { ... }

   @Override
   public String getAddress() { ... }
}

Something like this would be enough for SystemABI::downcallHandle to 
optimize as much as possible:

switch (symbol.kind()) {
     case JAVA:
         //downcast to some well known ABI-specific Symbol subclass, and 
bypass native call
    case NATIVE:
        //do a native call
    case UNCHECKED:
       //try to retrieve upcall handler from underlying VM blob - if 
succeeds bypass the native call, otherwise do a native one
}

Note that this would merely be an optimization. So it's also totally ok 
to punt on UNCHECKED for now.

In terms of where we go from here, I think we have a choice; we can go 
ahead with what you have, and then I can have a stub at the above 
simplification, or you can do it as part of this review. Your choice. 
But overall this is a very solid piece of work, thanks!

Cheers
Maurizio

On 08/12/2018 06:17, Henry Jen wrote:
> Please review webrev.03[1].
>
> Revert changes in NMT and Symbol, SystemABI upcall/downcall use Symbol.
>
> JavaSymbol remains as a construct that extract  MH from various forms, call SystemABI.upcallStub to obtain a symbol, wrap it up with better name if available, and publish itself in the Pointer->Symbol registry so Callback can recover the stub from a pointer.
>
> Cheers,
> Henry
>
> [1] http://cr.openjdk.java.net/~henryjen/panama/SystemABI/webrev.03/webrev/
>
>
>
>> On Dec 7, 2018, at 11:34 AM, Henry Jen <henry.jen at oracle.com> wrote:
>>
>> On Dec 7, 2018, at 9:34 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>>> Summarizing from the offline discussion:
>>>
>>> 1) Regarding safety, you raise a broad (and legitimate) point which seems to be: SystemABI is too unsafe, we need some intermediate API in the middle (such as Symbol). But once you can turn a pointer into executable code (which you can do with both approaches), safety is already down on the drain. Example: what if I pass a 'getter' NativeMethodType to Symbol::methodType when in reality the symbol describes a function? Whoops.
>>>
>> True, my other concern is that SystemABI is more abroad if we make that public, thing like Layout and CallingConvention. Anyway, guess that’s no harm anyway.
>>
>>> 2) Symbol::methodHandle seems to distort the NativeMethodType API, which should be a mere collection of layout types and now has become something more. Vararg-ness is part of the signature; being a getter/setter is not - it's a semantic property of what the function does. So by doing that move we are conflating the static description with the dynamic one. Which is why I believe the simpler approach described with SystemABI::upcall/Downcall accepting/returning Symbol felt a good compromise.
>>>
>>> 3) The binder, that is, high level classes such as CallbackImpl should *not* rely on SystemABI to output things like JavaSymbols. A symbol is an opaque representation of some native entry point - and it's up to the ABI to cough up one; from the binder perspective, a symbol is just a symbol - if the binder wants to call it, only option is via SystemABI::downcallHandle.
>>>
>>> Considering this, I suggest we go back to what has been discussed few days ago:
>>>
>>> - use Symbol in the signatures of SystemABI::upcall/downcallHandle
>>> - remove Kind from NMT
>>> - remove Symbol::methodHandle
>>> - tweak the implementation so that the binder doesn't depend on the particular shape of Symbol produced by the ABI
>>> - also tweak freeUpcallStub to take a Symbol
>>>
>> Agreed.
>>
>> Cheers,
>> Henry
>>
>>
>>> Thanks
>>> Maurizio
>>>
>>>
>>> On 07/12/2018 16:24, Henry Jen wrote:
>>>> On Dec 7, 2018, at 3:45 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>>>>> I'm lost :-(
>>>>>
>>>>> I think we agreed to use Symbol in both upcall/downcall methods...
>>>>>
>>>>> Having downcall handle accept just any pointer is IMHO a step back. It's much easier to shoot yourself in the foot. Symbol was fine here.
>>>>>
>>>>> We also discussed how, for consistency, upcallStub needed to return a symbol too - if internally you want to call it JavaSymbol, fine - but I think from a public API perspective the two methods should be:
>>>>>
>>>>> public MethodHandle downcallHandle(CallingConvention cc, Symbol s, NativeMethodType nmt);
>>>>> public Symbol downcallHandle(CallingConvention cc, MethodHandle target, NativeMethodType nmt);
>>>>>
>>>>>
>>>> Pointer is the lowest level and proper for SystemABI as where it sit.
>>>> Symbol is a level higher. It’s a matter of whether we want to separate those two layers or not.
>>>>
>>>>
>>>>>
>>>>> Let's also try to keep this simple - putting the MethodHandle logic in Symbol is something that might have some value, but (1) is non essential (given the above API you can always get a MethodHandle for a Symbol using SystemABI::downcallHandle) and (2) has not been discussed in this thread.
>>>>>
>>>> Public API is only JavaExports and Symbol, everything else is internal as SystemABI, JavaSymbol.
>>>> As I said before, I don’t see why SystemABI should be external, as that’s a very important piece and likely only builder of JDK will use. Unless it is as an service API, which I don’t think is needed.
>>>>
>>>> Instead, Symbol is exposed as public API to dealing with calls, either it’s upcall or downcall.
>>>>
>>>>> Also, we should try to reduce the surface area in terms of public API and this patch seems to go in the direction of adding two new types (not yet public but...): JavaSymbol and JavaExports, both of which I don't think pass the test for 'need to be in public API'. I think a lot of the rationale behind these classes is to avoid creating a Callback wrapper in cases where the underlying symbol is coming from java - and I think it's not worth making the API more complex to solve a problem that never comes up in practice (yes, you can call the callback stub you have created from Java, but in reality, how frequent that would be?).
>>>>>
>>>>>
>>>>> In other words:
>>>>>
>>>>> 1) SystemABI::upcallStub returns a Symbol
>>>>> 2) CallbackImpl saves that symbol somewhere
>>>>> 3) CallbackImpl::asFunction creates a callback wrapper (using CallbackImplGenerator) and pass it the symbol it has
>>>>> 4) The implementation generated by CallbackImplGenerator will take care to do the downcall using SystemABI (passing the symbol). There will be two cases here:
>>>>> 4a) the Symbol is the one obtained from SystemABI::upcallHandle - in that case the downcall logic will do something special and bypass the native call
>>>>> 4b) the Symbol is some opaque entry point created by the binder (e.g. when reading a function pointer field in a struct); in that case a full downcall happens
>>>>>
>>>>> I think this is self-consistent, and does not require any extra API. What's better, we completely delegate to the ABI the task of recognizing 'its own' upcall symbols, which means that each ABI implementation could have a different notion of upcall symbols - and some ABIs might even decide not to optimize (4a) and always go native.
>>>>>
>>>>>
>>>> Sure we can combine those two layers and have SystemABI instead. With Symbol as the public API make it’s more friendly to add constructs for different sources. While SystemABI can only take MH and will need other helpers. Also the main benefit to have meaningful names, rather than make up in upcallHandle.
>>>>
>>>>> Also, please no Kind on NativeMethodType - this should be about types - and the new kind turns it more into a kind of 'method handle'.
>>>>>
>>>>>
>>>> Sort of, but I don’t feel it’s too bad. It’s describing a native method, Kind is a metadata just as varargs. We simply expand it to cover variable.
>>>>
>>>>> To summarize, it feels (probably because of the back and forth made during the discussion last week, and that's my fault too!) that a bad turn has been made somewhere. I suggest we go back to a simpler state and simply define the ABI upcall/downcall methods with the signatures shown above.
>>>>>
>>>>>
>>>>>
>>>>> Thanks - I know this feedback will probably sound harsh, and I also know you put a lot of effort into this; but we need this API to settle into a low energy state where SystemABI only depends on a bunch of public types (Pointer, Symbol, NativeMethodType, LayoutType). I think this is crucial to get this API right - the more API points we introduce, the more dependencies between the ABI and the binder, and we want to keep those to a minimum!
>>>>>
>>>> Are you talking about internal implementation or public API? I don’t think we have more public API than choose to expose SystemABI, it actually is a subset. Again, I don’t think SystemABI should be exposed as public API.
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>>> Maurizio
>>>>>
>>>>>
>>>>>
>>>>> On 07/12/2018 08:03, Henry Jen wrote:
>>>>>> Please have a look at updated webrev.02[1]
>>>>>>
>>>>>> Quick summary:
>>>>>>
>>>>>> 1. SystemABI upcall/downcall are dealing with Pointer, not Symbol
>>>>>> 2. abi.sysv.x64 package is abi.x64.sysv as suggested by Jorn
>>>>>> 3. Library.Symbol becomes the public API presenting a consistent view for native and java symbol. From which, one can get  MH for java call and Pointer for native function pointer.
>>>>>> 4. I haven’t move invokers into abi specific folder as I think most of them would be share code, let me know particular file you think should be moving, we can always do that later though.
>>>>>>
>>>>>> There are still thing to improve, but the Symbol approach come together nicely. I haven’t expose JavaSymbol just yet, it is now to be managed via Scope/Callback which manage the life-cycle.
>>>>>>
>>>>>> In a way, JavaSymbol is the UpcallHandle we had earlier.
>>>>>>
>>>>>> JavaExports was created to be the public API to obtain JavaSymbol, with API for each form of JavaSymbol construction with unbounded pointer and user will need to manage the lifecycle of the symbol. I moved them back into as JavaSymbol.of(..). If we like this, then we can open it up.
>>>>>>
>>>>>> It currently serve the global registry of native address to JavaSymbol lookup, which I am debating whether should be move into CallbackImpl.
>>>>>>
>>>>>> Let me know what you think. There are still some cleanup can be done, also if we like the Symbol approach, we should add some test cases.
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>> [1]
>>>>>> https://cr.openjdk.java.net/~henryjen/panama/SystemABI/webrev.02/webrev/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Dec 4, 2018, at 10:43 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 04/12/2018 15:51, Jorn Vernee wrote:
>>>>>>>
>>>>>>>> Maybe this concern is being raised prematurely. I think we can see how the current plans pans out and then look at this again if needed.
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, let's revisit this later. I have a couple of different ideas on how to get there, but it doesn't make a lot of sense to discuss them here, as it's probably too early - and I don't want to add to the confusion.
>>>>>>>
>>>>>>> To recap (for Henry): let's have SystemABI::upcallStub return a Symbol and take it from there.
>>>>>>>
>>>>>>> Thanks for your comments, as usual.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>>
>>>>>>>> Jorn
>>>>>>>>
>>>>>>>> Jorn Vernee schreef op 2018-12-04 16:34:
>>>>>>>>
>>>>>>>>> Maurizio Cimadamore schreef op 2018-12-04 16:07:
>>>>>>>>>
>>>>>>>>>> On 04/12/2018 14:23, Jorn Vernee wrote:
>>>>>>>>>>
>>>>>>>>>>> We have Scope::allocateCallback to replace step 2. and 3., but that requires an annotated functional interface object. What if a user just has a MethodHandle?
>>>>>>>>>>>
>>>>>>>>>> Not sure I follow - a Callback is based on the idea that you _do have_
>>>>>>>>>> a functional interface.
>>>>>>>>>>
>>>>>>>>> The scenario I'm talking about still uses a functional interface
>>>>>>>>> _type_. But the Scope::allocateCallback api expects _an instance_ of
>>>>>>>>> such a functional interface type; So if you just have a MethodHandle
>>>>>>>>> you would first have to turn it into an instance of this functional
>>>>>>>>> interface type and then call Scope::allocateCallback. This adds a
>>>>>>>>> level of indirection, and AFAIK there's no straight forward API for
>>>>>>>>> wrapping a MethodHandle in a functional interface in the JDK
>>>>>>>>> (MethodHandleProxies comes close but it relies on reflection, so it's
>>>>>>>>> slow).
>>>>>>>>>
>>>>>>>>> So what I'm talking about is that a user can use System::upcallStub to
>>>>>>>>> make the MethodHandle callable from native, but if they have some
>>>>>>>>> other method that accepts a Callback there still needs to be a way to
>>>>>>>>> convert the value returned from upcallStub into a Callback.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I don't think it makes sense to mix the abstraction levels - the
>>>>>>>>>> binder will represent all callbacks using Callback, so I don't expect
>>>>>>>>>> a user of the high-level binder to mess with SystemABI interface to
>>>>>>>>>> allocate stubs.
>>>>>>>>>>
>>>>>>>>> I was thinking about the inverse; we have a user who is using the
>>>>>>>>> low-level SystemABI, and wants to have the high-level addition of
>>>>>>>>> Callback. This also makes sense where low-level code has to interact
>>>>>>>>> with high-level code.
>>>>>>>>>
>>>>>>>>> Jorn