Funneling Objects through void*

Thu Jun 15 23:29:56 UTC 2023

On 16-Jun-23 1:08, Maurizio Cimadamore wrote:
> Hi,
> I wanted to reply on this point, since this is also an important one.
> 
> On 15/06/2023 20:30, Johannes Kuhn wrote:
>> WRT Exceptions. I did notice that throwing an exception in an upcall 
>> crashes the JVM. Not so great. Means I need to do some extra work 
>> catching them, checking for pending exceptions (there should be none, 
>> but you never know) and return the appropriate "abort enumerating" 
>> constant.
>>
>> Not sure if this could be improved by specifying an "exception return 
>> value" when creating the upcall and then rethrow the exception when 
>> the underlying downcall returns. 
> 
> So, what should happen when an upcall fails with exceptional state? This 
> is a question we asked ourselves many (many) times. There are many 
> options, but we seem to always keep coming back to the same answer. It 
> might nevertheless be worth expanding a bit on the rationale behind 
> the current decision.
> 
> The reason we terminate the JVM when there's an unhandled exception in 
> an upcall is that if we didn't, we would have to pop all the native 
> frames until we reached some Java frame, and then rethrow the exception 
> there. While this gives a relatively intuitive behavior (the exception 
> thrown in the upcall can be caught by the code that initiated the 
> downcall to which the upcall was passed), there is one big issue: what 
> happens to the native code that was popped? We were, after all, in the 
> middle of executing code inside some function. Perhaps the function 
> needed, after the upcall, to reset some state for the next iteration. 
> Sadly, by simply stepping over the native code, we now have left the 
> native library in an inconsistent state. If you catch the exception and 
> try to call the same function again, there's no way to predict what's 
> going to happen.
> 
> The C language (with the notable exception of Windows [1]) doesn't have 
> a concept of stack unwinding (unlike C++). That said, stack unwinding is 
> extremely platform/compiler-dependent. On some platforms (notably, 
> Windows) there is a clearly supported API to do it, on some others less 
> so (Linux does have libunwind [2]). But even if we did unwinding 
> correctly, while that might save C++ code that got caught in the middle 
> of an exception, that would still do nothing for C code.
> 
> Overall, there's no silver bullet here. To recap:
> 
> * Admit that we don't know how to proceed, and terminate the JVM (what 
> we do today);
> * Pop all the native frames, leaving the native library in some 
> undefined state;
> * Attempt unwinding (assuming we can on all platforms), rescuing some 
> C++ cases, but still leaving C code in undefined state.
> 
> What we do today is blunt, but honest. Other options involving 
> popping/unwinding are possible, but the reality is, the underlying 
> native library has potential to misbehave from the point at which the 
> exception occurred.
> 
> In some APIs (e.g. libclang cursor visitors [3] work like that), an 
> upcall can return a value which says "please visit no more". If this 
> idiom was frequent enough, perhaps we could provide an upcall Linker 
> option which defines a "fallback" return value, to be used in case of 
> exceptions. For primitive returns that's a possibility, but if the 
> return value is a pointer or a struct things are more difficult: the 
> fallback segment could no longer be alive by the time it is accessed from 
> the native code surrounding the upcall (which would lead, again, to a 
> crash).

EnumWindows also works this way - return FALSE to stop enumerating.

But IMHO this does not really matter - if an exception is pending, it 
could "just return the exception value". If the C code makes another 
upcall, it gets the same value back. (Which was hopefully chosen in a 
way to get quickly back to the downcall.)

The value to return would be specified when the upcall is created, and 
should also be kept alive as long as the upcall is alive.
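
For illustration, here is a rough sketch of that idea done by hand today, 
using only java.lang.invoke. It is not a proposed Linker API. The class 
GuardedUpcall and its methods (wrap, rethrowPending, enumerate, visit) are 
hypothetical names, and enumerate() is a plain-Java stand-in for a native 
enumerating function such as EnumWindows. In real code the handle returned 
by wrap() would presumably be the one passed to Linker.upcallStub.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, not part of any JDK API.
final class GuardedUpcall {
    private final AtomicReference<Throwable> pending = new AtomicReference<>();
    private final int abortValue; // the "abort enumerating" constant

    GuardedUpcall(int abortValue) { this.abortValue = abortValue; }

    // Exception handler: remember the first failure, answer with the abort value.
    int onError(Throwable t) {
        pending.compareAndSet(null, t);
        return abortValue;
    }

    boolean isClean() { return pending.get() == null; }

    // Wraps a callback returning int so that no exception escapes it, and so
    // that every call made while an exception is pending returns abortValue.
    MethodHandle wrap(MethodHandle target) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle handler = lookup.findVirtual(GuardedUpcall.class, "onError",
                    MethodType.methodType(int.class, Throwable.class)).bindTo(this);
            MethodHandle clean = lookup.findVirtual(GuardedUpcall.class, "isClean",
                    MethodType.methodType(boolean.class)).bindTo(this);
            MethodHandle guarded =
                    MethodHandles.catchException(target, Throwable.class, handler);
            MethodHandle abort = MethodHandles.dropArguments(
                    MethodHandles.constant(int.class, abortValue), 0,
                    target.type().parameterList());
            return MethodHandles.guardWithTest(clean, guarded, abort);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // To be called after the downcall has returned: rethrows what was recorded.
    void rethrowPending() {
        Throwable t = pending.getAndSet(null);
        if (t instanceof RuntimeException) throw (RuntimeException) t;
        if (t instanceof Error) throw (Error) t;
        if (t != null) throw new RuntimeException(t);
    }

    // Stand-in for the native side: calls back once per item until told to stop.
    static int enumerate(MethodHandle callback, int count) {
        int calls = 0;
        try {
            for (long item = 1; item <= count; item++) {
                calls++;
                int keepGoing = (int) callback.invokeExact(item);
                if (keepGoing == 0) break;
            }
        } catch (Throwable t) {
            throw new IllegalStateException("callback leaked an exception", t);
        }
        return calls;
    }

    // Demo callback: fails on the third item, otherwise asks to continue.
    static int visit(long item) {
        if (item == 3) throw new IllegalStateException("failed on item " + item);
        return 1;
    }

    static MethodHandle demoVisitor() {
        try {
            return MethodHandles.lookup().findStatic(GuardedUpcall.class, "visit",
                    MethodType.methodType(int.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The caller would create one GuardedUpcall per downcall, pass the wrapped 
handle down, and call rethrowPending() once the downcall has returned. This 
only covers int returns, which is the easy case; it says nothing about 
pointer or struct returns.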

It is already possible to misuse the API and obtain the address to some 
segment, close it, and then pass the address down (bad idea, but you can 
only provide safe patterns to avoid that).

I have no idea how returning structs works at the ABI level, so I can't 
be confident that this is a workable approach - I try to avoid structs 
as parameters/return values, and prefer pointers to them instead.

- Johannes

> 
> Overall, whenever we discussed this, the general feeling was always that 
> there's no "right" or "wrong" answer: handling exceptions thrown from 
> upcalls is a matter of "choosing the right policy" for the particular 
> kind of upcall one needs to define. Sometimes it might be convenient to 
> "return 0", sometimes to just crash, other times to pop all native 
> frames, etc.
> 
> Cheers
> Maurizio
> 
> [1] - 
> https://learn.microsoft.com/en-us/cpp/cpp/structured-exception-handling-c-cpp?view=msvc-170
> [2] - https://github.com/libunwind/libunwind
> [3] - 
> https://clang.llvm.org/doxygen/group__CINDEX__CURSOR__TRAVERSAL.html#ga99a9058656e696b622fbefaf5207d715