OpenSSL and panama-foreign
Jorn Vernee
jorn.vernee at oracle.com
Wed Nov 10 14:20:28 UTC 2021
Actually, after hitting send on my last email I noticed the method is
also `synchronized` which I think will keep the cleanup action from
running until the monitor is unlocked (not sure here though). So, maybe
that's not it.
Jorn
On 10/11/2021 14:42, Jorn Vernee wrote:
> Looking at this, it seems there is a race between the cleaner running
> and the SSL_shutdown call (i.e. undefined behaviour). Note that the
> `this` object (which the cleanup action is registered on) can be
> collected as soon as it's unused in closeOutbound, and SSL_shutdown
> doesn't use it.
>
> You'd at least need `Reference.reachabilityFence(this)` after the call
> to SSL_shutdown.
>
> But, I'd suggest doing what Maurizio suggested instead. In the
> constructor of OpenSSLEngine:
>
> - Use ResourceScope.newImplicitScope() instead to create the scope [1]
> - Attach both 'ssl' and 'networkBIO' to the scope by turning them into
> memory segments using MemorySegment::ofAddressNative [2]
> - Then add a close action to the scope which frees your pointers [3]
>
> The call will keep the scope alive, which makes sure the cleanup isn't
> done during, or before, the native call.
>
> HTH,
> Jorn
>
> [1] :
> https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/ResourceScope.java#L247
> [2] :
> https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/MemorySegment.java#L791
> [3] :
> https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/ResourceScope.java#L190
>
> On 10/11/2021 13:53, Maurizio Cimadamore wrote:
>> Looking more at your code, I noticed you use cleaners:
>>
>> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L275
>>
>>
>> That is, when the engine is unreacheable, you have some cleanup actions:
>>
>> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L1868
>>
>>
>> Note that you can probably achieve the same effect by registering the
>> scope with a cleaner, and then add a "close action" on the cleaner
>> itself, to close the scope. But this is an optional suggestion.
>>
>> What I find more disturbing is that the addresses you generate, e.g.
>> this:
>>
>> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/openssl/openssl_h.java#L1327
>>
>>
>> This `ssl` address is just a raw memory address - it doesn't have any
>> real temporal bounds. It is not attached to the scope you create in
>> any way.
>>
>> So, when you do this:
>>
>> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L766
>>
>>
>> The linker won't really protect that call in any way. There's no way
>> for the linker to know whether the address is still available or not.
>> Which might be ok if the memory is managed by SSL anyway.
>>
>> But the issue here, as your debugging session shows, is that,
>> somehow, you end up calling SSL_shutdown on a bad SSL pointer (e.g.
>> NULL). Which looks like a client error (e.g. how can SSL be NULL ? )
>>
>> Maurizio
>>
>>
>>
>>
>> On 10/11/2021 09:34, Maurizio Cimadamore wrote:
>>> Hi,
>>> thanks for sharing the feedback! I'm glad you got something working.
>>> Hopefully we can fix the rest :-)
>>>
>>> On 10/11/2021 09:20, Rémy Maucherat wrote:
>>>> However, I am running into cores (apparently caused by memory
>>>> corruption) with the panama-foreign branch, while the Java 17 version
>>>> seems solid.
>>>
>>> That is interesting information. So, you say that the Java 17
>>> version works fine, but when building (latest) panama-foreign, you
>>> get some VM crash.
>>>
>>> Now, looking at the crash, it seems like it occurs in the middle of
>>> a native call:
>>>
>>> ```
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> v ~RuntimeStub::nep_invoker_blob <------------
>>> J 5738 c1
>>> org.apache.tomcat.util.openssl.openssl_h.SSL_shutdown(Ljdk/incubator/foreign/Addressable;)I
>>> (31 bytes) @ 0x00007faf82340d2c [0x00007faf8233cc20+0x000000000000410c]
>>> J 5482 c1
>>> org.apache.tomcat.util.net.openssl.panama.OpenSSLEngine.closeOutbound()V
>>> (76 bytes) @ 0x00007faf81a934e4 [0x00007faf81a93280+0x0000000000000264]
>>> ```
>>>
>>> This area changed quite a bit recently, as we are refactoring and
>>> consolidating the linking functionalities to make it simpler for
>>> developers to port the CLInker implementation to other platforms. I
>>> wonder if a regression sneaked in (possible, the refactoring were
>>> quite big).
>>>
>>> In terms of support for shared scopes, the one thing that has
>>> changed from 16 to 17 is that now if you pass arguments by reference
>>> to functions, those references' scopes are "acquired" on function
>>> enter and "released" on function exit - which means it is not
>>> possible, even in a shared context, for e.g. the target address of a
>>> native function to be unloaded in the middle of a native call. But
>>> this should add _more_ safety, not less.
>>>
>>> I suppose that you tested Java 17 by using the jextract available in
>>> the Panama EA binaries; and that you tested the bits in
>>> panama-foreign by manually building them, and using the jextract you
>>> obtained as a result (e.g. using the EA jextract against the latest
>>> panama-foreign API will NOT work).
>>>
>>> Thanks
>>> Maurizio
>>>
More information about the panama-dev
mailing list