OpenSSL and panama-foreign
Rémy Maucherat
remm at apache.org
Wed Nov 10 14:54:18 UTC 2021
On Wed, Nov 10, 2021 at 3:20 PM Jorn Vernee <jorn.vernee at oracle.com> wrote:
>
> Actually, after hitting send on my last email I noticed the method is
> also `synchronized` which I think will keep the cleanup action from
> running until the monitor is unlocked (not sure here though). So, maybe
> that's not it.
Thanks a lot for the great design ideas. Keeping around MemorySegment
instead of MemoryAddress is a much better design. I'll also use the
scope as the controller for closing everything.
Notes: The cleaner comes from a Java 11 finalize refactoring. The sync
model was also there before with the JNI code, it worked. Overall this
is a translation of the JNI calling code, the ssl there was simply a
long (so not very safe ...).
Rémy
>
> Jorn
>
> On 10/11/2021 14:42, Jorn Vernee wrote:
> > Looking at this, it seems there is a race between the cleaner running
> > and the SSL_shutdown call (i.e. undefined behaviour). Note that the
> > `this` object (which the cleanup action is registered on) can be
> > collected as soon as it's unused in closeOutbound, and SSL_shutdown
> > doesn't use it.
> >
> > You'd at least need `Reference.reachabilityFence(this)` after the call
> > to SSL_shutdown.
> >
> > But, I'd suggest doing what Maurizio suggested instead. In the
> > constructor of OpenSSLEngine:
> >
> > - Use ResourceScope.newImplicitScope() instead to create the scope [1]
> > - Attach both 'ssl' and 'networkBIO' to the scope by turning them into
> > memory segments using MemorySegment::ofAddressNative [2]
> > - Then add a close action to the scope which frees your pointers [3]
> >
> > The call will keep the scope alive, which makes sure the cleanup isn't
> > done during, or before, the native call.
> >
> > HTH,
> > Jorn
> >
> > [1] :
> > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/ResourceScope.java#L247
> > [2] :
> > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/MemorySegment.java#L791
> > [3] :
> > https://github.com/openjdk/panama-foreign/blob/foreign-jextract/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/ResourceScope.java#L190
> >
> > On 10/11/2021 13:53, Maurizio Cimadamore wrote:
> >> Looking more at your code, I noticed you use cleaners:
> >>
> >> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L275
> >>
> >>
> >> That is, when the engine is unreacheable, you have some cleanup actions:
> >>
> >> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L1868
> >>
> >>
> >> Note that you can probably achieve the same effect by registering the
> >> scope with a cleaner, and then add a "close action" on the cleaner
> >> itself, to close the scope. But this is an optional suggestion.
> >>
> >> What I find more disturbing is that the addresses you generate, e.g.
> >> this:
> >>
> >> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/openssl/openssl_h.java#L1327
> >>
> >>
> >> This `ssl` address is just a raw memory address - it doesn't have any
> >> real temporal bounds. It is not attached to the scope you create in
> >> any way.
> >>
> >> So, when you do this:
> >>
> >> https://github.com/rmaucher/openssl-panama-foreign/blob/5869df2c2e58a0a4aa8e0883c1a896cd49e96146/src/main/java/org/apache/tomcat/util/net/openssl/panama/OpenSSLEngine.java#L766
> >>
> >>
> >> The linker won't really protect that call in any way. There's no way
> >> for the linker to know whether the address is still available or not.
> >> Which might be ok if the memory is managed by SSL anyway.
> >>
> >> But the issue here, as your debugging session shows, is that,
> >> somehow, you end up calling SSL_shutdown on a bad SSL pointer (e.g.
> >> NULL). Which looks like a client error (e.g. how can SSL be NULL ? )
> >>
> >> Maurizio
> >>
> >>
> >>
> >>
> >> On 10/11/2021 09:34, Maurizio Cimadamore wrote:
> >>> Hi,
> >>> thanks for sharing the feedback! I'm glad you got something working.
> >>> Hopefully we can fix the rest :-)
> >>>
> >>> On 10/11/2021 09:20, Rémy Maucherat wrote:
> >>>> However, I am running into cores (apparently caused by memory
> >>>> corruption) with the panama-foreign branch, while the Java 17 version
> >>>> seems solid.
> >>>
> >>> That is interesting information. So, you say that the Java 17
> >>> version works fine, but when building (latest) panama-foreign, you
> >>> get some VM crash.
> >>>
> >>> Now, looking at the crash, it seems like it occurs in the middle of
> >>> a native call:
> >>>
> >>> ```
> >>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> >>> v ~RuntimeStub::nep_invoker_blob <------------
> >>> J 5738 c1
> >>> org.apache.tomcat.util.openssl.openssl_h.SSL_shutdown(Ljdk/incubator/foreign/Addressable;)I
> >>> (31 bytes) @ 0x00007faf82340d2c [0x00007faf8233cc20+0x000000000000410c]
> >>> J 5482 c1
> >>> org.apache.tomcat.util.net.openssl.panama.OpenSSLEngine.closeOutbound()V
> >>> (76 bytes) @ 0x00007faf81a934e4 [0x00007faf81a93280+0x0000000000000264]
> >>> ```
> >>>
> >>> This area changed quite a bit recently, as we are refactoring and
> >>> consolidating the linking functionalities to make it simpler for
> >>> developers to port the CLInker implementation to other platforms. I
> >>> wonder if a regression sneaked in (possible, the refactoring were
> >>> quite big).
> >>>
> >>> In terms of support for shared scopes, the one thing that has
> >>> changed from 16 to 17 is that now if you pass arguments by reference
> >>> to functions, those references' scopes are "acquired" on function
> >>> enter and "released" on function exit - which means it is not
> >>> possible, even in a shared context, for e.g. the target address of a
> >>> native function to be unloaded in the middle of a native call. But
> >>> this should add _more_ safety, not less.
> >>>
> >>> I suppose that you tested Java 17 by using the jextract available in
> >>> the Panama EA binaries; and that you tested the bits in
> >>> panama-foreign by manually building them, and using the jextract you
> >>> obtained as a result (e.g. using the EA jextract against the latest
> >>> panama-foreign API will NOT work).
> >>>
> >>> Thanks
> >>> Maurizio
> >>>
More information about the panama-dev
mailing list