MemorySegment JVM memory leak
Uwe Schindler
uschindler at apache.org
Thu Apr 23 12:32:53 UTC 2020
Hi,
> Thinking more about this, I think I stand by the existing API.
That's fine. The little bit of "thread hacks" is easy for us as a first step. I can work on testing it with some live indexes. The only thing that would really be helpful, as noted before, would be to add an offset parameter to MemorySegment::mapFromPath(). This would make it symmetric to what you can do with ByteBuffers.
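For reference, the ByteBuffer side of that symmetry is FileChannel.map(mode, position, size). The following is only a minimal sketch of mapping a region that starts at a non-zero offset. The class name OffsetMapping and the helper mapRegion are made up for illustration and are not part of any JDK or Lucene API.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OffsetMapping {

    // Hypothetical helper: maps `length` bytes of `file`, starting at byte
    // `offset`, as a read-only buffer.
    public static MappedByteBuffer mapRegion(Path file, long offset, long length)
            throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // A mapping, once established, does not depend on the channel
            // that created it, so closing the channel here is fine.
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("offset-mapping", ".bin");
        // Not deleted eagerly: on Windows the file cannot be removed while
        // the mapping is still alive.
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {10, 20, 30, 40, 50, 60});

        // Map 3 bytes starting at file offset 2; buffer index 0 is file byte 2.
        MappedByteBuffer region = mapRegion(file, 2, 3);
        System.out.println(region.get(0) + " " + region.get(2));
    }
}
```

A mapFromPath() variant with an offset would presumably look much the same at the call site: one extra long argument for the start position.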
> Short-term is going to be a little inconvenient, yes, because to
> map/unmap files you basically need to go native (or to call close() on
> the original segment and accept some thread dances)
That's my initial plan for a mockup until native APIs can be called with ABI integration.
> But not so long-term there's gonna be the next chapter in the story -
> the ABI integration - which will let you create native method handles;
> that is method handles targeting native library functions. Once this
> capability comes in (and you can already try it out in the Panama repo),
> it is relatively easy to map/unmap a file directly, w/o using JNI code
> or any kind of external dll/so lib.
That's something Elasticsearch is already waiting for. It currently uses JNA to lock pages or to install a seccomp syscall filter on startup (https://www.elastic.co/de/blog/seccomp-in-the-elastic-stack, https://fosdem.org/2020/schedule/event/security_seccomp/attachments/slides/3881/export/events/attachments/security_seccomp/slides/3881/seccomp.pdf).
> For instance you could create a truly unsafe segment w/o even using the
> standard MemorySegment::mapFromFile; below is a simple Gist which shows
> how you can do something like that using the ABI support:
>
> https://gist.github.com/mcimadamore/128ee904157bb6c729a10596e69edffd
>
> (it is less than 100 LoC - half of it is the test logic).
That's indeed cool. The actual implementation will be a bit harder, as you need to add a Windows part, too. But in general one could live with that.
This would also help to add fadvise() for standard iostream-based APIs, like when you copy files (see my other mail). Sometimes we read files just to copy/transform the contents one time, and they get deleted afterwards. Kicking all already cached pages out of the FS cache just because you load a huge file that you know you will never read again is really something you should not do. So we can improve that, too. Although for some use cases it would be cool if the standard NIO File API provided ways to handle file caching, too, because as soon as you switch to native APIs you also have to implement your own InputStreams (as the underlying file descriptor is unreachable). I hope Panama will allow calling methods that take a java.io.FileDescriptor or a Path object, which is then somehow magically converted to an integer file descriptor.
> I think that, for power-users like you, this way of doing things is
> probably more direct than bending MappedByteBuffer, or
> MappedMemorySegment exactly the way you want it. At some point you are
> gonna need some extra customization (you mentioned about
> MADV_DONTNEED,
> other people mentioned MADV_REMOVE) which might make difference in your
> case; while in some cases some additional API points will be added to
> the JDK, we can't expect the JDK to support all possible ways in which a
> client might wish to interact with memory mapped files. But with custom
> memory segments + ABI support you don't need to wait on the JDK to give
> you the knobs you want - you can just reach for them directly.
I agree with that.
> I think that's a much saner way to get things done - in a way, the whole
> mappedXYZ business is a big workaround for the fact that we have no
> other ways in Java to reason about memory mapped files (which are
> useful!) but their behavior is ultimately platform-dependent, hence some
> of the APIs in MappedByteBuffer and MappedMemorySegment are "best
> effort" or simply do nothing on certain platforms (e.g. Windows).
MappedByteBuffer works very well on Windows. The only problems are the usual shit like you can't delete the file while a byte buffer is still alive!
> So, maybe what you need, ultimately, is your own custom segment factory.
> All still written in Java - but in a "different kind" of Java.
+1
> Maurizio
Thanks,
Uwe
More information about the panama-dev mailing list