MemorySegment JVM memory leak

Uwe Schindler uschindler at apache.org
Thu Apr 23 08:47:35 UTC 2020


Hi,

> On 23/04/2020 01:34, Maurizio Cimadamore wrote:
> > No. But that doesn't seem to be a very important issue? If you don't
> > need the original safe segment, you can just ditch it and leave it to
> > the GC?
> 
> Ok, I see what you mean now.
> 
> You basically want that calling close() on the unsafe segment calls the
> cleanup action (unmap) of the original segment. And, it seems you don't
> want to write native code for it either, so you have no way to write a
> Runnable that does what you want.

The problem with that is, that the close() call must come from the same thread that originally created the segment.

I don't care about calling close on the unsafe segments, we can leave that implementation out, we just want to call close on the original segment when all is done (e.g, after days of Lucene usage), but we don't know which thread was originally opening it. So my idea would be:
Use a single-thread threadpool to submit tasks to solely handle the allocations/mmaps and later unmaps in a single thread (maybe a bottleneck). If Lucene needs a new mmapped segment for a file, send a task to this pool to mmap the original segments and create the unsafe one on top of it and return it using the task's Future. The original "safe" one is kept in that buffer pool, manages only by that single thread. When we want to unmap, we just send another task to the threadpool, which takes the original buffer out of its pool and frees it. By this the thread confinement of the original "safe" buffer is ensured.

The unsafe clones are never closed or garbage collected. That's by the way how Lucene currently work, so this is perfectly in line with current infrastructure. We create many duplicates of the original one and use them for searching logic, the original bytebuffer is kept.

> When I issued the patch originally, I allowed the possibility of passing
> an AutoCloseable as a cleanup action (not a Runnable), which meant you
> could pass the _safe_ segment as the cleanup action, and the cleanup
> action would have been called regardless of confinement.
> 
> During the review process, it was observed that this was perhaps a bit
> too magic - e.g. calling close() would leave the original segment in an
> unconsistent state (e.g. isAlive() == true, but memory freed).

Yeah. This is why I thought about the workaround. No need to unmap the unsafe closes, let's just keep the original one.

> I'll think a bit more whether something like that could be achieved in a
> way that doesn't add too much complexity to the API, I think I see maybe
> a path to get there (or maybe is the late hour :-) ).

Thanks. I have a plan for now and I can try it once there is something to test.

> > - Can you mapFromPath using a custom offset? Of course you canmap the
> whole file and create a slice, but you are wasting address space if you don't
> need it. The code behind MappedMemorySegmentImpl uses FileChannel, but
> always passes 0L as startOffset. Can we add the offset too? Should not be to
> hard to add this missing parameter.
> We could add an extra parameter - but also see above -
> ofNativeRestricted can be used to create a segment out of _any_ native
> address you might get (e.g. from a JNI call). You don't really need to
> be constrained by what's available in the API - although I agree that if
> something makes generally sense (and I think the offset parameter is in
> this category), then it should be added.

We don't want to use native code for mapping files (Lucene should be kept completely free of native code), so it would really be good to allow mmapping a part of a file instead of the whole one. If you always have to map the whole file, you may consume a lot of virtual address space (output in TOP in column VIRT, which isn't bad by design, but as most CPUs only have a 48 bits of address space and can't use everything for mapping, you are stuck with like 43 bits). So as you can give a length, why not allow to pass an offset, too - like in original FileChannel.map() call. Should I open a feature request? The implementation behind seems to support this already: https://tinyurl.com/y8pmz7kp (see the hardcoded 0L parameter in the FileChannelImpl.mapInternal call). I can make a pull request to add the offset to the API (I already signed the CLA).

Thanks,
Uwe



More information about the panama-dev mailing list