memorySegment.asByteBuffer().duplicate() causes multi-MB allocations
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Oct 26 10:17:59 UTC 2022
On 26/10/2022 10:56, Sebastian Stenzel wrote:
> Ok, let's consider this a measurement error for now. I will keep an
> eye on it and report if I get a reliably reproducible example. So
> hopefully never 😉
Heh.
The strangest thing is if you noticed a difference between 18 and 19.
The impl code has not changed at all, so I can't really explain how 19
would generate a spike in allocations.
Also, as we discussed yesterday, the `duplicate` method is really just
creating a dumb view over existing memory, so to get to 740Mb only with
these views you'll need lots of objects :-)
And... I also believe that the memory view you are looking at is just
looking at "shallow" size of an object. Which means you really have
740Mb only for the DirectByteBuffer instances, _excluding_ any on-heap
or off-heap data they might refer to.
This seems...... excessive.
But, I can see the stack trace now, and it does point at
DirectByteBuffer::duplicate - which makes sense: this method *is*
creating a new view. But, JMC says there's 72 calls to `duplicate`, and
740MB of memory. 10Mb per direct buffer doesn't make any sense
(considering this is shallow).
Another strange thing is that if you click on DirectByteBuffer, in
"before" we see a massive spike at the beginning, then nothing. In
"after" we see lots of allocation spread throughout the applications.
These allocations come from MemorySegment::asByteBuffer, so, a different
place.
My best explanation is that this method is invoked a lot, and all the
buffer views created are short-lived, so they do not impact GC activity
in any significant way. After a while, better inlining kicks in, and
allocations are removed entirely, as DirectByteBuffer::duplicate is
inlined and its result likely scalarized.
All this seems very very strange - almost like you are comparing
different runs doing different things. Or, perhaps, some measurement error.
Maurizio
>
>> On 26. Oct 2022, at 11:51, Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Ok, my JMC was out of date. Now I see what you see :-)
>>
>> It is indeed odd - and also I note that all the allocations occur at
>> the start of the recording. I wonder if maybe these are just direct
>> buffer instances that have been created by some other bits of the JDK
>> (several buffers are created at initialization time).
>>
>> Without the ability to look at where the allocations came from it is
>> a bit hard to guess. The fact that you are having troubles
>> reproducing it makes me think that you got lucky/unlucky and the
>> profiler happened to stumble on some burst direct buffer allocation,
>> presumably at JVM startup.
>>
>> Maybe a good experiment would be to lower the polling granularity,
>> and see if you can spot the allocation on all the runs?
>>
>> Maurizio
>>
>>
>> On 26/10/2022 10:18, Maurizio Cimadamore wrote:
>>>
>>> Uhm I don't have all that data that you seem to have in your image :-(
>>>
>>> My memory view looks empty. There is no allocation, and if I
>>> explicitly go and look I see 6-7 allocations in total :-)
>>>
>>> Is the recording you shared the same you are looking at?
>>>
>>> Maurizio
>>>
>>> On 26/10/2022 10:11, Sebastian Stenzel wrote:
>>>> Thanks for taking a look.
>>>>
>>>> Please note that I'm not looking for a memory leak here: The
>>>> objects are short-living (within the scope of a method) and no two
>>>> instances exist at the same time, hence no increase in heap size.
>>>>
>>>> Instead, what got me wondering was the mismatch between the amount
>>>> of memory allocated for a rather small number of objects (740 MiB
>>>> for 141 DirectByteBuffers). At least the record only contains these
>>>> 72 allocations via `ByteBuffer.duplicate()` + 69
>>>> `NativeMemorySegmentImpl.makeByteBuffer()`. I'm looking at this
>>>> chart: https://static.cryptomator.org/tmp/before890035c.png
>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/before890035c.png__;!!ACWV5N9M2RV99hQ!JckaBSJm5So1bCqFKc-DKIJwby_3IL0NrwJF28uq06RV8KF5cODe9zClnOFenKodB7r-jU1KwoEwrH3lNJ9ee622dbMqw0eIuQ$>
>>>>
>>>> I have very limited experience with JFR and JMC and don't know if
>>>> it is profiling of sampling allocations. But maybe the events are
>>>> incomplete and I'm in fact dealing with thousands of instances?
>>>>
>>>>
>>>>> On 26. Oct 2022, at 10:53, Maurizio Cimadamore
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>> Looking at this recordings with JMC is inconclusive.
>>>>>
>>>>> All I see is that your app seems to be using up to 20MB heap,
>>>>> which seems rather low.
>>>>>
>>>>> The number of live objects, at least according to the tool is a
>>>>> handful (< 10).
>>>>>
>>>>> I don't see anything here indicating a problem? (but maybe I'm
>>>>> looking in the wrong place?)
>>>>>
>>>>> Maurizio
>>>>>
>>>>> On 25/10/2022 15:15, Sebastian Stenzel wrote:
>>>>>> Ok so here are the recordings:
>>>>>>
>>>>>> with duplicate():
>>>>>> https://static.cryptomator.org/tmp/before890035c.jfr
>>>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/before890035c.jfr__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe93lmH_Uzw$>
>>>>>> without: https://static.cryptomator.org/tmp/after890035c.jfr
>>>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/after890035c.jfr__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe93JaaOc7g$>
>>>>>>
>>>>>> This is the diff in code:
>>>>>> https://github.com/cryptomator/jfuse/commit/890035c702489cdb2e094a1fc767e75b15b04891
>>>>>> <https://urldefense.com/v3/__https://github.com/cryptomator/jfuse/commit/890035c702489cdb2e094a1fc767e75b15b04891__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe92yJr55Hg$>
>>>>>>
>>>>>> I was testing the `read` method in this class. `write` (with its
>>>>>> `.asReadOnlyBuffer()`) did not draw my attention.
>>>>>>
>>>>>>> On 25. Oct 2022, at 15:03, Maurizio Cimadamore
>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>
>>>>>>> No worries.
>>>>>>>
>>>>>>> I guess having a look at jfr file would be useful. But note that
>>>>>>> I don't think you can send that to the mailing list - AFAIK,
>>>>>>> attachments are dropped.
>>>>>>>
>>>>>>> Also, forgot to ask - is this something for which you saw a
>>>>>>> difference between 18 and 19? Because that would be odd too.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 25/10/2022 13:26, Sebastian Stenzel wrote:
>>>>>>>> I can't reproduce it any longer. 😖
>>>>>>>>
>>>>>>>> I swear I would not have consulted the mailing list, if the
>>>>>>>> observation had not been that spooky. Especially regarding said
>>>>>>>> discrepency between `asReadOnlyBuffer()` and `duplicate()`.
>>>>>>>>
>>>>>>>> I still have the .jfr file, though. Is this of any use (other
>>>>>>>> than proving I'm not seeing ghosts)?
>>>>>>>>
>>>>>>>>> On 25. Oct 2022, at 13:15, Maurizio Cimadamore
>>>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 25/10/2022 12:04, Maurizio Cimadamore wrote:
>>>>>>>>>> ```
>>>>>>>>>> public MappedByteBuffer duplicate() {
>>>>>>>>>> return new DirectByteBuffer(this,
>>>>>>>>>> this.markValue(),
>>>>>>>>>> this.position(),
>>>>>>>>>> this.limit(),
>>>>>>>>>> this.capacity(),
>>>>>>>>>> 0,
>>>>>>>>>>
>>>>>>>>>> fileDescriptor(),
>>>>>>>>>> isSync(),
>>>>>>>>>>
>>>>>>>>>> segment);
>>>>>>>>>> }
>>>>>>>>>> ```
>>>>>>>>> Btw, the implementation of `adReadOnlyBuffer` is virtually
>>>>>>>>> identical to the above - the only difference is that
>>>>>>>>> DirectBuffer::asReadOnlyBuffer creates a new DIrectBufferR
>>>>>>>>> (where R stands for Read Only), not just another DirectBuffer.
>>>>>>>>> On read-only direct buffers (DirectBufferR), asReadOnly buffer
>>>>>>>>> just calls duplicate().
>>>>>>>>>
>>>>>>>>> So, I think it's very strange that one one of them misbehave,
>>>>>>>>> since the implementation is so close.
>>>>>>>>>
>>>>>>>>> What do you use to monitor the allocations? One theory is that
>>>>>>>>> perhaps one version of the code "escapes" better than the
>>>>>>>>> other, so the numbers you see are skewed, and in some cases
>>>>>>>>> you see the actual BB allocations on the heap, whereas in the
>>>>>>>>> other case you don't see anything because the JVM scalarizes
>>>>>>>>> the BB views. But again, the `duplicate` and
>>>>>>>>> `adreadOnlyBuffer` are so close to each other that it's hard
>>>>>>>>> to imagine meaningful differences in terms of escape analysis
>>>>>>>>> (even though escape analsyis can sometimes act in surprising
>>>>>>>>> ways).
>>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/panama-dev/attachments/20221026/47d5568d/attachment-0001.htm>
More information about the panama-dev
mailing list