memorySegment.asByteBuffer().duplicate() causes multi-MB allocations

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Oct 26 10:17:59 UTC 2022


On 26/10/2022 10:56, Sebastian Stenzel wrote:
> Ok, let's consider this a measurement error for now. I will keep an 
> eye on it and report if I get a reliably reproducible example. So 
> hopefully never 😉

Heh.

The strangest thing is if you noticed a difference between 18 and 19. 
The impl code has not changed at all, so I can't really explain how 19 
would generate a spike in allocations.

Also, as we discussed yesterday, the `duplicate` method is really just 
creating a dumb view over existing memory, so to get to 740Mb only with 
these views you'll need lots of objects :-)

And... I also believe that the memory view you are looking at is just 
looking at "shallow" size of an object. Which means you really have 
740Mb only for the DirectByteBuffer instances, _excluding_ any on-heap 
or off-heap data they might refer to.

This seems...... excessive.

But, I can see the stack trace now, and it does point at 
DirectByteBuffer::duplicate - which makes sense: this method *is* 
creating a new view. But, JMC says there's 72 calls to `duplicate`, and 
740MB of memory. 10Mb per direct buffer doesn't make any sense 
(considering this is shallow).

Another strange thing is that if you click on DirectByteBuffer, in 
"before" we see a massive spike at the beginning, then nothing. In 
"after" we see lots of allocation spread throughout the applications. 
These allocations come from MemorySegment::asByteBuffer, so, a different 
place.

My best explanation is that this method is invoked a lot, and all the 
buffer views created are short-lived, so they do not impact GC activity 
in any significant way. After a while, better inlining kicks in, and 
allocations are removed entirely, as DirectByteBuffer::duplicate is 
inlined and its result likely scalarized.

All this seems very very strange - almost like you are comparing 
different runs doing different things. Or, perhaps, some measurement error.

Maurizio


>
>> On 26. Oct 2022, at 11:51, Maurizio Cimadamore 
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Ok, my JMC was out of date. Now I see what you see :-)
>>
>> It is indeed odd - and also I note that all the allocations occur at 
>> the start of the recording. I wonder if maybe these are just direct 
>> buffer instances that have been created by some other bits of the JDK 
>> (several buffers are created at initialization time).
>>
>> Without the ability to look at where the allocations came from it is 
>> a bit hard to guess. The fact that you are having troubles 
>> reproducing it makes me think that you got lucky/unlucky and the 
>> profiler happened to stumble on some burst direct buffer allocation, 
>> presumably at JVM startup.
>>
>> Maybe a good experiment would be to lower the polling granularity, 
>> and see if you can spot the allocation on all the runs?
>>
>> Maurizio
>>
>>
>> On 26/10/2022 10:18, Maurizio Cimadamore wrote:
>>>
>>> Uhm I don't have all that data that you seem to have in your image :-(
>>>
>>> My memory view looks empty. There is no allocation, and if I 
>>> explicitly go and look I see 6-7 allocations in total :-)
>>>
>>> Is the recording you shared the same you are looking at?
>>>
>>> Maurizio
>>>
>>> On 26/10/2022 10:11, Sebastian Stenzel wrote:
>>>> Thanks for taking a look.
>>>>
>>>> Please note that I'm not looking for a memory leak here: The 
>>>> objects are short-living (within the scope of a method) and no two 
>>>> instances exist at the same time, hence no increase in heap size.
>>>>
>>>> Instead, what got me wondering was the mismatch between the amount 
>>>> of memory allocated for a rather small number of objects (740 MiB 
>>>> for 141 DirectByteBuffers). At least the record only contains these 
>>>> 72 allocations via `ByteBuffer.duplicate()` + 69 
>>>> `NativeMemorySegmentImpl.makeByteBuffer()`. I'm looking at this 
>>>> chart: https://static.cryptomator.org/tmp/before890035c.png 
>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/before890035c.png__;!!ACWV5N9M2RV99hQ!JckaBSJm5So1bCqFKc-DKIJwby_3IL0NrwJF28uq06RV8KF5cODe9zClnOFenKodB7r-jU1KwoEwrH3lNJ9ee622dbMqw0eIuQ$>
>>>>
>>>> I have very limited experience with JFR and JMC and don't know if 
>>>> it is profiling of sampling allocations. But maybe the events are 
>>>> incomplete and I'm in fact dealing with thousands of instances?
>>>>
>>>>
>>>>> On 26. Oct 2022, at 10:53, Maurizio Cimadamore 
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>> Looking at this recordings with JMC is inconclusive.
>>>>>
>>>>> All I see is that your app seems to be using up to 20MB heap, 
>>>>> which seems rather low.
>>>>>
>>>>> The number of live objects, at least according to the tool is a 
>>>>> handful (< 10).
>>>>>
>>>>> I don't see anything here indicating a problem? (but maybe I'm 
>>>>> looking in the wrong place?)
>>>>>
>>>>> Maurizio
>>>>>
>>>>> On 25/10/2022 15:15, Sebastian Stenzel wrote:
>>>>>> Ok so here are the recordings:
>>>>>>
>>>>>> with duplicate(): 
>>>>>> https://static.cryptomator.org/tmp/before890035c.jfr 
>>>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/before890035c.jfr__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe93lmH_Uzw$>
>>>>>> without: https://static.cryptomator.org/tmp/after890035c.jfr 
>>>>>> <https://urldefense.com/v3/__https://static.cryptomator.org/tmp/after890035c.jfr__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe93JaaOc7g$>
>>>>>>
>>>>>> This is the diff in code: 
>>>>>> https://github.com/cryptomator/jfuse/commit/890035c702489cdb2e094a1fc767e75b15b04891 
>>>>>> <https://urldefense.com/v3/__https://github.com/cryptomator/jfuse/commit/890035c702489cdb2e094a1fc767e75b15b04891__;!!ACWV5N9M2RV99hQ!JDzHI5oiHSZJaLRKXbpqU6kqSKHSD0yZEIW8lQT4wRJq0M-_iS6Kz59xcSpcX_sMupZlUpzmv9uLpTNulFGUEIRVe92yJr55Hg$>
>>>>>>
>>>>>> I was testing the `read` method in this class. `write` (with its 
>>>>>> `.asReadOnlyBuffer()`) did not draw my attention.
>>>>>>
>>>>>>> On 25. Oct 2022, at 15:03, Maurizio Cimadamore 
>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>
>>>>>>> No worries.
>>>>>>>
>>>>>>> I guess having a look at jfr file would be useful. But note that 
>>>>>>> I don't think you can send that to the mailing list - AFAIK, 
>>>>>>> attachments are dropped.
>>>>>>>
>>>>>>> Also, forgot to ask - is this something for which you saw a 
>>>>>>> difference between 18 and 19? Because that would be odd too.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 25/10/2022 13:26, Sebastian Stenzel wrote:
>>>>>>>> I can't reproduce it any longer. 😖
>>>>>>>>
>>>>>>>> I swear I would not have consulted the mailing list, if the 
>>>>>>>> observation had not been that spooky. Especially regarding said 
>>>>>>>> discrepency between `asReadOnlyBuffer()` and `duplicate()`.
>>>>>>>>
>>>>>>>> I still have the .jfr file, though. Is this of any use (other 
>>>>>>>> than proving I'm not seeing ghosts)?
>>>>>>>>
>>>>>>>>> On 25. Oct 2022, at 13:15, Maurizio Cimadamore 
>>>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 25/10/2022 12:04, Maurizio Cimadamore wrote:
>>>>>>>>>> ```
>>>>>>>>>> public MappedByteBuffer duplicate() {
>>>>>>>>>>         return new DirectByteBuffer(this,
>>>>>>>>>>                                               this.markValue(),
>>>>>>>>>>                                               this.position(),
>>>>>>>>>>                                               this.limit(),
>>>>>>>>>>                                               this.capacity(),
>>>>>>>>>>                                               0,
>>>>>>>>>>
>>>>>>>>>>                                               fileDescriptor(),
>>>>>>>>>>                                               isSync(),
>>>>>>>>>>
>>>>>>>>>>                                               segment);
>>>>>>>>>>     }
>>>>>>>>>> ```
>>>>>>>>> Btw, the implementation of `adReadOnlyBuffer` is virtually 
>>>>>>>>> identical to the above - the only difference is that 
>>>>>>>>> DirectBuffer::asReadOnlyBuffer creates a new DIrectBufferR 
>>>>>>>>> (where R stands for Read Only), not just another DirectBuffer. 
>>>>>>>>> On read-only direct buffers (DirectBufferR), asReadOnly buffer 
>>>>>>>>> just calls duplicate().
>>>>>>>>>
>>>>>>>>> So, I think it's very strange that one one of them misbehave, 
>>>>>>>>> since the implementation is so close.
>>>>>>>>>
>>>>>>>>> What do you use to monitor the allocations? One theory is that 
>>>>>>>>> perhaps one version of the code "escapes" better than the 
>>>>>>>>> other, so the numbers you see are skewed, and in some cases 
>>>>>>>>> you see the actual BB allocations on the heap, whereas in the 
>>>>>>>>> other case you don't see anything because the JVM scalarizes 
>>>>>>>>> the BB views. But again, the `duplicate` and 
>>>>>>>>> `adreadOnlyBuffer` are so close to each other that it's hard 
>>>>>>>>> to imagine meaningful differences in terms of escape analysis 
>>>>>>>>> (even though escape analsyis can sometimes act in surprising 
>>>>>>>>> ways).
>>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/panama-dev/attachments/20221026/47d5568d/attachment-0001.htm>


More information about the panama-dev mailing list