RFR Bug-pending: Enable Hotspot to Track Native Memory Usage for Direct Byte Buffers
Hi All,
Native memory allocation for DBBs is tracked in java.nio.Bits, but that only includes what the user thinks they are allocating.
When the VM adds extra memory to the allocation amount, this extra bit is not represented in the Bits total. A cursory glance shows, at minimum, that we round the requested memory quantity up to the heap word size in the Unsafe.allocateMemory code, and something to do with nmt_header_size in os::malloc() (os.cpp) too.
On its own, and in small quantities, align_up(sz, HeapWordSize) isn't that big of an issue. But when you allocate a lot of DBBs, and coupled with the nmt_header_size business, it makes the Bits values wrong. The more DBB allocations, the more inaccurate those numbers will be.
To get the "+X", it seems to me that the best option would be to introduce a native method in Bits that fetches "X" directly from Hotspot, using the same code that Hotspot uses (so we'd have to abstract out the Hotspot logic that adds X to the memory quantity). This way, anyone modifying the Hotspot logic won't risk rendering the Bits logic wrong again.
That's only one way to fix the accuracy problem here though. Suggestions welcome.
Best Regards
Adam Farley
Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
On Tue, Jun 5, 2018 at 3:46 PM, Adam Farley8 <adam.farley@uk.ibm.com> wrote:
Hi All,
Native memory allocation for DBBs is tracked in java.nio.Bits, but that only includes what the user thinks they are allocating.
Which is exactly what I would expect as a user...
When the VM adds extra memory to the allocation amount this extra bit is not represented in the Bits total. A cursory glance shows, minimum, that we round the requested memory quantity up to the heap word size in the Unsafe.allocateMemory code
which I do not understand either - why do we do this? After all, normal allocations from inside hotspot do not get aligned up in size, and the Javadoc for Unsafe.allocateMemory does not state anything about the size being aligned.
In addition to questioning the align-up of the user-requested size, I would be in favor of adding a new NMT tag for these, maybe "mtUnsafe"? That would be an easy fix.
, and something to do with nmt_header_size in os::malloc() (os.cpp) too.
That is mighty unspecific and also wrong. The align-up mentioned above goes into the size reported by Bits; the nmt header size does not.
On its own, and in small quantities, align_up(sz, HeapWordSize) isn't that big of an issue. But when you allocate a lot of DBBs, and coupled with the nmt_header_size business, it makes the Bits values wrong. The more DBB allocations, the more inaccurate those numbers will be.
To be annoyingly precise, it will never be more wrong than 1:7 on 64bit machines :) - and only if all memory requested via Unsafe.allocateMemory were 1 byte in size.
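For illustration, the 1:7 worst case can be worked through with a small Java sketch. It assumes an 8-byte word size, as on a 64-bit VM, and covers only the rounding step, not NMT headers. The class and method names (`AlignUpSketch`, `alignUp`, `padding`) are made up for this example and are not HotSpot code.

```java
// Illustrative only: mirrors the rounding described in the thread, not HotSpot source.
final class AlignUpSketch {
    // Assumed word size for a 64-bit VM.
    static final long WORD_SIZE = 8;

    // Round size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // Bytes added by the rounding for a single request.
    static long padding(long requested) {
        return alignUp(requested, WORD_SIZE) - requested;
    }

    public static void main(String[] args) {
        for (long requested : new long[] {1, 7, 8, 9, 4096}) {
            System.out.println(requested + " -> " + alignUp(requested, WORD_SIZE)
                    + " (padding " + padding(requested) + ")");
        }
    }
}
```

A 1-byte request is the worst case: 1 byte requested, 7 bytes of padding. Requests that are already a multiple of the word size get no padding at all, so the relative error shrinks quickly as buffers grow.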
To get the "+X", it seems to me that the best option would be to introduce a native method in Bits that fetches "X" directly from Hotspot, using the same code that Hotspot uses (so we'd have to abstract out the Hotspot logic that adds X to the memory quantity). This way, anyone modifying the Hotspot logic won't risk rendering the Bits logic wrong again.
I don't follow that.
That's only one way to fix the accuracy problem here though. Suggestions welcome.
You are throwing two effects together:
- As mentioned above, I consider the align-up of the user requested size to be at least questionable. It shows up as user size in NMT which should not be. I also fail to see a compelling reason for it, but maybe someone else can enlighten me.
- But anything else - NMT headers, overwriter guards, etc added by the VM I consider in the same class as any other overhead incurred e.g. by the CRT or the OS when calling malloc (e.g. malloc allocator bucket size). Basically, rss will go up by more than size requested by malloc. Something maybe worth noting, but IMHO not as part of the numbers returned by java.nio.Bits.
Just my 2 cents.
Best Regards, Thomas
Best Regards
Adam Farley
On 06/05/2018 12:10 PM, Thomas Stüfe wrote:
On Tue, Jun 5, 2018 at 3:46 PM, Adam Farley8 <adam.farley@uk.ibm.com> wrote:
Hi All,
Native memory allocation for DBBs is tracked in java.nio.Bits, but that only includes what the user thinks they are allocating.
Which is exactly what I would expect as a user...
I agree with Thomas: there is no point in a user being aware of tracking overhead, and that overhead is only incurred when native memory tracking is on. As a matter of fact, it could really confuse users if the values varied depending on whether native memory tracking is on.
Thanks,
-Zhengyu
Hi Folks,
Zhengyu Gu <zgu@redhat.com> wrote on 06/06/2018 01:58:18:
From: Zhengyu Gu <zgu@redhat.com>
To: "Thomas Stüfe" <thomas.stuefe@gmail.com>, Adam Farley8 <adam.farley@uk.ibm.com>
Cc: "hotspot-dev@openjdk.java.net developers" <hotspot-dev@openjdk.java.net>, core-libs-dev <core-libs-dev@openjdk.java.net>
Date: 06/06/2018 01:58
Subject: Re: RFR Bug-pending: Enable Hotspot to Track Native Memory Usage for Direct Byte Buffers
On 06/05/2018 12:10 PM, Thomas Stüfe wrote:
On Tue, Jun 5, 2018 at 3:46 PM, Adam Farley8 <adam.farley@uk.ibm.com> wrote:
Hi All,
Native memory allocation for DBBs is tracked in java.nio.Bits, but that only includes what the user thinks they are allocating.
Which is exactly what I would expect as a user...
Someone debugging a leak by poring over the values in a core file may prefer total accuracy in at least one of the variables, so that they can find out where all their memory went. That holds even if the variable is not accessible via a getter method and is only readable from a system core.
I agree with Thomas: there is no point in a user being aware of tracking overhead, and that overhead is only incurred when native memory tracking is on. As a matter of fact, it could really confuse users if the values varied depending on whether native memory tracking is on.
Thanks,
-Zhengyu
I agree that the casual user shouldn't have to worry. This accuracy would be for analysis after the fact, via system cores. The standard variables can stay as they are. I suggest the addition of a single AtomicLong that shows an accurate value for the sole purpose of aiding debugging.
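A minimal sketch of what such a debug-only counter might look like, in plain Java. This is not the real java.nio.Bits code: the class `DirectMemoryCountersSketch`, its field names, and the idea that the caller supplies the actual size are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the bookkeeping discussed in the thread; not java.nio.Bits.
final class DirectMemoryCountersSketch {
    // What the user asked for; plays the role of the existing, unchanged totals.
    static final AtomicLong requestedBytes = new AtomicLong();

    // Debug-only total that also includes VM-side rounding.
    // Intended to be read from a core dump, so no public getter is provided.
    static final AtomicLong debugActualBytes = new AtomicLong();

    // Record a new allocation: 'requested' is the user's size, 'actual' what was really reserved.
    static void reserve(long requested, long actual) {
        requestedBytes.addAndGet(requested);
        debugActualBytes.addAndGet(actual);
    }

    // Record that an allocation was released, using the same pair of sizes.
    static void unreserve(long requested, long actual) {
        requestedBytes.addAndGet(-requested);
        debugActualBytes.addAndGet(-actual);
    }
}
```

The existing values reported to users stay exactly as they are; only the extra counter carries the rounded-up figure.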
When the VM adds extra memory to the allocation amount this extra bit is not represented in the Bits total. A cursory glance shows, minimum, that we round the requested memory quantity up to the heap word size in the Unsafe.allocateMemory code
which I do not understand either - why do we do this? After all, normal allocations from inside hotspot do not get aligned up in size, and the Javadoc for Unsafe.allocateMemory does not state anything about the size being aligned.
In addition to questioning the align up of the user requested size, I would be in favor of adding a new NMT tag for these, maybe "mtUnsafe"? That would be an easy fix.
, and something to do with nmt_header_size in os::malloc() (os.cpp) too.
That is mighty unspecific and also wrong. The align-up mentioned above goes into the size reported by Bits; the nmt header size does not.
I believe we agree here too. My point is that, for the sake of accuracy, we *should* have this information in Bits. This is part of the debugger-aid change that I am suggesting.
On its own, and in small quantities, align_up(sz, HeapWordSize) isn't that big of an issue. But when you allocate a lot of DBBs, and coupled with the nmt_header_size business, it makes the Bits values wrong. The more DBB allocations, the more inaccurate those numbers will be.
To be annoyingly precise, it will never be more wrong than 1:7 on 64bit machines :) - and only if all memory requested via Unsafe.allocateMemory were 1 byte in size.
Sounds like the sort of thing I'd do. Once a stress tester, always a stress tester. :)
To get the "+X", it seems to me that the best option would be to introduce a native method in Bits that fetches "X" directly from Hotspot, using the same code that Hotspot uses (so we'd have to abstract out the Hotspot logic that adds X to the memory quantity). This way, anyone modifying the Hotspot logic won't risk rendering the Bits logic wrong again.
I don't follow that.
I was trying to describe one method to enable the VM to tell Bits how much memory will actually be reserved for a given amount of DBB. E.g. Bits says it has a DBB 7 bytes in size, and it tells the VM. The VM replies with "OK, if you came to me and asked for 7 bytes, I'd reserve 8.", and then Bits can update that debugging variable I mentioned. If we abstract out the logic, then Bits and the VM would be using the exact same code when telling Bits how much memory will *hypothetically* be added, as we do when the VM is determining how much overhead it needs when actually reserving the memory. Is that clearer?
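The "abstract out the logic" idea can be sketched as follows. In the actual proposal the sizing logic would live in Hotspot and be reached through a native method; here a plain Java method stands in for it, and a byte array stands in for the native allocation. `ReservationSizingSketch`, `sizeToReserve` and `allocate` are hypothetical names, and the 8-byte word size is an assumption for a 64-bit VM.

```java
// Hypothetical illustration of one shared sizing function; not HotSpot or java.nio.Bits code.
final class ReservationSizingSketch {
    // Assumed word size for a 64-bit VM.
    static final long WORD_SIZE = 8;

    // Single source of truth: how many bytes would be reserved for a request of this size.
    // Answers the "if you asked me for 7 bytes, I'd reserve 8" question without allocating.
    static long sizeToReserve(long requested) {
        return (requested + WORD_SIZE - 1) & ~(WORD_SIZE - 1);
    }

    // Stand-in for the real allocation path: it sizes the block with the same function,
    // so the hypothetical answer and the actual reservation cannot drift apart.
    static byte[] allocate(long requested) {
        return new byte[Math.toIntExact(sizeToReserve(requested))];
    }

    public static void main(String[] args) {
        long requested = 7;
        System.out.println("would reserve: " + sizeToReserve(requested));
        System.out.println("did reserve:   " + allocate(requested).length);
    }
}
```

Because both paths call `sizeToReserve`, a later change to the rounding rule would update the reported figure and the real reservation together, which is the property the proposal is after.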
That's only one way to fix the accuracy problem here though. Suggestions welcome.
You are throwing two effects together:
- As mentioned above, I consider the align-up of the user requested size to be at least questionable. It shows up as user size in NMT which should not be. I also fail to see a compelling reason for it, but maybe someone else can enlighten me.
Well, if we got rid of it then that's one way to make the Bits variables accurate. :)
- But anything else - NMT headers, overwriter guards, etc added by the VM I consider in the same class as any other overhead incurred e.g. by the CRT or the OS when calling malloc (e.g. malloc allocator bucket size). Basically, rss will go up by more than size requested by malloc. Something maybe worth noting, but IMHO not as part of the numbers returned by java.nio.Bits.
We agree again. No need to confuse things by altering the return values. Simply store the accurate information internally as a debugging aid.
Just my 2 cents.
And they are appreciated. Apologies for the delay in my response. - Adam
Best Regards, Thomas
Best Regards
Adam Farley