RFR: JDK-8312018: Improve reservation of class space and CDS [v4]

Ioi Lam iklam at openjdk.org
Mon Aug 28 16:35:11 UTC 2023


On Mon, 28 Aug 2023 15:40:54 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> I think the API looks OK. I need to spend more time looking at the algorithm itself.
>
>> @iklam is there anything missing from your point of view?
> 
> I just realized this -- for allocations above 32GB, do we need to use the new algorithm on all platforms? As far as I know, only aarch64 and ppc64 need it because they want to use a single "load immediate" instruction.
> 
> For the other CPUs, we can just ask the OS. That will be faster, always succeed, and be at the "right" location as decided by the OS.

> > > @iklam is there anything missing from your point of view?
> > 
> > 
> > I just realized this -- for allocations above 32GB, do we need to use the new algorithm on all platforms? As far as I know, only aarch64 and ppc64 need it because they want to use a single "load immediate" instruction.
> > For the other CPUs, we can just ask the OS. That will be faster, always succeed, and be at the "right" location as decided by the OS.
> 
> The argument for doing it on the remaining platforms (x64 and riscv) would be that those, too, could profit from using 16-bit moves and short immediates, instead of, e.g. on x64, always emitting a giant 8-byte immediate for the add.

Do you mean this?


  0x00007f13e73204e9:   mov    0x8(%rax),%ebx               ;; 1141:   __ load_klass(rbx, rax, rscratch1);
  0x00007f13e73204ec:   movabs $0x800000000,%r10
  0x00007f13e73204f6:   add    %r10,%rbx


I am not familiar with x64 instructions, but I thought a 64-bit immediate move to a register must be 10 bytes (2 bytes of prefix and opcode plus an 8-byte immediate) if the value does not fit in 32 bits. So you can't make the `movabs` instruction any shorter.
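
To make the byte counting concrete, here is a rough sketch (not HotSpot code; the helper name `encode_mov_imm_r10` is made up for illustration) that hand-encodes a move of an immediate into `%r10` in the two forms I know of. It assumes the standard x86-64 encodings: opcode `B8+r` with a REX prefix, followed by a little-endian immediate.

```python
import struct


def encode_mov_imm_r10(value: int) -> bytes:
    """Hand-encode 'mov $value, %r10' (hypothetical helper, illustration only)."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")
    if value < (1 << 32):
        # mov $imm32, %r10d: REX.B (0x41), opcode B8+2, 4-byte immediate.
        # Writing the 32-bit register zero-extends into the full %r10.
        return bytes([0x41, 0xBA]) + struct.pack("<I", value)
    # movabs $imm64, %r10: REX.W+B (0x49), opcode B8+2, 8-byte immediate.
    return bytes([0x49, 0xBA]) + struct.pack("<Q", value)


if __name__ == "__main__":
    for base in (0x80000000, 0x800000000):
        code = encode_mov_imm_r10(base)
        print(f"{base:#x}: {len(code)} bytes: {code.hex(' ')}")
```

For the base in the listing above (`0x800000000`) this gives the 10-byte form, which matches the address difference between the `movabs` and the following `add` (`0x...4f6 - 0x...4ec`). A base below 4GB would get the 6-byte form.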


> And that the code would be better tested, since all platforms run through it.
> 
> OTOH, this could also be done in a follow-up. So, if you prefer it that way, I make that section aarch/ppc only.


For this PR, I would prefer doing it only on aarch64/ppc for allocations above 32GB. Otherwise we would have a regression on the other platforms: there's now a chance of failure, at least theoretically.

The algorithm is still used on all platforms for the lower allocations, right? So we will get some test mileage that way.
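
Roughly, the policy I have in mind could be sketched like this. This is only an illustration of the decision order, not the code in the PR: `probe`, `ask_os`, `high_limit` and `NEEDS_CHEAP_BASE` are all hypothetical names, and falling back to the OS when the high probe fails is my assumption. The `single_movz_encodable` helper shows one reading of "single load immediate" on aarch64: a value that one MOVZ (16-bit immediate shifted by 0, 16, 32 or 48) can materialize.

```python
from typing import Callable, Optional

GB = 1 << 30
# Platforms named in this thread as wanting a cheaply loadable base.
NEEDS_CHEAP_BASE = {"aarch64", "ppc64"}

# Hypothetical callback types, not HotSpot APIs.
Probe = Callable[[int, int, int], Optional[int]]  # (lo, hi, size) -> address or None
AskOS = Callable[[int], int]                      # size -> address picked by the OS


def single_movz_encodable(addr: int) -> bool:
    """True if exactly one 16-bit chunk of the 64-bit value is non-zero."""
    chunks = [(addr >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    return sum(1 for c in chunks if c) == 1


def reserve_class_space(arch: str, size: int, probe: Probe, ask_os: AskOS,
                        high_limit: int = 1 << 47) -> int:
    # All platforms: try the probing algorithm in the low range first.
    addr = probe(0, 32 * GB, size)
    if addr is not None:
        return addr
    # Above 32GB: probe only where the base address encoding matters.
    if arch in NEEDS_CHEAP_BASE:
        addr = probe(32 * GB, high_limit, size)
        if addr is not None:
            return addr
    # Everyone else (and, by assumption, a failed high probe): ask the OS.
    return ask_os(size)
```

With this shape, the other platforms never see a new failure mode above 32GB, while the low-range probing still runs everywhere and gets tested.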

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15041#issuecomment-1695997730


More information about the hotspot-runtime-dev mailing list