Hi All,

I am currently working on JDK-8216554, which is about optimising the loading of constants in C2 compilation by determining the size of the TOC in the post_alloc stage. This is for PowerPC.

Currently the default code emits an addis+ld instruction pair even in cases where a single ld would suffice. There is an alternate code path, for a small TOC, that emits only ld, but we never take it.

I tried debugging with gdb, but I am unable to get the constant table size at the post_alloc stage. The size of the TOC is available only after the emit stage.

Does that mean we can never get an accurate size in the post_alloc stage?

I also tried a test case in which I forced the C2 compiler to emit ld only.

Interestingly, the global TOC size was large and was calculated much later, and the macro assembler put in addis+addi.

So the large-offset case, even if miscalculated in the post_alloc stage, is handled in some way.

Does that mean we can just keep the conservative path of emitting addis+addi? Or maybe we can even remove the helper function in the post_alloc stage.

I am open to your suggestions, and can provide a few debug logs on request.

Thanks,
Suchismith Roy
If I understand what you are trying to do, then I suggest delaying the decision until the emit stage, when the offset is known. So instead of deciding early in loadConLNodesTuple_create, have it create a "smart" loadConLNode that can emit either addis+ld or ld, depending on the offset.

dl

On 2/13/26 4:52 AM, Suchismith Roy wrote:
> I am currently working on JDK-8216554 [...]
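The suggestion of deferring the choice to the emit stage can be sketched with a toy model. This is only an illustration under stated assumptions: SmartLoadConst, fits_simm16 and the mnemonic strings are hypothetical stand-ins, not HotSpot code, and the nop padding in the short form is an assumption that both forms should occupy the same number of instruction slots.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: true if v fits a signed 16-bit immediate.
bool fits_simm16(int32_t v) { return v >= -32768 && v <= 32767; }

// Toy "smart" constant-load node. Nothing is decided when the node is
// created; the instruction form is picked only when emit() is called.
struct SmartLoadConst {
    // toc_offset is the constant's offset from the TOC base register,
    // which in this model is known only at emit time.
    std::vector<std::string> emit(int32_t toc_offset) const {
        // addis+ld cannot reach beyond this offset (hi part would overflow).
        assert(toc_offset <= 0x7FFF7FFF);
        if (fits_simm16(toc_offset)) {
            // Short form: one load, padded so both forms have two slots.
            return {"ld", "nop"};
        }
        // Long form: add the high part first, then load with the low part.
        return {"addis", "ld"};
    }
};
```

With this shape the post_alloc stage never has to guess the TOC size, because the same node covers both the small-offset and the large-offset case.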
Hi Dean,

Thanks for your suggestions.

I created the smart loadConLNode, which decides on the set of instructions in the emit stage.

During the emit stage I first used the global TOC offset to decide on the set of instructions, because the code in the macro assembler does this. However, this caused a bunch of issues for other methods, as they depend on the method TOC and not the global TOC.

So currently, in the emit stage, we decide based on the method TOC offset.

((loadConLNode*)this)->_cbuf_insts_offset = __ offset();
int toc_offset = __ offset_to_method_toc(const_toc_addr);
if (Assembler::is_simm(toc_offset, 16)) {
  __ ld($dst$$Register, toc_offset, $toc$$Register);
  __ nop();
} else {
  __ addis($dst$$Register, $toc$$Register, MacroAssembler::largeoffset_si16_si16_hi(toc_offset));
  __ ld($dst$$Register, MacroAssembler::largeoffset_si16_si16_lo(toc_offset), $dst$$Register);
}

Is this a valid thing to do? The reason I ask is that the offsets are usually small, and ultimately the global TOC calculation in the macro assembler overrides it.

Thanks,
Suchismith Roy

From: ppc-aix-port-dev <ppc-aix-port-dev-retn@openjdk.org> on behalf of Dean Long <dean.long@oracle.com>
Date: Saturday, 14 February 2026 at 5:53 AM
To: ppc-aix-port-dev@openjdk.org <ppc-aix-port-dev@openjdk.org>
Subject: [EXTERNAL] Re: JDK-8216554

> If I understand what you are trying to do, then I suggest to delay the decision until the emit stage, when the offset is known. [...]
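The long form in the snippet relies on splitting the offset into two signed 16-bit parts: addis adds the high part shifted left by 16 to the base register, and ld then applies the low part as a sign-extended displacement. Because the low part is sign-extended, the high part has to compensate whenever the low part comes out negative. A minimal sketch of that arithmetic follows; HiLo and split_offset are hypothetical names used only for illustration and are not the implementation of largeoffset_si16_si16_hi or largeoffset_si16_si16_lo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: both parts are signed 16-bit quantities.
struct HiLo {
    int32_t hi;  // immediate for addis (added as hi * 65536)
    int32_t lo;  // sign-extended displacement for the following load
};

// Split off so that hi * 65536 + lo == off, with lo in [-32768, 32767].
HiLo split_offset(int32_t off) {
    // Above this value hi would no longer fit in 16 signed bits.
    assert(off <= 0x7FFF7FFF);
    const int64_t wide = off;
    // Take the low 16 bits and sign-extend them.
    const int64_t lo = ((wide & 0xFFFF) ^ 0x8000) - 0x8000;
    // The remainder is an exact multiple of 65536.
    const int64_t hi = (wide - lo) / 65536;
    return {static_cast<int32_t>(hi), static_cast<int32_t>(lo)};
}
```

For example, an offset of 40000 does not fit in 16 signed bits; it splits into hi = 1 and lo = -25536, since 65536 - 25536 = 40000.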
Yes, that's pretty much the pattern I was suggesting, but I'm not a PPC expert, so I don't understand the difference between a method TOC and a global TOC here, or what the macro assembler is overwriting. But what you have seems consistent with the existing enc_load_long_constL and enc_load_long_constL_hi, so I would expect it to work.

dl

On 2/22/26 11:45 PM, Suchismith Roy wrote:
> I created the smart loadConLNode which decides on set of instruction in the emit stage. [...]
Hello Suchismith,

The question doesn't look related to building the JDK. I think the hotspot-dev mailing list might be a better place to ask this question:
https://mail.openjdk.org/mailman/listinfo/hotspot-dev

-Jaikiran

On 13/02/26 6:22 pm, Suchismith Roy wrote:
> I am currently working on JDK-8216554 [...]
participants (3)
- Dean Long
- Jaikiran Pai
- Suchismith Roy