RFR (XS) JDK-8006280 - structure packing for 64-bit (resend)

Ioi Lam ioi.lam at oracle.com
Wed Jan 23 21:42:19 PST 2013


On 01/23/2013 08:21 PM, David Holmes wrote:
> Hi Ioi,
>
> On 24/01/2013 1:07 PM, Ioi Lam wrote:
>> (re-sending with public-accessible bug URL).
>>
>> WebRev:
>> http://cr.openjdk.java.net/~iklam/8006280/reorder_metadata_001/
>>
>> Bug: Need to reorder metadata structures to reduce size (64-bit)
>>       http://bugs.sun.com/view_bug.do?bug_id=8006280
>>
>> Sponsor: Jiangli
>>
>> Summary:
>>
>> On x64:
>>
>>    + Klass: 8 byte reduction
>
> I assume this is only in product mode? And we gain an extra 4 bytes as 
> there is no need to pad out to 64-bit alignment.
>
>>    + InstanceKlass: 8 byte reduction (in addition to reduction in Klass)
>
> Ditto re padding.
>
>>    + Method: 8 byte reduction
>
> Ditto re padding.
>

Hi David,

I measured only in product mode. In debug builds there are some 
additional fields mixed in the middle of the structures, so this patch 
may or may not help, but I guess that doesn't matter.

In the case of Klass and Method, there were two holes like this:

     u4 XXX; // followed by hole
     u8 YYY;
     u4 ZZZ; // followed by hole
     u8 WWW;

so moving the two u4s together eliminates the two holes, reducing by 8 
bytes.
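To make the arithmetic concrete, here is a minimal sketch of the two layouts. The struct and field names are made up for illustration and are not the real Klass or Method fields; u4 and u8 are assumed to be the usual 32-bit and 64-bit unsigned typedefs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed stand-ins for the HotSpot typedefs u4 and u8.
using u4 = std::uint32_t;
using u8 = std::uint64_t;

// Original pattern: each u4 is directly followed by a u8.
struct Interleaved {
    u4 x;   // a 4-byte hole follows when u8 fields are 8-byte aligned
    u8 y;
    u4 z;   // second 4-byte hole under the same condition
    u8 w;
};

// Reordered pattern: the two u4 fields share one 8-byte slot.
struct Grouped {
    u4 x;
    u4 z;
    u8 y;
    u8 w;
};

// Bytes saved by the reordering on the target being compiled for.
constexpr std::size_t bytes_saved() {
    return sizeof(Interleaved) - sizeof(Grouped);
}

// If the compiler places y at offset 8, u8 fields are 8-byte aligned and
// both holes exist, so the saving is 8 bytes. If y sits at offset 4
// (4-byte alignment for u8 inside structs), there were no holes to remove.
static_assert(bytes_saved() == (offsetof(Interleaved, y) == 8 ? 8u : 0u),
              "unexpected struct layout for this ABI");
```

On a target with 8-byte alignment for u8 fields the two sizes should come out as 32 and 24 bytes, which matches the 8-byte reduction reported for Klass and Method.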

In the case of InstanceKlass, it was as you said: I moved one u4 to 
fill a hole, which also eliminated 4 bytes of padding at the end of the 
structure as a side effect.

>>    This is verified by using the "pahole" script (as described in bug 
>> report)
>>
>> On ARM-EABI:
>>
>>    I also manually verified this change on ARM-EABI (which requires
>>    8-byte alignment for jlong but not pointers):
>
> 32-bit ARM requires 4 byte pointer alignment. I'm not sure what point 
> you are making here. Is gcc performing automatic 8-byte alignment for 
> "jlong" class members? And is this different to other platforms?
>
ARM EABI ("Embedded ABI") requires 64-bit fields (jlong and double) to 
be aligned on 8-byte boundaries, so that they can be loaded using the 
LDRD (load double word) instruction. This makes it possible to fetch the 
jlong into two 32-bit registers with a single instruction. GCC targeting 
ARM EABI inserts the padding automatically.

All modern ARM platforms that we care about use EABI.

I am not sure if other 32-bit platforms have similar alignment 
requirements. (Perhaps doubles on 32-bit SPARC?)
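One way to answer this for any given toolchain is to ask the compiler directly. The following is only a sketch: AlignProbe, field_alignment and jlong_t are names I made up for illustration, and jlong_t is assumed to be a plain 64-bit signed integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed stand-in for HotSpot's jlong.
using jlong_t = std::int64_t;

// A one-byte member followed by a T. The offset of `value` is the
// alignment the compiler actually applies to a T field inside a struct.
template <typename T>
struct AlignProbe {
    char pad;
    T    value;
};

template <typename T>
constexpr std::size_t field_alignment() {
    return offsetof(AlignProbe<T>, value);
}

constexpr std::size_t kJlongAlign   = field_alignment<jlong_t>();
constexpr std::size_t kPointerAlign = field_alignment<void*>();

// True when a lone 4-byte field placed directly before a jlong
// leaves a 4-byte hole on this target.
constexpr bool u4_before_jlong_leaves_hole() {
    return kJlongAlign > 4;
}
```

I would expect kJlongAlign to be 8 and kPointerAlign to be 4 on ARM EABI, and both to be 8 on x64, but the point of the probe is that nobody has to guess for a platform they have not checked.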

>>    + Klass: 8 byte reduction
>>    + InstanceKlass: no change (apart from Klass reduction) -- no 
>> jlong fields
>
> What changes that you made to instanceKlass relate to jlong fields ??
>
In InstanceKlass, I didn't make changes related to jlongs. I meant to 
say: since InstanceKlass doesn't have any 8-byte fields on ARM EABI, my 
patch will not have any impact on its size.
>>    + Method: no change -- no jlong fields
>
> Only because TIERED is not enabled?
>
Yes, I only looked at embedded ARM HotSpot build, which has disabled 
TIERED. I took a quick look and it seemed if TIERED were enabled, the 
patch probably would save 8 bytes.

> Are the C++ compilers actually required to layout class instances in 
> declaration order? I thought they might optimize some of this 
> themselves :(
>
That seems to be the case, but then I am not a C++ lawyer: 
http://stackoverflow.com/questions/289559/class-layout-in-c-why-are-members-sometimes-ordered.

What I did in this patch just happens to work (for now). Once people 
start putting in new fields, new holes will appear.

With the different layout policies -- "usual 32-bit", "ARM EABI", 
"64-bit", it's hard to know what happens when you randomly insert a 
field (which is what people seem to be doing ...)

I think our best hope is to manually order all the fields in C++ by 
ascending size, just like in Java. Pointers should be placed between the 
4-byte and 8-byte numerals, like

class ProperlyLaidOutKlass {
     u2 a;
     u2 b;
     u4 c;
     u4 d;
     u4 e;
     void * f;
     void * g;
     u8  h;
};

That should minimize the number of holes on all platforms.
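A compile-time check along the following lines could catch new holes as soon as someone inserts a field in the wrong place. This is a sketch, not part of the patch: it uses a struct instead of a class so the members are accessible, the u2/u4/u8 typedefs are assumed, and the name padding_bytes is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed stand-ins for the HotSpot typedefs.
using u2 = std::uint16_t;
using u4 = std::uint32_t;
using u8 = std::uint64_t;

// Same field order as the example above: ascending size, with the
// pointers between the 4-byte and 8-byte fields.
struct ProperlyLaidOutKlass {
    u2    a;
    u2    b;
    u4    c;
    u4    d;
    u4    e;
    void* f;
    void* g;
    u8    h;
};

// Total padding = size of the struct minus the sum of its field sizes.
constexpr std::size_t padding_bytes() {
    return sizeof(ProperlyLaidOutKlass)
         - (2 * sizeof(u2) + 3 * sizeof(u4) + 2 * sizeof(void*) + sizeof(u8));
}

// Fails the build if a hole or tail padding appears on the target ABI.
static_assert(padding_bytes() == 0,
              "ProperlyLaidOutKlass contains padding on this target");
```

With 8-byte pointers the fields add up to 40 bytes, and with 4-byte pointers to 32 bytes; in both cases the 8-byte field lands on an 8-byte boundary, so no padding should be needed. The field-size sum has to be kept in step with the struct by hand, which is the main weakness of this approach.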

Thanks
- Ioi

> Thanks,
> David
>
>
>>    Before: (linux_armvfp_2.6-productEmb/jre/lib/arm/minimal/libjvm.diz)
>>
>>      (gdb) p&((Klass*)0)->_alloc_count
>>      $1 = (juint *) 0x58
>>      (gdb) p&((Klass*)0)->_last_biased_lock_bulk_revocation_time
>>      $2 = (jlong *) 0x60
>>      (gdb) p sizeof(Klass)
>>      $3 = 120
>>
>>      The "hole" was at address 0x5c
>>
>>    After:
>>
>>      (gdb) p&((Klass*)0)->_access_flags
>>      $7 = (AccessFlags *) 0x54
>>      (gdb) p&((Klass*)0)->_last_biased_lock_bulk_revocation_time
>>      $8 = (jlong *) 0x58
>>      (gdb) p sizeof(Klass)
>>      $6 = 112
>>
>> Tests run:
>>
>> + JPRT -- (hotspot only, on hotspotwest queue, including embedded 
>> builds)
>>
>> + UTE/vm.quick.testlist, 100% passed
>>
>>
>> Thanks
>> Ioi
>>


