RFR (XS) JDK-8006280 - structure packing for 64-bit (resend)
Ioi Lam
ioi.lam at oracle.com
Wed Jan 23 21:42:19 PST 2013
On 01/23/2013 08:21 PM, David Holmes wrote:
> Hi Ioi,
>
> On 24/01/2013 1:07 PM, Ioi Lam wrote:
>> (re-sending with public-accessible bug URL).
>>
>> WebRev:
>> http://cr.openjdk.java.net/~iklam/8006280/reorder_metadata_001/
>>
>> Bug: Need to reorder metadata structures to reduce size (64-bit)
>> http://bugs.sun.com/view_bug.do?bug_id=8006280
>>
>> Sponsor: Jiangli
>>
>> Summary:
>>
>> On x64:
>>
>> + Klass: 8 byte reduction
>
> I assume this is only in product mode? And we gain an extra 4 bytes as
> there is no need to pad out to 64-bit alignment.
>
>> + InstanceKlass: 8 byte reduction (in addition to reduction in Klass)
>
> Ditto re padding.
>
>> + Method: 8 byte reduction
>
> Ditto re padding.
>
Hi David,
I measured only in product mode. In debug builds there are some
additional fields mixed in the middle of the structures, so this patch
may or may not help, but I guess that doesn't matter.
In the case of Klass and Method, there were two holes like this:

    u4 XXX;  // followed by hole
    u8 YYY;
    u4 ZZZ;  // followed by hole
    u8 WWW;

so moving the two u4s together eliminates the two holes, reducing the
size by 8 bytes.
In the case of InstanceKlass, it was as you said: I moved one u4 to
fill a hole, which also eliminated 4 bytes of padding at the end of the
structure as a side effect.
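
To make the effect concrete, here is a minimal sketch of the same
pattern. The struct and field names are hypothetical stand-ins, not the
real HotSpot fields; only the u4/u8 interleaving mirrors what I described
above. On targets where a u8 is 8-byte aligned (x64, ARM EABI) I would
expect bytes_saved() to be 8; on a 32-bit ABI that only aligns u8 to 4
bytes it should be 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the real fields; only the u4/u8
// interleaving mirrors the description above.
struct WithHoles {
    std::uint32_t xxx;  // followed by a 4-byte hole where u8 is 8-byte aligned
    std::uint64_t yyy;
    std::uint32_t zzz;  // followed by a second 4-byte hole
    std::uint64_t www;
};

struct Repacked {
    std::uint32_t xxx;
    std::uint32_t zzz;  // sits where the first hole used to be
    std::uint64_t yyy;
    std::uint64_t www;
};

// Bytes recovered by the reordering on the current target.
constexpr std::size_t bytes_saved() {
    return sizeof(WithHoles) - sizeof(Repacked);
}

// Reordering can only help: the repacked layout is never larger.
static_assert(sizeof(Repacked) <= sizeof(WithHoles),
              "reordering must not grow the struct");

// Runtime check that the two u4s are adjacent and the u8s follow directly.
inline void check_repacked() {
    assert(offsetof(Repacked, zzz) == 4);
    assert(offsetof(Repacked, yyy) == 8);
}
```

Calling bytes_saved() in a build for each target is a quick way to see
whether a given reordering pays off there.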
>> This is verified by using the "pahole" script (as described in bug
>> report)
>>
>> On ARM-EABI:
>>
>> I also manually verified this change on ARM-EABI (which requires
>> 8-byte alignment for jlong but not pointers):
>
> 32-bit ARM requires 4 byte pointer alignment. I'm not sure what point
> you are making here. Is gcc performing automatic 8-byte alignment for
> "jlong" class members? And is this different to other platforms?
>
ARM EABI ("Embedded ABI") requires 64-bit fields (jlong and double) to
be aligned on 8-byte boundaries, so that they can be loaded using the
LDRD (load double word) instruction. This makes it possible to fetch the
jlong into two 32-bit registers with a single instruction. GCC targeting
ARM EABI inserts the padding automatically.
All modern ARM platforms that we care about use EABI.
I am not sure if other 32-bit platforms have similar alignment
requirements. (Perhaps doubles on 32-bit SPARC?)
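
One way to answer that question for any given compiler and target,
without going through gdb as in the session quoted below, is to ask the
compiler directly with offsetof. This is only a sketch: Probe is a
hypothetical type, and the fixed-width integers are stand-ins for juint
and jlong.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical probe type, not a HotSpot structure.
struct Probe {
    std::uint32_t count;  // stand-in for a juint
    std::int64_t  stamp;  // stand-in for a jlong
};

// Offset the compiler chose for the 64-bit field on this target.
constexpr std::size_t stamp_offset() {
    return offsetof(Probe, stamp);
}

// Size of the hole (if any) left between the two fields:
// 4 if the ABI aligns 64-bit fields to 8 bytes, 0 if it aligns them to 4.
constexpr std::size_t hole_before_stamp() {
    return stamp_offset() - sizeof(std::uint32_t);
}
```

A result of 4 from hole_before_stamp() means the target behaves like ARM
EABI and x64 in this respect; 0 means 64-bit fields are only 4-byte
aligned inside structures.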
>> + Klass: 8 byte reduction
>> + InstanceKlass: no change (apart from Klass reduction) -- no
>> jlong fields
>
> What changes that you made to instanceKlass relate to jlong fields ??
>
In InstanceKlass, I didn't make changes related to jlongs. I meant to
say: since InstanceKlass doesn't have any 8-byte fields on ARM EABI, my
patch will not have any impact on its size.
>> + Method: no change -- no jlong fields
>
> Only because TIERED is not enabled?
>
Yes, I only looked at the embedded ARM HotSpot build, which has TIERED
disabled. From a quick look, it seems that if TIERED were enabled, the
patch would probably save 8 bytes.
> Are the C++ compilers actually required to layout class instances in
> declaration order? I thought they might optimize some of this
> themselves :(
>
That seems to be the case, but then I am not a C++ lawyer:
http://stackoverflow.com/questions/289559/class-layout-in-c-why-are-members-sometimes-ordered.
What I did in this patch just happens to work (for now). Once people
start adding new fields, new holes will appear.
With the different layout policies -- "usual 32-bit", "ARM EABI",
"64-bit", it's hard to know what happens when you randomly insert a
field (which is what people seem to be doing ...)
I think our best hope is to manually order all the fields in C++ by
ascending size, just like in Java. Pointers should be placed between the
4-byte and 8-byte numeric fields, like this:

    class ProperlyLaidOutKlass {
      u2 a;
      u2 b;
      u4 c;
      u4 d;
      u4 e;
      void * f;
      void * g;
      u8 h;
    };

That should minimize the number of holes on all platforms.
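
As a rough check of that claim, the sketch below rebuilds the example
with fixed-width types (my assumption for what u2/u4/u8 map to) and
compares sizeof against the sum of the member sizes. The helper names
payload_bytes and padding_bytes are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same field order as the example above, using fixed-width stand-ins.
struct ProperlyLaidOut {
    std::uint16_t a;
    std::uint16_t b;
    std::uint32_t c;
    std::uint32_t d;
    std::uint32_t e;
    void*         f;
    void*         g;
    std::uint64_t h;
};

// Sum of the sizes of all members, ignoring any padding.
constexpr std::size_t payload_bytes() {
    return 2 * sizeof(std::uint16_t) + 3 * sizeof(std::uint32_t)
         + 2 * sizeof(void*) + sizeof(std::uint64_t);
}

// Bytes the compiler added as holes or tail padding on this target.
constexpr std::size_t padding_bytes() {
    return sizeof(ProperlyLaidOut) - payload_bytes();
}
```

For this particular field mix I would expect padding_bytes() to be 0
with both 4-byte and 8-byte pointers. The ordering is not a guarantee in
general: an odd number of u2 fields, for instance, would still leave a
2-byte hole before the first u4.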
Thanks
- Ioi
> Thanks,
> David
>
>
>> Before: (linux_armvfp_2.6-productEmb/jre/lib/arm/minimal/libjvm.diz)
>>
>> (gdb) p&((Klass*)0)->_alloc_count
>> $1 = (juint *) 0x58
>> (gdb) p&((Klass*)0)->_last_biased_lock_bulk_revocation_time
>> $2 = (jlong *) 0x60
>> (gdb) p sizeof(Klass)
>> $3 = 120
>>
>> The "hole" was at address 0x5c
>>
>> After:
>>
>> (gdb) p&((Klass*)0)->_access_flags
>> $7 = (AccessFlags *) 0x54
>> (gdb) p&((Klass*)0)->_last_biased_lock_bulk_revocation_time
>> $8 = (jlong *) 0x58
>> (gdb) p sizeof(Klass)
>> $6 = 112
>>
>> Tests run:
>>
>> + JPRT -- (hotspot only, on hotspotwest queue, including embedded
>> builds)
>>
>> + UTE/vm.quick.testlist, 100% passed
>>
>>
>> Thanks
>> Ioi
>>