[RFR] (M) 8143608: Don't 64-bit align start of InstanceKlass vtable, itable, and nonstatic_oopmap on 32-bit systems

Mon Dec 7 04:48:57 UTC 2015

On 12/6/15 6:30 PM, Chris Plummer wrote:
> On 12/6/15 2:15 PM, David Holmes wrote:
>> Hi Chris,
>>
>> On 5/12/2015 8:02 AM, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8143608
>>> http://cr.openjdk.java.net/~cjplummer/8143608/webrev.00/webrev.hotspot/
>>>
>>> A bit of background would help. The InstanceKlass object has a 
>>> number of
>>> variable length fields that are laid out after the declared fields. 
>>> When
>>> an InstanceKlass object is allocated, extra memory is allocated for it
>>> to leave room for these fields. The first three of these fields are
>>> vtable, itable, and nonstatic_oopmap. They are all arrays of HeapWord
>>> sized values, which means void* size, which means they only need 32-bit
>>> alignment on 32-bit systems. However, they have always been 64-bit
>>> aligned. This webrev removes the forced 64-bit alignment on 32-bit
>>> systems, saving footprint.
>>
>> Theory sounds fine.
>>
>> Not so sure about the practice though. In instanceKlass.hpp you have 
>> removed a lot of alignment adjustments and I can't be sure they are 
>> all necessarily correct. In particular:
>>
>> 934   static int header_size()            { return 
>> sizeof(InstanceKlass)/HeapWordSize; }
>>
>> seems to me this will truncate the header size if 
>> sizeof(InstanceKlass) is not already 32-bit aligned. Are we assuming 
>> the compiler will always pad things such that this is true?
>>
>> In general I would have expected all the align_object_offset (which 
>> aligns to 64-bit) to be replaced with something that aligns to 
>> sizeof(void*). Even if we can assume a minimum of 32-bit alignment it 
>> isn't clear to me that that always converts to 64-bit alignment on 
>> 64-bit.
>>
> Hi David,
>
> I had looked into pre-existing alignment assumptions and made sure I 
> didn't add to the assumptions. It turns out the old alignment would 
> approach would fail if sizeof(InstanceKlass() is not already aligned 
> to sizeof(void*). Let's look at how things worked before any of my 
> changes. The code used to be:
>
>    static int header_size()            { return 
> align_object_offset(sizeof(InstanceKlass)/HeapWordSize); }
>
> Keep in mind that this is asking for 64-bit alignment on both 32-bit 
> and 64-bit systems. That's what align_object_offset() does, although 
> it's not really obvious from the name. So first a 32 bit example. 
> Let's assume sizeof(InstanceKlass) is 10 (not 32-bit aligned), so / by 
> HeapWordSize and you get 2.
>
>     align_object_offset(2) ==
>     align_size_up_(2, HeapWordsPerLong) ==
>     align_size_up(2, 2) ==
>     (2 + 1) & ~1 = 2
>
> This is the wrong answer. We would want an offset of 10 to align to 
> 64-bit align to 16, which is 4 words, not 2. Now see the difference if 
> sizeof(InstanceKlass) is already 32-bit aligned. Assume 
> sizeof(InstanceKlass) is 12:
>
>     align_object_offset(3) ==
>     align_size_up_(3, HeapWordsPerLong) ==
>     align_size_up(3, 2) ==
>     (3+ 1) & ~1 = 4
>

> Now we get the right answer. Now for 64-bit systems, using the 
> sizeof(InstanceKlass) == 10:
Hi David,

Below isn't correct. I was using 32-bit words, including for 
HeapWordSize.  I've corrected below for 64-bit words, but the results 
are still the same. The existing code is broken if sizeof(InstanceKlass) 
is not 64-bit aligned.
>
>     align_object_offset(2) ==
>     align_size_up_(2, HeapWordsPerLong) ==
>     align_size_up(2, 1) ==
>     (2 + 0) & ~0 = 2
     align_object_offset(1) ==
     align_size_up_(1, HeapWordsPerLong) ==
     align_size_up(1, 1) ==
     (1 + 0) & ~0 = 1

So it's 1 word, not 2, but it's a 64-bit word, not a 32-bit, so is still 
the wrong answer since it should be
>
> Wrong answer again. Now for sizeof(InstanceKlass) == 12, so it is 
> 32-bit aligned, but not 64-bit:
>
>     align_object_offset(3) ==
>     align_size_up_(3, HeapWordsPerLong) ==
>     align_size_up(3, 1) ==
>     (3+ 0) & ~0 = 3
     align_object_offset(1) ==
     align_size_up_(1, HeapWordsPerLong) ==
     align_size_up(1, 1) ==
     (1+ 0) & ~0 =1

12 bytes is no different than 10 bytes, and still the wrong answer.
>
> Again the wrong answer. Now sizeof(InstanceKlass) == 16:
>
>     align_object_offset(4) ==
>     align_size_up_(4, HeapWordsPerLong) ==
>     align_size_up(4, 1) ==
>     (4+ 0) & ~0 = 4
     align_object_offset(2) ==
     align_size_up_(2, HeapWordsPerLong) ==
     align_size_up(2, 1) ==
     (2+ 0) & ~0 = 2

2 words (16 bytes) is what we are looking for.

cheers,

Chris
>
> This is what we want. So my conclusion is that we already have a 
> requirement that sizeof(InstanceKlass) be 32-bit aligned on 32-bit 
> systems and 64-bit aligned on 64-bit systems. Also, it turns out that 
> align_object_offset() is a no-op on 64-bit systems. The call is just 
> there to 64-bit align on 32-bit systems.
>
> I would appreciate someone double checking my assumptions here.
>
> BTW, since the last InstanceKlass field is a pointer type, we at least 
> know that the current alignment assumption being made is true, 
> although a bit presumptive.
>
> thanks,
>
> Chris
>> Thanks,
>> David
>>
>>> This change affects all 32-bit platforms. It should have no net impact
>>> on 64-bit platforms since the fields remain (naturally) 64-bit aligned
>>> (unless of course I've introduced a bug). The intent is to not change
>>> what is done for 64-bit platforms.
>>>
>>> BTW, there is a change to AARCH64, which may seem odd at first. It just
>>> removes an "if" block where the condition should always have evaluated
>>> to false, so it should have no net affect.
>>>
>>> Tested with JPRT "-testset hotspot". Please let me know if you think
>>> there are any additional tests that should be run.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>