compressed oops and 64-bit header words

Vladimir Kozlov Vladimir.Kozlov at Sun.COM
Mon May 5 17:22:56 PDT 2008


Dan,

Thank you for the paper.
I think the benefit they get from the compressed header comes
mostly from a compressed vtable pointer, which in our VM corresponds
to the klass pointer, and that is also compressed.
So in this sense we also have a compressed header.
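
To make the header arithmetic in this thread concrete, here is a rough
sketch in Python. It is not HotSpot code: the function names are made up
for illustration, and it assumes a 64-bit mark word, 8-byte alignment,
and a 32-bit array length field. It compares a 64-bit klass pointer
against a 32-bit compressed one.

```python
# Illustrative arithmetic only; align_up and header_bytes are hypothetical
# helper names, not HotSpot identifiers.

def align_up(size, alignment=8):
    """Round size up to the next multiple of alignment (8 bytes assumed)."""
    return (size + alignment - 1) // alignment * alignment


def header_bytes(klass_bytes, is_array=False, mark_bytes=8):
    """Return (used, padded) header size in bytes.

    used   = mark word + klass pointer (+ 32-bit length for arrays)
    padded = used rounded up to the assumed 8-byte alignment
    """
    used = mark_bytes + klass_bytes + (4 if is_array else 0)
    return used, align_up(used)


if __name__ == "__main__":
    for klass_bytes in (8, 4):          # uncompressed vs. compressed klass
        for is_array in (False, True):
            used, padded = header_bytes(klass_bytes, is_array)
            print(f"klass={klass_bytes}B array={is_array}: "
                  f"used={used} padded={padded} gap={padded - used}")
```

Under these assumptions the array header goes from 24 to 16 bytes, and
the instance header uses 12 bytes, leaving a 4-byte gap before the next
aligned slot.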

I cannot say what performance benefit we have now with
compressed oops, since the code currently generated for klass pointer
loads/stores is not what we would like it to be
(and we are working to improve it).

I doubt that a compressed mark word will make a big difference.
But I may be wrong.

Thanks,
Vladimir

Dan Grove wrote:
> Hi Coleen-
> 
> I'm not worried about the shift instruction - I agree that it's
> unlikely to matter. What I am worried about is having the standard
> object header take up two 64-bit words (well, one 64-bit word, one
> 32-bit word, and 32 bits of pad).
> 
> What I'm worried about is the increase in memory footprint and its
> impact on performance. I was pointed to
> http://ieeexplore.ieee.org/iel5/9012/28612/01281667.pdf?arnumber=1281667
> , which (conveniently) breaks out the performance impact of
> compressing the header versus compressing references versus both.
> 
> So what I would really be interested in is a way to have both the
> pointers/words in the header and the oops be 32 bits. I think this
> would be a good win, when coupled with the extra registers available
> with the 64-bit ABI.
> 
> Dan
> 
> On Mon, May 5, 2008 at 3:47 PM, Coleen Phillimore
> <Coleen.Phillimore at sun.com> wrote:
>>  Hi,
>>  It made sense when I first read it but in order to have 32 bit pointers in
>> #3, I can't imagine not having to encode and decode them by some heap base
>> in order to dereference these pointers, so the only difference between #2
>> and #3 is the shift instruction to get to 32G.  We didn't believe that the
>> shift causes much of a performance penalty so we didn't implement it this
>> way.  We would like to measure this at some point though, and if it is
>> faster could add this mode fairly easily.
>>
>>  thanks!
>>  Coleen
>>
>>
>>
>>  Dan Grove wrote:
>>
>>> Thanks Coleen and Vladimir-
>>>
>>> What I'm wondering is whether there could be a third mode:
>>>
>>> 1. > 32GB - uses uncompressed pointers
>>> 2. (something less than 4GB) < Xmx < 32GB - uses compressed pointers
>>> (along with 64-bit mark word), 64-bit ABI
>>> 3. whole app fits in 4GB - uses 32-bit pointers in heap, but 64-bit ABI.
>>>
>>> The idea here is that I'd prefer to pay no penalty over 32-bit when my
>>> app runs in 64-bit mode and the app fits in 4GB of memory (my reason
>>> for this is that I want to support our JNI libraries only in 64-bit
>>> mode, and deprecate the 32-bit JNI libraries).
>>>
>>> Does this make any sense to you?
>>>
>>> Dan
>>>
>>> On Mon, May 5, 2008 at 12:20 PM, Coleen Phillimore - Sun Microsystems
>>> <Coleen.Phillimore at sun.com> wrote:
>>>
>>>
>>>>  Actually, we are using the gap for a field and array length in the code
>>>> now, but the code Vladimir showed me makes the allocation code a lot
>>>> cleaner for the instance field case.
>>>>
>>>>  In the array case in 64 bits, compressing the _klass pointer into 32
>>>> bits allows us to move the _length field into the other 32 bits, which
>>>> because of alignment saves 64 bits.  There was a 32 bit alignment gap
>>>> after the _length field, if not compressed with the klass pointer.
>>>>
>>>>  The mark word can also contain a forwarding pointer used during GC, so
>>>> can't be 32 bits.
>>>>
>>>>  The compression that we use allows for 32G because we shift into the
>>>> least significant bits - the algorithm is (ptr-heap_base)>>3.
>>>>
>>>>  Coleen
>>>>
>>>>
>>>>
>>>>  Vladimir Kozlov wrote:
>>>>
>>>>
>>>>
>>>>> Dan,
>>>>>
>>>>> Only the mark word is 64 bits. The klass pointer is 32-bits but
>>>>> in the current implementation the gap after klass is not used.
>>>>>
>>>>> I am working on using the gap for a field or an array's length.
>>>>>
>>>>> The mark word may contain a 64-bit thread pointer (for Biased Locking).
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> Dan Grove wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hi-
>>>>>>
>>>>>> I talked some with Nikolay Igotti about compressed oops in
>>>>>> OpenJDK7. He tells me that the mark word and class pointer remain 64
>>>>>> bits when compressed oops are being used. It seems that this leaves a
>>>>>> fair amount of the bloat in place when moving from 32->64 bits.
>>>>>>
>>>>>> I'm interested in deprecating 32-bit VMs at my employer at some
>>>>>> point. Doing this is going to require that 64-bit VMs have as little
>>>>>> bloat as possible. Has there been any consideration of making the mark
>>>>>> word and class pointer 32 bits in cases where the VM fits within 4GB?
>>>>>> It seems like this would be a major win. A second benefit here is that
>>>>>> the "add and shift" currently required on dereference of compressed
>>>>>> oops could be eliminated in cases where the VM fits inside 4GB.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>
>>
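
For reference, the encoding Coleen gives in the quoted thread,
(ptr-heap_base)>>3, can be sketched in Python as below. This is only an
illustration of the arithmetic, not the VM's implementation: the
function names and the sample heap base are invented, null handling is
ignored, and the shift is made a parameter so that the 32G case
(shift of 3) and a hypothetical unshifted 4G case can be compared.

```python
# Sketch of the (ptr - heap_base) >> shift scheme; all names here are
# hypothetical, and addresses are plain Python integers.

NARROW_BITS = 32  # width of a compressed reference


def max_heap_bytes(shift):
    """Largest heap span addressable by a 32-bit value scaled by 2**shift."""
    return (1 << NARROW_BITS) << shift


def encode(ptr, heap_base, shift=3):
    """Compress a full address into a 32-bit value."""
    offset = ptr - heap_base
    if offset < 0 or offset % (1 << shift) != 0:
        raise ValueError("address is below heap base or not aligned")
    narrow = offset >> shift
    if narrow >= (1 << NARROW_BITS):
        raise ValueError("address is beyond the addressable heap span")
    return narrow


def decode(narrow, heap_base, shift=3):
    """Recover the full address from a compressed value."""
    return heap_base + (narrow << shift)


if __name__ == "__main__":
    heap_base = 0x0000_7000_0000_0000   # made-up base address
    ptr = heap_base + 24 * 1024**3      # an 8-byte-aligned address 24G in
    narrow = encode(ptr, heap_base)
    print(hex(narrow), decode(narrow, heap_base) == ptr)
    print(max_heap_bytes(3) // 1024**3, "G with shift 3")
    print(max_heap_bytes(0) // 1024**3, "G with no shift")
```

Even with no shift, the decode step still adds the heap base, which is
the point made above about mode #3 differing from #2 only by the shift.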



More information about the hotspot-runtime-dev mailing list