compressed oops and 64-bit header words
Vladimir Kozlov
Vladimir.Kozlov at Sun.COM
Thu May 8 08:12:40 PDT 2008
Dan,
It is not two 64-bit words, it is one and a half :)
since the klass is 32 bits and we use the other 32 bits for a field.
So the overhead is only 4 bytes. Also don't forget that
all objects are aligned to 8 bytes in the heap, even
in a 32-bit VM. So the average overhead will be less.
I want to be clear that it is not that we are totally against
your suggestion. It is the resources we need to implement it
which we don't have currently.
On the other hand, the VM is open source now, so you or your colleagues
can do it and help us all.
Thanks,
Vladimir
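
[Editorial sketch, not part of the original message.] The header arithmetic above can be illustrated with a few lines of Python. This is a simplified model under stated assumptions: the 32-bit header is two 32-bit words, the 64-bit compressed header is a 64-bit mark word plus a 32-bit klass, objects are aligned to 8 bytes, and the field payload is the same size in both VMs. The names `align_up` and `object_size` are made up for this illustration and are not HotSpot functions.

```python
ALIGNMENT = 8  # object alignment in bytes, as stated in the message

HEADER_32BIT = 4 + 4             # 32-bit mark word + 32-bit klass pointer
HEADER_64BIT_COMPRESSED = 8 + 4  # 64-bit mark word + 32-bit compressed klass


def align_up(size, alignment=ALIGNMENT):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def object_size(header_bytes, field_bytes):
    """Total heap size of an object: header plus fields, padded to alignment."""
    return align_up(header_bytes + field_bytes)


if __name__ == "__main__":
    # Assumption: payload sizes are multiples of 4 bytes and identical in both VMs.
    for field_bytes in (4, 8, 12, 16, 20, 24):
        small = object_size(HEADER_32BIT, field_bytes)
        large = object_size(HEADER_64BIT_COMPRESSED, field_bytes)
        print(field_bytes, small, large, large - small)
```

In this model the extra 4 header bytes are sometimes absorbed by padding that the 32-bit layout already had, and sometimes push the object to the next 8-byte boundary, so the per-object difference depends on the payload size.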
Dan Grove wrote:
> Thanks Vladimir. I'm still worried about the memory bloat from having
> (effectively) 2 64-bit words in the object header, rather than 2 32-bit
> words. If we consider an average (non-array) object size around 30-40
> bytes, this is a significant overhead. It seems that if users were
> willing to declare that they were running inside a 4GB virtual address
> space (and in my case, users would be willing to do so in order to avoid
> memory bloat), we should be able to do this.
>
> On Linux, I believe that if the process were running with a "ulimit -v
> XXXX" shell, we could make guarantees that all addresses would fit in 32
> bits, even for a 64-bit VM. Do you agree that this would make sense?
>
> Dan
>
> 2008/5/5 Vladimir Kozlov <Vladimir.Kozlov at sun.com>:
> > Dan,
> >
> > Thank you for the paper.
> > I think the benefit they get from the compressed header comes
> > mostly from a compressed vtable pointer, which in our VM corresponds
> > to the klass pointer, which is also compressed.
> > So in this sense we also have a compressed header.
> >
> > I cannot say what performance benefit we have now with
> > compressed oops, since the generated code for klass pointer
> > loads/stores is currently not what we would like to have
> > (and we are working to improve it).
> >
> > I doubt that a compressed mark word will make a big difference.
> > But I may be wrong.
> >
> >
> >
> > Thanks,
> > Vladimir
> >
> > Dan Grove wrote:
> >
> > > Hi Coleen-
> > >
> > > I'm not worried about the shift instruction - I agree that it's
> > > unlikely to matter. What I am worried about is having the standard
> > > object header take 2 64-bit words (well, 1 64-bit word, 1 32-bit
> > > word, and 32 bits of pad).
> > >
> > > What I'm worried about is the increase in memory footprint and its
> > > impact on performance. I was pointed to
> > > http://ieeexplore.ieee.org/iel5/9012/28612/01281667.pdf?arnumber=1281667
> > > which (conveniently) breaks out the performance impact of
> > > compressing the header versus compressing references versus both.
> > >
> > > So what I would really be interested in would be a way to have both the
> > > pointers/words in the header and the oops be 32 bits. I think this
> > > would be a good win, when coupled with the extra registers when using
> > > the 64-bit ABI.
> > >
> > > Dan
> > >
> > > On Mon, May 5, 2008 at 3:47 PM, Coleen Phillimore
> > > <Coleen.Phillimore at sun.com> wrote:
> > >
> > > > Hi,
> > > > It made sense when I first read it but in order to have 32 bit
> > > > pointers in #3, I can't imagine not having to encode and decode
> > > > them by some heap base in order to dereference these pointers, so
> > > > the only difference between #2 and #3 is the shift instruction to
> > > > get to 32G. We didn't believe that the shift causes much of a
> > > > performance penalty so we didn't implement it this way. We would
> > > > like to measure this at some point though, and if it is faster
> > > > could add this mode fairly easily.
> > > >
> > > > thanks!
> > > > Coleen
> > > >
> > > >
> > > >
> > > > Dan Grove wrote:
> > > >
> > > >
> > > > > Thanks Coleen and Vladimir-
> > > > >
> > > > > What I'm wondering is whether there could be a third mode:
> > > > >
> > > > > 1. > 32GB - uses uncompressed pointers
> > > > > 2. (something less than 4GB) < Xmx < 32GB - uses compressed
> > > > >    pointers (along with 64-bit mark word), 64-bit ABI
> > > > > 3. whole app fits in 4GB - uses 32-bit pointers in heap, but
> > > > >    64-bit ABI.
> > > > >
> > > > > The idea here is that I'd prefer to pay no penalty over 32-bit
> > > > > when my app runs in 64-bit mode and the app fits in 4GB of memory
> > > > > (my reason for this is that I want to support our JNI libraries
> > > > > only in 64-bit mode, and deprecate the 32-bit JNI libraries).
> > > > >
> > > > > Does this make any sense to you?
> > > > >
> > > > > Dan
> > > > >
> > > > > On Mon, May 5, 2008 at 12:20 PM, Coleen Phillimore - Sun Microsystems
> > > > > <Coleen.Phillimore at sun.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > Actually, we are using the gap for a field and array length in
> > > > > > the code now, but the code Vladimir showed me makes the
> > > > > > allocation code a lot cleaner for the instance field case.
> > > > > >
> > > > > > In the array case in 64 bits, compressing the _klass pointer
> > > > > > into 32 bits allows us to move the _length field into the other
> > > > > > 32 bits, which because of alignment saves 64 bits. There was a
> > > > > > 32 bit alignment gap after the _length field, if not compressed
> > > > > > with the klass pointer.
> > > > > >
> > > > > > The mark word can also contain a forwarding pointer used during
> > > > > > GC, so can't be 32 bits.
> > > > > >
> > > > > > The compression that we use allows for 32G because we shift into
> > > > > > the least significant bits - the algorithm is
> > > > > > (ptr-heap_base)>>3.
> > > > > >
> > > > > > Coleen
> > > > > >
> > > > > >
> > > > > >
> > > > > > Vladimir Kozlov wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Dan,
> > > > > > >
> > > > > > > Only the mark word is 64 bits. The klass pointer is 32 bits but
> > > > > > > in the current implementation the gap after klass is not used.
> > > > > > >
> > > > > > > I am working on using the gap for a field or an array's length.
> > > > > > >
> > > > > > > The mark word may contain a 64-bit thread pointer (for Biased
> > > > > > > Locking).
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Vladimir
> > > > > > >
> > > > > > > Dan Grove wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Hi-
> > > > > > > >
> > > > > > > > I talked some with Nikolay Igotti about compressed oops in
> > > > > > > > OpenJDK7. He tells me that the mark word and class pointer
> > > > > > > > remain 64 bits when compressed oops are being used. It seems
> > > > > > > > that this leaves a fair amount of the bloat in place when
> > > > > > > > moving from 32->64 bits.
> > > > > > > >
> > > > > > > > I'm interested in deprecating 32-bit VMs at my employer at
> > > > > > > > some point. Doing this is going to require that 64-bit VMs
> > > > > > > > have as little bloat as possible. Has there been any
> > > > > > > > consideration of making the mark word and class pointer 32
> > > > > > > > bits in cases where the VM fits within 4GB?
> > > > > > > >
> > > > > > > > It seems like this would be a major win. A second benefit
> > > > > > > > here is that the "add and shift" currently required on
> > > > > > > > dereference of compressed oops could be eliminated in cases
> > > > > > > > where the VM fits inside 4GB.
> > > > > > > >
> > > > > > > > Dan
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> >
>
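
[Editorial sketch, not part of the original thread.] The compression Coleen describes in the quoted messages, (ptr-heap_base)>>3, and the reason it reaches 32G can be sketched as follows. This is a minimal illustration, assuming 8-byte object alignment and a 32-bit compressed value. `HEAP_BASE` is an arbitrary made-up address, the function names are hypothetical, and null handling is omitted.

```python
SHIFT = 3                          # log2 of the 8-byte object alignment
HEAP_BASE = 0x0000_7F00_0000_0000  # hypothetical heap base, for illustration only

# A 32-bit value scaled by 8 bytes can address 2**32 * 8 bytes = 32 GiB.
MAX_HEAP_BYTES = (1 << 32) << SHIFT


def encode(ptr, heap_base=HEAP_BASE):
    """Compress a 64-bit address into 32 bits: (ptr - heap_base) >> 3."""
    offset = ptr - heap_base
    if offset < 0 or offset >= MAX_HEAP_BYTES:
        raise ValueError("address outside the compressible heap range")
    if offset % (1 << SHIFT) != 0:
        raise ValueError("address is not 8-byte aligned")
    return offset >> SHIFT


def decode(narrow, heap_base=HEAP_BASE):
    """Recover the 64-bit address: heap_base + (narrow << 3)."""
    return heap_base + (narrow << SHIFT)


if __name__ == "__main__":
    addr = HEAP_BASE + 0x1000
    narrow = encode(addr)
    print(hex(narrow), hex(decode(narrow)), MAX_HEAP_BYTES // 2**30, "GiB")
```

The decode step is the "add and shift" Dan mentions. In his proposed third mode, where everything fits in 4GB, the shift would be dropped, though as Coleen notes some form of base adjustment may still be needed.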