compressed oops and 64-bit header words
Dan Grove
dgrove at google.com
Wed May 7 23:28:53 PDT 2008
Thanks Vladimir. I'm still worried about the memory bloat from having
(effectively) two 64-bit words in the object header, rather than two 32-bit
words. If we consider an average (non-array) object size of around 30-40 bytes,
this is a significant overhead. It seems that if users were willing to
declare that they were running inside a 4GB virtual address space (and in my
case, users would be willing to do so in order to avoid memory bloat), we
should be able to use 32-bit header words.
On Linux, I believe that if the process were running in a shell with
"ulimit -v XXXX" set, we could guarantee that all addresses would fit in
32 bits, even for a 64-bit VM. Do you agree that this would make sense?
Dan
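
To put rough numbers on the header overhead mentioned above, here is a
minimal sketch in Java. It is only arithmetic, not VM code: the class name
HeaderFootprint, the 24 bytes of instance fields, and the 8-byte object
alignment are assumptions chosen for illustration. The two header sizes come
from the thread (two 32-bit words versus, effectively, two 64-bit words).

```java
final class HeaderFootprint {
    // Round size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // Total object size: header plus instance fields, rounded up to an
    // assumed 8-byte object alignment.
    static long objectBytes(long headerBytes, long fieldBytes) {
        return alignUp(headerBytes + fieldBytes, 8);
    }

    public static void main(String[] args) {
        long fields = 24;                          // hypothetical instance-field payload
        long narrow = objectBytes(8, fields);      // two 32-bit header words
        long wide = objectBytes(16, fields);       // effectively two 64-bit header words
        double growth = 100.0 * (wide - narrow) / narrow;
        System.out.printf("8-byte header: %d bytes, 16-byte header: %d bytes, growth: %.0f%%%n",
                narrow, wide, growth);
    }
}
```

Under these assumptions the same object grows from 32 to 40 bytes, which is
the kind of overhead that matters when the average object is 30-40 bytes.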
2008/5/5 Vladimir Kozlov <Vladimir.Kozlov at sun.com>:
> Dan,
>
> Thank you for the paper.
> I think the benefit they get from the compressed header comes
> mostly from a compressed vtable pointer, which in our VM corresponds
> to the klass pointer, and that is also compressed.
> So in this sense we also have a compressed header.
>
> I cannot say what performance benefit we have now with
> compressed oops, since the code currently generated for klass pointer
> loads/stores is not what we would like it to be
> (and we are working to improve it).
>
> I doubt that a compressed mark word will make a big difference.
> But I may be wrong.
>
>
>
> Thanks,
> Vladimir
>
> Dan Grove wrote:
>
> > Hi Coleen-
> >
> > I'm not worried about the shift instruction - I agree that it's
> > unlikely to matter. What I am worried about is having the standard
> > object header take up two 64-bit words (well, one 64-bit word, one
> > 32-bit word, and 32 bits of padding).
> >
> > What I'm worried about is the increase in memory footprint and its
> > impact on performance. I was pointed to
> > http://ieeexplore.ieee.org/iel5/9012/28612/01281667.pdf?arnumber=1281667
> > , which (conveniently) breaks out the performance impact of
> > compressing the header versus compressing references versus both.
> >
> > So what I would really be interested in is a way to have both the
> > pointers/words in the header and the oops be 32 bits. I think this
> > would be a good win when coupled with the extra registers available
> > under the 64-bit ABI.
> >
> > Dan
> >
> > On Mon, May 5, 2008 at 3:47 PM, Coleen Phillimore
> > <Coleen.Phillimore at sun.com> wrote:
> >
> > > Hi,
> > > It made sense when I first read it, but in order to have 32-bit pointers in
> > > #3, I can't imagine not having to encode and decode them by some heap base
> > > in order to dereference these pointers, so the only difference between #2
> > > and #3 is the shift instruction to get to 32G. We didn't believe that the
> > > shift causes much of a performance penalty, so we didn't implement it this
> > > way. We would like to measure this at some point though, and if it is
> > > faster we could add this mode fairly easily.
> > >
> > > thanks!
> > > Coleen
> > >
> > >
> > >
> > > Dan Grove wrote:
> > >
> > >
> > > > Thanks Coleen and Vladimir-
> > > >
> > > > What I'm wondering is whether there could be a third mode:
> > > >
> > > > 1. Xmx > 32GB - uses uncompressed pointers
> > > > 2. (something less than 4GB) < Xmx < 32GB - uses compressed pointers
> > > > (along with 64-bit mark word), 64-bit ABI
> > > > 3. whole app fits in 4GB - uses 32-bit pointers in heap, but 64-bit ABI.
> > > >
> > > > The idea here is that I'd prefer to pay no penalty over 32-bit when my
> > > > app runs in 64-bit mode and the app fits in 4GB of memory (my reason
> > > > for this is that I want to support our JNI libraries only in 64-bit
> > > > mode, and deprecate the 32-bit JNI libraries).
> > > >
> > > > Does this make any sense to you?
> > > >
> > > > Dan
> > > >
> > > > On Mon, May 5, 2008 at 12:20 PM, Coleen Phillimore - Sun Microsystems
> > > > <Coleen.Phillimore at sun.com> wrote:
> > > >
> > > >
> > > >
> > > > > Actually, we are using the gap for a field and array length in the code
> > > > > now, but the code Vladimir showed me makes the allocation code a lot cleaner
> > > > > for the instance field case.
> > > > >
> > > > > In the array case in 64 bits, compressing the _klass pointer into 32 bits
> > > > > allows us to move the _length field into the other 32 bits, which because of
> > > > > alignment saves 64 bits. There was a 32 bit alignment gap after the _length
> > > > > field, if not compressed with the klass pointer.
> > > > >
> > > > > The mark word can also contain a forwarding pointer used during GC, so
> > > > > can't be 32 bits.
> > > > >
> > > > > The compression that we use allows for 32G because we shift into the least
> > > > > significant bits - the algorithm is (ptr-heap_base)>>3.
> > > > >
> > > > > Coleen
> > > > >
> > > > >
> > > > >
> > > > > Vladimir Kozlov wrote:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Dan,
> > > > > >
> > > > > > Only the mark word is 64 bits. The klass pointer is 32 bits, but
> > > > > > in the current implementation the gap after klass is not used.
> > > > > >
> > > > > > I am working on using the gap for a field or an array's length.
> > > > > >
> > > > > > The mark word may contain a 64-bit thread pointer (for Biased Locking).
> > > > > >
> > > > > > Thanks,
> > > > > > Vladimir
> > > > > >
> > > > > > Dan Grove wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Hi-
> > > > > > >
> > > > > > > I talked some with Nikolay Igotti about compressed oops in
> > > > > > > OpenJDK7. He tells me that the mark word and class pointer remain 64
> > > > > > > bits when compressed oops are being used. It seems that this leaves a
> > > > > > > fair amount of the bloat in place when moving from 32->64 bits.
> > > > > > >
> > > > > > > I'm interested in deprecating 32-bit VMs at my employer at some
> > > > > > > point. Doing this is going to require that 64-bit VMs have as little
> > > > > > > bloat as possible. Has there been any consideration of making the mark
> > > > > > > word and class pointer 32 bits in cases where the VM fits within 4GB?
> > > > > > > It seems like this would be a major win. A second benefit here is that
> > > > > > > the "add and shift" currently required on dereference of compressed
> > > > > > > oops could be eliminated in cases where the VM fits inside 4GB.
> > > > > > >
> > > > > > > Dan
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>
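
For readers following the quoted discussion, the encoding Coleen gives,
(ptr-heap_base)>>3, and the proposed third mode can be sketched as below.
This is an illustration in Java, not HotSpot code: the class NarrowOop, its
method names, and the example addresses are all hypothetical. It assumes
objects are aligned to 1 << shift bytes, so the low bits dropped by the shift
are always zero.

```java
final class NarrowOop {
    // Compress a 64-bit address into 32 bits: subtract the heap base, then
    // shift out the alignment bits. With shift = 3 this is (ptr-heap_base)>>3.
    static int encode(long address, long heapBase, int shift) {
        return (int) ((address - heapBase) >>> shift);
    }

    // Reverse the encoding: treat the 32-bit value as unsigned, shift it back
    // and add the heap base (the "add and shift" done on each dereference).
    static long decode(int narrow, long heapBase, int shift) {
        return heapBase + ((narrow & 0xFFFFFFFFL) << shift);
    }

    // Number of bytes addressable by a 32-bit value with the given shift.
    static long maxHeapBytes(int shift) {
        return (1L << 32) << shift;
    }

    public static void main(String[] args) {
        long heapBase = 0x7f0000000000L;            // hypothetical heap base
        long address = heapBase + 0x600000008L;     // 8-byte aligned, about 24GB into the heap
        int narrow = encode(address, heapBase, 3);
        System.out.println(decode(narrow, heapBase, 3) == address);
        System.out.println(maxHeapBytes(3) >> 30);  // reach in GB with the shift
        System.out.println(maxHeapBytes(0) >> 30);  // reach in GB without the shift
    }
}
```

With a shift of 3 a 32-bit value reaches 32GB, which is where the limit in
the thread comes from. With a shift of 0 and a heap base of 0 (the case where
everything fits in 4GB), encode and decode reduce to a plain 32-bit copy of
the address, which is the saving the third mode is after.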