8024927: Nashorn performance regression with CompressedOops

John Rose john.r.rose at oracle.com
Fri Oct 25 15:13:07 PDT 2013


On Oct 25, 2013, at 5:22 AM, "Lindenmaier, Goetz" <goetz.lindenmaier at sap.com> wrote:

> I think there would be enough space for extending the compressed class space to higher addresses, if you lay out the spaces like this (assuming shift 3):

These are good thought experiments.  I'll contribute a few more.

I took a quick look at current size scales for the JDK.  There are about 30K classes and 230K methods in rt.jar (JDK 8).  This means that the basic JDK uses (to within half an order of magnitude) 0.3 million metadata nodes.

(We never load it all at once, but let's use it as a proxy for a complex application.)

Meanwhile, the jar size is about 67MB uncompressed, which expands about 1.5x to 3x when loaded.  (Bytecodes don't expand, UTF8 strings contract about 2x due to interning, u2 indexes expand by 2x or 4x, and there are lots of extra headers and side tables.)  Call it 100MB in memory.

That works out to 300 bytes per node; the methods probably dominate in size (as well as in number), followed by the constant pool part of each class.  (A pointer to actual figures would help, but I don't have one; I'm working here from memory of various adventures in footprint analysis and jar packing.)
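
To put rough numbers on that, here is the arithmetic as a tiny C++ sketch (the inputs are the estimates above, not measurements):

    #include <cstdio>

    int main() {
      // Back-of-the-envelope, using the rough figures quoted above.
      const long classes      = 30000;               // ~30K classes in rt.jar (JDK 8)
      const long methods      = 230000;              // ~230K methods
      const long nodes        = classes + methods;   // ~0.26M, call it 0.3M metadata nodes
      const long bytes_in_mem = 100L * 1024 * 1024;  // ~100MB of loaded metadata
      // Prints a few hundred bytes per node, consistent with the estimate above.
      std::printf("bytes per metadata node: ~%ld\n", bytes_in_mem / nodes);
      return 0;
    }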

You can increase that size (of rt.jar) by 30x and still be able to address metadata with an unscaled 32-bit offset.  That would be about 1M classes.

You can increase the size almost arbitrarily by scaling the 32-bit offset.  I've heard reports (for other kinds of scaled pointers) of scales all the way up to cache line sizes sometimes being workable.  Say we allow scaling up to 32x (a 5-bit shift).  That goes to (order of magnitude) 1000x the metadata contained in rt.jar, which would be something like 30M classes.
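
As a sketch of what a scaled 32-bit metadata offset buys (hypothetical names and encoding, not the current HotSpot scheme), something like:

    #include <cstdint>

    // Illustrative only: a metadata pointer becomes a 32-bit offset from a fixed base,
    // optionally scaled.  With shift 0 the offset reaches 4GB of metadata; each shift
    // bit doubles that, so shift 5 (32-byte alignment) reaches 128GB, roughly the
    // 1000x-rt.jar scale mentioned above.
    struct CompressedMetaPtr {
      static char*    base;    // start of the metadata space
      static unsigned shift;   // log2 of the alignment of metadata allocations

      uint32_t offset;

      static CompressedMetaPtr encode(void* p) {
        uintptr_t delta = (uintptr_t)((char*)p - base);
        return CompressedMetaPtr{ (uint32_t)(delta >> shift) };
      }
      void* decode() const {
        return base + ((uintptr_t)offset << shift);
      }
    };

    char*    CompressedMetaPtr::base  = nullptr;  // chosen at startup in a real system
    unsigned CompressedMetaPtr::shift = 3;        // e.g. 8-byte-aligned metadata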

I've been assuming so far that all metadata pointers would be compressed.  By special-casing class pointers (which is what we do today anyway), we can represent up to 2**32 classes by using another level of indirection.  That's 100x the previous paragraph, surely more than anyone would ever need, right?  :-)
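
A minimal sketch of that extra level of indirection (invented structure, not our code): the object carries a 32-bit index into one global table rather than a pointer.

    #include <cstdint>
    #include <vector>

    struct Klass;   // stands in for the full class metadata, wherever it lives

    // One global table of classes; a 32-bit index can name up to 2^32 of them,
    // at the price of one extra load whenever the class is actually needed.
    static std::vector<Klass*> g_class_table;

    static inline Klass* klass_at(uint32_t class_index) {
      return g_class_table[class_index];
    }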

Compressing metadata pointers would allow metadata to get smaller, but would be somewhat difficult in our present code base.

Another reason to concentrate on class pointers (and not general metadata pointers) is that they are the connection between the Java heap and the metadata heap.  There are many ways to consider reducing header size.  One possible tactic is to reduce the size of the class reference (oopDesc::_klass) to something even smaller than 32 bits.  This would put a different inflection point on metadata costs: the number of classes at which an object header has to have some sort of expansion.  But a mere 20 bits would be enough to address the middle scaling point mentioned above, of 1M classes.
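
For illustration only, a 20-bit class field packed into a 64-bit header word might look like this (the real mark word layout is different):

    #include <cstdint>

    // Illustrative header layout: 20 bits of class index covers ~1M classes and
    // leaves 44 bits of the header word for whatever other state it must carry.
    const int      kClassBits = 20;
    const uint64_t kClassMask = (UINT64_C(1) << kClassBits) - 1;

    static inline uint32_t header_class_index(uint64_t header) {
      return (uint32_t)(header & kClassMask);
    }
    static inline uint64_t make_header(uint64_t other_bits, uint32_t class_index) {
      return (other_bits << kClassBits) | (class_index & kClassMask);
    }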

For a 64-bit uncompressed system, the same bits could be moved into every reference (and still leave 44 bits of dynamic range for heap addresses), allowing object headers to shrink to just a few bits (depending on what the mark word needs—that requirement can be tuned downward also).
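
The uncompressed-heap variant, equally schematic, just moves those bits into the reference itself:

    #include <cstdint>

    // Illustrative: a 20-bit class index carried in the top of every 64-bit reference,
    // leaving 44 bits (16TB of byte addresses, more with alignment) for the heap address.
    const int      kAddrBits = 44;
    const uint64_t kAddrMask = (UINT64_C(1) << kAddrBits) - 1;

    static inline uint64_t ref_address(uint64_t ref)     { return ref & kAddrMask; }
    static inline uint32_t ref_class_index(uint64_t ref) { return (uint32_t)(ref >> kAddrBits); }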

Finally, there is another way that we might push toward a scale of (say) 1M classes, and that is by introducing "species" (split classes) to differentiate objects nominally of the same class but with some internal distinction.  Reified generics can be done this way, for example.  So can "hidden classes" a la Self or JavaScript, or just some general kind of typestate.

The effect of having split classes would be to increase the required dynamic range of the _klass field.  This could also create pressure to use the extra level of indirection.  (Coleen, we talked briefly about this last time we had lunch.)

The costs of the extra level of indirection can be (I think) more or less continuously tuned by copying more or less of the "far class" into the "near class" (or "species") structure stored in the array indexed by the class/species number.  In other words, if we are indexing classes, there's a big global array of them, and the array has a constant element size.  A larger element size allows more information (such as the first N vtable entries) to be made available in one indirection, while a smaller element size allows for less duplication or fragmentation.  An element size of several cache lines (esp. not an even-numbered "several") might be the sweet spot.
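
A rough sketch of that split, with invented field names and N = 6 vtable slots chosen purely for illustration:

    #include <cstdint>

    struct FarKlass;   // the full class metadata, off in the metadata space

    // Fixed-size "near class" records in one global array, indexed by class/species number.
    // A bigger record inlines more hot data (e.g. the first N vtable entries) so one
    // indirection suffices more often; a smaller record means less duplication.
    struct NearKlass {
      FarKlass* _far;           // the rest of the class
      uint32_t  _species_tag;   // distinguishes species sharing one nominal class
      uint32_t  _layout_bits;   // e.g. an instance-size summary
      void*     _vtable[6];     // first N vtable entries copied from the far class
    };                          // 64 bytes here, i.e. one cache line per entry

    static NearKlass* g_near_table;   // the big global array of near classes

    static inline NearKlass* near_klass_at(uint32_t index) {
      return &g_near_table[index];    // constant element size makes this one scaled index
    }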

(And so on.  There's a different set of directions to go in when thinking about sharable images...)

— John

