Enhance footprint for array initialization
Tom Rodriguez
Thomas.Rodriguez at Sun.COM
Wed Sep 3 09:01:17 PDT 2008
Well, your original question was about the code generated by javac so
that's why I suggested you ask on the javac list. They could possibly
generate more efficient code for the example you gave and it wouldn't
require any specification or JVM changes.
As far as the general issue of array initialization, you're not the
first to notice this. The idea of more efficient array initialization
using the constant pool has been bandied about probably since 1.0 but
it's always been pretty low on anyone's radar. VM support for
something like this wouldn't be hard but the JVM spec would have to be
updated which would pretty much require a JSR. All the language and
JVM specification work is in a different group. I don't know if
there's a separate list for that but again the javac lists would
probably get you into a more productive conversation about such a
change.
There's an old trick for making more compact array initialization by
using Strings as data containers. Basically you encode your real data
as a String and then at init time you get the char[] data out and
transform it into whatever primitive array type you actually want.
It's a slightly dubious solution and I'd be surprised if anyone was
still using it.
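
For what it's worth, here is a minimal sketch of that String trick,
assuming a byte table with one byte value stored per char. The names
PackedTable, PACKED and unpack are made up for illustration and the
table contents are placeholder values, not anything from charset.jar.

```java
import java.util.Arrays;

// Hypothetical example class; not taken from the JDK sources.
final class PackedTable {
    // Placeholder data: each char of the literal carries one byte value (0x00..0xFF).
    // The whole table is a single constant pool entry instead of per-element bytecode.
    private static final String PACKED = "\u0000\u0001\u0002\u007F\u0080\u00FF";

    // Decoded once, at class initialization time.
    static final byte[] TABLE = unpack(PACKED);

    // Narrow every char of the String to a byte.
    static byte[] unpack(String packed) {
        byte[] out = new byte[packed.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) packed.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(TABLE)); // prints [0, 1, 2, 127, -128, -1]
    }
}
```

Keep in mind that the class file stores String constants in modified
UTF-8, so the zero char and any char above 0x7F take more than one
byte each. How much this saves therefore depends on the data.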
Is there any reason the initialization code in charset.jar couldn't
just be written to be more efficient?
tom
On Sep 3, 2008, at 2:54 AM, Ulf Zibis wrote:
> Am 02.09.2008 17:32, Tom Rodriguez schrieb:
>>
>> I think you're on the wrong list. This is a javac code generation
>> issue. I think you want compiler-dev at openjdk.java.net.
>>
>> tom
>>
> Hi tom,
>
> From the 1st view, you are right, javac list would be better, but
> knowledge about bytecode should exist in both lists, and I wanted to
> avoid subscribing to an additional list.
> From the 2nd view, my question is whether there is any reasonable
> way for a data table in bytecode to be recognized by the VM. If
> yes, the javac guys should be asked, to compile such tables to
> bytecode. If not, my post should be interpreted as an RFE to the VM
> guys to provide such a class file format.
>
> In particular, this question is *not* about performance, as you might
> have thought; it's about the footprint of the class files. Performance
> is not important here, because static initialisation code is only
> executed once, so it isn't worthwhile for HotSpot to optimize it.
> Think about the charset.jar in the JDK. It's more than 6
> MBytes, full of repetitive bytecode just for initializing static
> encoding tables. It could be much smaller.
>
> I'm worried about the number of bytecode instructions needed to
> initialize a static array. I first thought there must be a new
> bytecode to define table data, which could look like:
>
> static {};
> Code:
> 0: tabledef 950 // Start of table for 16-bit values, length 950,
>                 // containing the pointers to the String constants
> 3: #39; //String 8859_1
> 5: #40; //String ISO8859_1
> 7: #115; //String iso_8859-1:1987
> 9: #40; //String ISO8859_1
> 11: // ...
>
> Yesterday I took a day to study the VM Specification more deeply
> and came to the conclusion that an array type is missing from the
> Constant Pool
> <http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#20080>
> in the class file format. Such a type would reduce the footprint of
> class files significantly, because the repeated code to initialize
> long arrays could be omitted.
> It looks even worse for byte arrays than for String arrays:
>
> private static final byte[] allBytes = {
> (byte)0x00, (byte)0x01, (byte)0x02,
> // .....
> (byte)0xFF, (byte)0xFF
> };
> The bytecode will look like (256 times):
> 40: dup
> 41: bipush 8
> 43: bipush 8
> 45: bastore
>
> So the class file will need 6 times the footprint of a simple
> "CONSTANT_array".
>
> Hopefully you now understand better the subject of my post,
>
> Regards,
> Ulf
>
>