A question about bytecodes + unsigned load performance ./. add performance
Ulf Zibis
Ulf.Zibis at gmx.de
Sun Jan 11 11:09:53 PST 2009
Hi John,
thanks for your detailed explanation.
So I'm asking for a volunteer with enough knowledge of HotSpot to check
why that optimization does not appear to kick in. Alternatively, can
anybody make HotSpot optimize the following accordingly?
    { char c = (char)((byte[])sa[sp] & 0xFF); }
    { char c = (char)(inByte & 0xFF); }
    { char c = b2cMap[(byte[])sa[sp] & 0xFF]; }
    { char c = b2cMap[inByte & 0xFF]; }
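For anyone who wants to reproduce this, the four expression shapes can be isolated in a small self-contained class. This is only a sketch under assumptions: the class name, the loop methods and the identity-filled table B2C_MAP are hypothetical stand-ins, not code from the charset sources. Only the `& 0xFF` expressions themselves come from the text above.

```java
final class UnsignedLoadShapes {
    // Hypothetical stand-in for b2cMap: identity mapping for all 256 byte values.
    static final char[] B2C_MAP = new char[256];
    static {
        for (int i = 0; i < B2C_MAP.length; i++) B2C_MAP[i] = (char) i;
    }

    // Shape 1: the mask is applied directly to the array load.
    static void maskFromArray(byte[] sa, char[] da) {
        for (int sp = 0; sp < sa.length; sp++) da[sp] = (char) (sa[sp] & 0xFF);
    }

    // Shape 2: the byte is first copied into a local, then masked.
    static void maskFromLocal(byte[] sa, char[] da) {
        for (int sp = 0; sp < sa.length; sp++) {
            byte inByte = sa[sp];
            da[sp] = (char) (inByte & 0xFF);
        }
    }

    // Shape 3: the masked array load is used as a table index.
    static void mapFromArray(byte[] sa, char[] da) {
        for (int sp = 0; sp < sa.length; sp++) da[sp] = B2C_MAP[sa[sp] & 0xFF];
    }

    // Shape 4: the masked local is used as a table index.
    static void mapFromLocal(byte[] sa, char[] da) {
        for (int sp = 0; sp < sa.length; sp++) {
            byte inByte = sa[sp];
            da[sp] = B2C_MAP[inByte & 0xFF];
        }
    }

    public static void main(String[] args) {
        byte[] sa = {0, 1, 127, (byte) 0x80, (byte) 0xFF};
        char[] da = new char[sa.length];
        maskFromArray(sa, da);
        System.out.println((int) da[4]); // 255: the byte -1 read as an unsigned value
    }
}
```

All four methods compute the same result for the identity table, so any difference in the generated machine code comes from how the load and the mask are matched, not from the semantics.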
It should also inline { char c = decode((byte[])sa[sp]); } for each of
the following methods (all member variables are final!):
    public char decode(final byte inByte) {
        return (char)(inByte & 0xFF);
    }
and
    public char decode(final byte inByte) {
        final char c = b2cMap[inByte & 0xFF];
        if (c == REPLACE_CHAR)
            throw UNMAPPABLE_EXCEPTION;
        return c;
    }
or
    public char decode(final byte inByte) {
        if (inByte >= 0)
            return (char)inByte;
        final char c = b2cMap[inByte & 0xFF];
        if (c == REPLACE_CHAR)
            throw UNMAPPABLE_EXCEPTION;
        return c;
    }
For encoders (only the most complex example):
Executing { byte b = encode((char[])sa[sp]); } should be optimized as
well for the following method (all member variables are final!):
    public byte encode(char inChar) throws RuntimeException,
            UnmappableCharacterException {
        if (inChar < directEnd)
            return (byte)inChar;
        final byte b;
        if ((inChar -= map1Start) < c2bMap1.length)
            b = c2bMap1[inChar];
        else if ((inChar -= map2Offset) < c2bMap2.length)
            b = c2bMap2[inChar];
        // If the calculated index runs out of the map's scope, a
        // corresponding RuntimeException is thrown
        else
            b = c2bPagedMap[(inChar -= pagedMapOffset) >> shift][inChar & mask]; // TODO: simplify for shift == 0
        // If the output byte is zero, the character is unmappable
        if (b == 0x00)
            throw UNMAPPABLE_EXCEPTION;
        return b;
    }
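The range tests in encode() rely on char being an unsigned 16-bit type: after `inChar -= start`, a character below the start of the table wraps around to a large value, so a single `<` comparison covers both bounds. A minimal sketch of that idiom in isolation, where the class, the method name and the sample values are hypothetical and chosen only for illustration:

```java
final class CharRangeCheck {
    // Returns the index of c inside a table that begins at 'start' and has
    // 'length' entries, or -1 if c lies outside it. All names are hypothetical.
    static int indexInRange(char c, char start, int length) {
        // Compound assignment on a char narrows the result back to char,
        // so values below 'start' wrap around to large values near 0xFFFF.
        c -= start;
        return (c < length) ? (int) c : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexInRange('\u00A5', '\u00A0', 16)); // 5
        System.out.println(indexInRange('A', '\u00A0', 16));      // -1, 0x41 - 0xA0 wraps to 0xFFA1
        System.out.println(indexInRange('\u00B0', '\u00A0', 16)); // -1, index 16 is past the end
    }
}
```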
This would be very nice, because I have to decide whether to use
[inByte + 0x80] or [inByte & 0xFF] in my code for enhancing the charset coders.
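To make the two alternatives concrete: both turn a signed byte into an index from 0 to 255, but they need differently laid-out tables. The sketch below builds both layouts from the same mapping and checks that they decode identically. The class, the helper names and the mapping function are assumptions made up for this illustration, not part of the actual coders.

```java
final class IndexSchemes {
    // Hypothetical mapping from an unsigned byte value (0..255) to a char.
    static char mapping(int unsigned) {
        return (char) (0x2500 + unsigned);
    }

    // Table laid out for [inByte & 0xFF]: the index equals the unsigned value.
    static char[] maskedTable() {
        char[] t = new char[256];
        for (int u = 0; u < 256; u++) t[u] = mapping(u);
        return t;
    }

    // Table laid out for [inByte + 0x80]: index 0 holds byte -128,
    // index 255 holds byte 127.
    static char[] biasedTable() {
        char[] t = new char[256];
        for (int b = Byte.MIN_VALUE; b <= Byte.MAX_VALUE; b++) {
            t[b + 0x80] = mapping(b & 0xFF);
        }
        return t;
    }

    static char decodeMasked(char[] t, byte inByte) { return t[inByte & 0xFF]; }

    static char decodeBiased(char[] t, byte inByte) { return t[inByte + 0x80]; }

    public static void main(String[] args) {
        char[] masked = maskedTable();
        char[] biased = biasedTable();
        boolean same = true;
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            byte b = (byte) v;
            same &= decodeMasked(masked, b) == decodeBiased(biased, b);
        }
        System.out.println(same); // true
    }
}
```

Since both variants are functionally equivalent, the choice comes down to which form the JIT compiles better, which is exactly the open question here.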
I'm also surprised that HotSpot doesn't seem to inline the tiny
de/encode() method, as I see a performance gain of factor 2 if I
inline it manually in my source code.
See lines 32 to 123:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/src/sun/nio/cs/SingleByteDecoder_new.java?rev=572&view=markup
See lines 67 to 109 to 117:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/src/sun/nio/cs/SingleByteEncoder_new.java?rev=572&view=markup
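A reduced test case for the inlining question could look like the sketch below: one loop calls a tiny decode() method, the other contains the same expression inlined by hand. The class, the identity table and the loop sizes are assumptions for illustration only, and the naive timing is not a rigorous benchmark. On HotSpot builds that support them, running it with -XX:+PrintCompilation, or with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining, should show whether decode() actually gets inlined into the loop.

```java
final class InlineCheck {
    // Hypothetical stand-in for b2cMap: identity mapping for all 256 byte values.
    static final char[] B2C_MAP = new char[256];
    static {
        for (int i = 0; i < B2C_MAP.length; i++) B2C_MAP[i] = (char) i;
    }

    // The tiny method the JIT is expected to inline.
    static char decode(final byte inByte) {
        return B2C_MAP[inByte & 0xFF];
    }

    // Loop that goes through the method call.
    static long viaCall(byte[] sa) {
        long sum = 0;
        for (int sp = 0; sp < sa.length; sp++) sum += decode(sa[sp]);
        return sum;
    }

    // Same loop with the method body inlined by hand.
    static long manual(byte[] sa) {
        long sum = 0;
        for (int sp = 0; sp < sa.length; sp++) sum += B2C_MAP[sa[sp] & 0xFF];
        return sum;
    }

    public static void main(String[] args) {
        byte[] sa = new byte[1 << 20];
        for (int i = 0; i < sa.length; i++) sa[i] = (byte) i;
        // Repeat several rounds so the JIT has a chance to compile both loops.
        for (int round = 0; round < 10; round++) {
            long t0 = System.nanoTime();
            long a = viaCall(sa);
            long t1 = System.nanoTime();
            long b = manual(sa);
            long t2 = System.nanoTime();
            System.out.printf("round %d: call %d us, manual %d us, equal=%b%n",
                    round, (t1 - t0) / 1000, (t2 - t1) / 1000, a == b);
        }
    }
}
```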
On 10.01.2009 19:39, John Rose wrote:
> On Jan 9, 2009, at 6:55 PM, Ulf Zibis wrote:
>
>> Do you see any chance, that HotSpot optimizer could be enhanced in
>> that way, because the loop, we are speaking about, is the central
>> loop in all charset coders of the JVM.
>>
>
> It's already in there, to some degree, but hindered somehow by the
> peepholing problem. See 'instruct loadUB' around line 6406 of:
> http://hg.openjdk.java.net/jdk7/hotspot/hotspot/file/tip/src/cpu/x86/vm/x86_32.ad
>
> What that does is, when it is time to "match" (or lower) ideal to
> machine nodes in the IR graph, if a suitable AndI and LoadB are
> adjacent, and if the LoadB is unshared, they are coalesced into a
> loadUB machine node.
>
> It would be a detailed debugging exercise to find out why, in the case
> of your code, that optimization does not appear to kick in.
>
> -- John