Short/Character.reverseBytes intrinsics
Ulf Zibis
Ulf.Zibis at gmx.de
Wed Apr 14 06:22:33 PDT 2010
Hi Roger,
I'm not sure I understand correctly. Does 'Atom' refer to an AMD/Intel
architecture?
I found movbe in the Intel instruction set.
-Ulf
On 14.04.2010 01:57, Ian Rogers wrote:
> Hi Ulf,
>
> movbe is an Atom-only instruction.
>
> Regards,
> Ian
>
> On 13 April 2010 13:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>> +1
>> I would like to see this enhancement.
>>
>> But we could do better on x86, since I guess those swap instructions
>> would usually be accompanied by a move:
>> 0x00b95d79: bswap %ebx ;*invokevirtual putInt
>> ...
>> 0x00b95d8d: mov %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>>
>> could be shorter:
>> ...
>> 0x00b95d8b: movbe %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>>
>> On char/short there could be an additional win:
>> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>> 0x00b8965d: bswap %edx
>> 0x00b8965f: shr $0x10,%edx
>> ...
>> 0x00b8966c: mov %dx,(%eax) ;*invokevirtual putChar
>> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>
>> ... but best would be:
>> 0x00b89667: movbe %dx,(%eax) ;*invokevirtual putChar
>> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>
>> The same applies to getInt and getChar/getShort.
>> I don't know about SPARC.
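The swap(x) one-liner quoted above can be checked against the library methods with a minimal Java sketch. The class and method names here are hypothetical and not taken from the patch; the sketch only shows that a 16-bit swap expressed through Integer.reverseBytes gives the same result as Character.reverseBytes and Short.reverseBytes, and says nothing about which machine code C2 emits for either form.

```java
// Sketch only: ReverseBytesSketch, swapChar and swapShort are hypothetical
// names used for illustration; they are not part of the patch.
class ReverseBytesSketch {
    // 16-bit swap expressed through the existing int intrinsic:
    // the two low bytes end up swapped in the upper half, so shift them down.
    static char swapChar(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // Same idea for short; sign extension of x only affects the bytes
    // that the shift discards.
    static short swapShort(short x) {
        return (short) (Integer.reverseBytes(x) >>> 16);
    }

    public static void main(String[] args) {
        // Exhaustively compare against the library methods for all 16-bit values.
        for (int i = 0; i <= 0xFFFF; i++) {
            if (swapChar((char) i) != Character.reverseBytes((char) i)
                    || swapShort((short) i) != Short.reverseBytes((short) i)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        System.out.println("all 65536 values agree");
    }
}
```

Whether the shorter form compiles to a single movbe is exactly the open question in this thread, so the sketch deliberately checks values only.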
>>
>> -Ulf
>>
>>
>> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>>
>>> Hi there,
>>>
>>> I'd like to contribute this patch that implements the intrinsics for
>>> Short/Character.reverseBytes (in C2):
>>>
>>> http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>>> (Patch 1)
>>>
>>> (Thanks to Chuck for reviewing it and creating the webrev on my behalf.)
>>>
>>> This adds new siblings for the existing Integer/Long.reverseBytes
>>> intrinsics. Note: I did my best for the sparc implementation
>>> (sparc.ad) but haven't been able to build or test it (I don't have
>>> access to a sparc machine.)
>>>
>>> An impact of this patch can be seen in the microbenchmark
>>> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
>>> Martin) with an experimental patch that lets DirectByteBuffer use
>>> those intrinsics (instead of simple Java implementations) on
>>> non-native endian operations:
>>>
>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>>> (Patch 2)
>>>
>>> This patch hasn't been checked in yet but is being worked on by Martin and
>>> Ulf.
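Patch 2 itself is not quoted in this thread, so the following is only a sketch of the idea it is described as using: a non-native-endian store is the same as swapping the value with reverseBytes and then doing a native-order store. The class and helper names are hypothetical, and the public ByteBuffer API stands in for whatever DirectByteBuffer does internally.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Sketch only: NonNativePutSketch and its methods are hypothetical names,
// not code from Patch 2 or from DirectByteBuffer.
class NonNativePutSketch {
    static ByteOrder nonNativeOrder() {
        return ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN
                ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
    }

    // Let the buffer perform the non-native stores itself.
    static byte[] viaBufferOrder(int i, char c) {
        ByteBuffer buf = ByteBuffer.allocateDirect(6).order(nonNativeOrder());
        buf.putInt(0, i).putChar(4, c);
        return copy(buf);
    }

    // Swap first, then do plain native-order stores.
    static byte[] viaReverseBytes(int i, char c) {
        ByteBuffer buf = ByteBuffer.allocateDirect(6).order(ByteOrder.nativeOrder());
        buf.putInt(0, Integer.reverseBytes(i)).putChar(4, Character.reverseBytes(c));
        return copy(buf);
    }

    // Copy the buffer contents out with absolute gets.
    static byte[] copy(ByteBuffer buf) {
        byte[] out = new byte[buf.capacity()];
        for (int k = 0; k < out.length; k++) {
            out[k] = buf.get(k);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] a = viaBufferOrder(0x12345678, (char) 0xCAFE);
        byte[] b = viaReverseBytes(0x12345678, (char) 0xCAFE);
        System.out.println(Arrays.equals(a, b)); // prints "true"
    }
}
```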
>>>
>>> The numbers from my measurements on 32-bit x86 follow. Note the
>>> numbers for BIG_ENDIAN.
>>>
>>> ----------Unmodified----------
>>> Method                    Millis  Ratio
>>> swap char BIG_ENDIAN          64  1.000
>>> swap char LITTLE_ENDIAN       31  0.492
>>> swap short BIG_ENDIAN         75  1.176
>>> swap short LITTLE_ENDIAN      31  0.496
>>> swap int BIG_ENDIAN           45  0.711
>>> swap int LITTLE_ENDIAN         8  0.125
>>> swap long BIG_ENDIAN          72  1.131
>>> swap long LITTLE_ENDIAN       17  0.277
>>>
>>> ----------Modified (with Patches 1 and 2)----------
>>> Method                    Millis  Ratio
>>> swap char BIG_ENDIAN          44  1.000
>>> swap char LITTLE_ENDIAN       31  0.709
>>> swap short BIG_ENDIAN         44  1.004
>>> swap short LITTLE_ENDIAN      31  0.708
>>> swap int BIG_ENDIAN           18  0.423
>>> swap int LITTLE_ENDIAN         8  0.180
>>> swap long BIG_ENDIAN          24  0.544
>>> swap long LITTLE_ENDIAN       17  0.400
>>>
>>> The speedups are clearly non-trivial. The speedup for int/long is due
>>> to the use of the existing Integer/Long.reverseBytes intrinsics in
>>> DirectByteBuffer (Patch 2). The speedup for short/char is due to the
>>> new Character/Short.reverseBytes intrinsics (Patch 1) being used in
>>> DirectByteBuffer (Patch 2).
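To put a number on the speedups, the BIG_ENDIAN rows of the two tables can be divided directly. This is a small sketch using only the Millis values quoted above; the class and method names are hypothetical. The factors come out at roughly 1.5x for char, 1.7x for short, 2.5x for int and 3.0x for long, with the usual caveat that the Millis values are rounded.

```java
// Sketch only: SpeedupSketch is a hypothetical helper; the millisecond
// values are copied from the BIG_ENDIAN rows of the two tables above.
class SpeedupSketch {
    // Speedup factor = time before the patches / time after the patches.
    static double speedup(int beforeMillis, int afterMillis) {
        return (double) beforeMillis / afterMillis;
    }

    public static void main(String[] args) {
        String[] types = {"char", "short", "int", "long"};
        int[] unmodified = {64, 75, 45, 72};
        int[] modified = {44, 44, 18, 24};
        for (int i = 0; i < types.length; i++) {
            System.out.println("swap " + types[i] + " BIG_ENDIAN: "
                    + speedup(unmodified[i], modified[i]) + "x");
        }
    }
}
```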
>>>
>>> Anyone willing to review it (Patch 1)?
>>>
>>> Thanks,
>>> Hiroshi
>>>
>>>
>>>
>>>
>>
>>
>
>
More information about the hotspot-compiler-dev
mailing list