Short/Character.reverseBytes intrinsics

Hiroshi Yamauchi yamauchi at google.com
Mon Apr 19 14:47:40 PDT 2010


It's a good point.

I'm just curious: is the movbe instruction supported only on the Atom
chip, or on all Intel processors released around or after 2008? What
about AMD?

Hiroshi

On Tue, Apr 13, 2010 at 1:50 PM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> +1
> I would like to see this enhancement.
>
> But we could do better on x86, as I guess those swap instructions would
> likely be accompanied by a move:
>  0x00b95d79: bswap  %ebx               ;*invokevirtual putInt
>  ...
>  0x00b95d8d: mov    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>
> could be shorter:
>  ...
>  0x00b95d8b: movbe    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>
> On char/short there could be an additional win:
> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>  0x00b8965d: bswap  %edx
>  0x00b8965f: shr    $0x10,%edx
>  ...
>  0x00b8966c: mov    %dx,(%eax)         ;*invokevirtual putChar
>                                        ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
> ... but best would be:
>  0x00b89667: movbe    %dx,(%eax)         ;*invokevirtual putChar
>                                        ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
> Same thoughts on getInt, getChar/Short.
> On SPARC I don't know.
>
> -Ulf
>
>
> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>>
>> Hi there,
>>
>> I'd like to contribute this patch that implements the intrinsics for
>> Short/Character.reverseBytes (in C2):
>>
>>   http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>>  (Patch 1)
>>
>> (Thanks to Chuck for reviewing it and creating the webrev on my behalf.)
>>
>> This adds new siblings for the existing Integer/Long.reverseBytes
>> intrinsics. Note: I did my best for the sparc implementation
>> (sparc.ad) but haven't been able to build or test it (I don't have
>> access to a sparc machine.)
>>
>> The impact of this patch can be seen in the microbenchmark
>> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
>> Martin) with an experimental patch that lets DirectByteBuffer use
>> those intrinsics (instead of simple Java implementations) on
>> non-native endian operations:
>>
>>   http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>>       (Patch 2)
>>
>> This patch hasn't been checked in yet but is being worked on by Martin and
>> Ulf.
>>
>> The numbers from my measurements on x86 32 bit follow. Note the
>> numbers for BIG_ENDIAN.
>>
>> ----------Unmodified----------
>> Method                   Millis Ratio
>> swap char BIG_ENDIAN         64 1.000
>> swap char LITTLE_ENDIAN      31 0.492
>> swap short BIG_ENDIAN        75 1.176
>> swap short LITTLE_ENDIAN     31 0.496
>> swap int BIG_ENDIAN          45 0.711
>> swap int LITTLE_ENDIAN        8 0.125
>> swap long BIG_ENDIAN         72 1.131
>> swap long LITTLE_ENDIAN      17 0.277
>>
>> ----------Modified (with Patches 1 and 2)----------
>> Method                   Millis Ratio
>> swap char BIG_ENDIAN         44 1.000
>> swap char LITTLE_ENDIAN      31 0.709
>> swap short BIG_ENDIAN        44 1.004
>> swap short LITTLE_ENDIAN     31 0.708
>> swap int BIG_ENDIAN          18 0.423
>> swap int LITTLE_ENDIAN        8 0.180
>> swap long BIG_ENDIAN         24 0.544
>> swap long LITTLE_ENDIAN      17 0.400
>>
>> The speedups are clearly non-trivial. The speedup for int/long is due
>> to the use of the existing Integer/Long.reverseBytes intrinsics in
>> DirectByteBuffer (Patch 2). The speedup for short/char is due to the
>> new Character/Short.reverseBytes intrinsics (Patch 1) being used in
>> DirectByteBuffer (Patch 2).
>>
>> Anyone willing to review it (Patch 1)?
>>
>> Thanks,
>> Hiroshi
>>
>>
>>
>
>
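
For readers following the thread, here is a minimal Java sketch of the
byte swaps discussed above. It is not taken from either patch: the class
name ReverseBytesSketch and its method names are made up for
illustration. It shows the Integer.reverseBytes(x) >>> 16 formulation
from Ulf's message, and a non-native-endian putChar/getChar round trip
on a direct buffer, which should agree with Character.reverseBytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Patch 1 or Patch 2.
public class ReverseBytesSketch {

    // Swap the two bytes of a char by reversing all four bytes of the
    // widened int and keeping the upper half (the form quoted in the thread).
    public static char swapChar(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // Same idea for short; the narrowing cast keeps the low 16 bits, so
    // sign extension of negative inputs does not affect the result.
    public static short swapShort(short x) {
        return (short) (Integer.reverseBytes(x) >>> 16);
    }

    // Write a char big-endian into a direct buffer and read it back
    // little-endian; the value read is the byte-swapped input.
    public static char swapThroughBuffer(char x) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2);
        buf.order(ByteOrder.BIG_ENDIAN).putChar(0, x);
        return buf.order(ByteOrder.LITTLE_ENDIAN).getChar(0);
    }

    public static void main(String[] args) {
        // Check every 16-bit value against the library methods.
        for (int i = 0; i <= 0xFFFF; i++) {
            char c = (char) i;
            short s = (short) i;
            if (swapChar(c) != Character.reverseBytes(c)
                    || swapShort(s) != Short.reverseBytes(s)
                    || swapThroughBuffer(c) != Character.reverseBytes(c)) {
                throw new AssertionError("mismatch at 0x" + Integer.toHexString(i));
            }
        }
        System.out.println("all 65536 values agree");
    }
}
```

Whether the JIT emits bswap plus a shift, or a single movbe, for any of
these methods depends on the VM build and the CPU; the sketch only
checks that the Java-level results are equal.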


More information about the hotspot-compiler-dev mailing list