Short/Character.reverseBytes intrinsics

Ulf Zibis Ulf.Zibis at gmx.de
Wed Apr 14 06:22:33 PDT 2010


Hi Ian,

I'm not sure I understand correctly. By 'Atom', do you mean an AMD/Intel
architecture?

I found movbe in the Intel instruction set.

-Ulf


On 14.04.2010 01:57, Ian Rogers wrote:
> Hi Ulf,
>
> movbe is an Atom-only instruction.
>
> Regards,
> Ian
>
> On 13 April 2010 13:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>> +1
>> I would like to see this enhancement.
>>
>> But we could do better on x86, since I guess those swap instructions are
>> usually accompanied by a move:
>>   0x00b95d79: bswap  %ebx               ;*invokevirtual putInt
>>   ...
>>   0x00b95d8d: mov    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>>
>> could be shorter:
>>   ...
>>   0x00b95d8b: movbe    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>>
>> On char/short there could be an additional win:
>> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>>   0x00b8965d: bswap  %edx
>>   0x00b8965f: shr    $0x10,%edx
>>   ...
>>   0x00b8966c: mov    %dx,(%eax)         ;*invokevirtual putChar
>>                                         ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>
>> ... but best would be:
>>   0x00b89667: movbe    %dx,(%eax)         ;*invokevirtual putChar
>>                                         ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>
>> The same applies to getInt and getChar/getShort.
>> I don't know about SPARC.
>>
>> -Ulf
>>
>>
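To make the char/short case above concrete, here is a minimal Java sketch, not taken from either patch, checking that the quoted idiom `(char)(Integer.reverseBytes(x) >>> 16)` gives the same result as `Character.reverseBytes(x)` for every char value. The class and method names are hypothetical and chosen only for illustration; whether the JIT emits bswap plus shr, or something shorter, depends on the VM and CPU.

```java
final class SwapCharSketch {
    // The idiom quoted above: reverse all four bytes of the widened int,
    // then shift the two swapped bytes back down into the low 16 bits.
    static char swapViaInt(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // The library method that the proposed intrinsic would cover.
    static char swapViaCharacter(char x) {
        return Character.reverseBytes(x);
    }

    public static void main(String[] args) {
        // Exhaustively compare both variants over the whole char range.
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (swapViaInt(c) != swapViaCharacter(c)) {
                throw new AssertionError("mismatch at 0x" + Integer.toHexString(i));
            }
        }
        // 0x1234 has its two bytes exchanged.
        System.out.println(Integer.toHexString(swapViaInt((char) 0x1234))); // prints 3412
    }
}
```

Because the two variants agree on all inputs, replacing the idiom with `Character.reverseBytes` changes only how the swap is compiled, not its result.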
>> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>>
>>> Hi there,
>>>
>>> I'd like to contribute this patch that implements the intrinsics for
>>> Short/Character.reverseBytes (in C2):
>>>
>>>    http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>>>   (Patch 1)
>>>
>>> (Thanks to Chuck for reviewing it and creating the webrev on my behalf.)
>>>
>>> This adds new siblings for the existing Integer/Long.reverseBytes
>>> intrinsics. Note: I did my best for the sparc implementation
>>> (sparc.ad) but haven't been able to build or test it (I don't have
>>> access to a sparc machine.)
>>>
>>> The impact of this patch can be seen in the microbenchmark
>>> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
>>> Martin) when combined with an experimental patch that lets
>>> DirectByteBuffer use those intrinsics (instead of simple Java
>>> implementations) for non-native-endian operations:
>>>
>>>    http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>>>        (Patch 2)
>>>
>>> This patch hasn't been checked in yet but is being worked on by Martin and
>>> Ulf.
>>>
>>> The numbers from my measurements on 32-bit x86 follow. Note the
>>> numbers for BIG_ENDIAN.
>>>
>>> ----------Unmodified----------
>>> Method                   Millis Ratio
>>> swap char BIG_ENDIAN         64 1.000
>>> swap char LITTLE_ENDIAN      31 0.492
>>> swap short BIG_ENDIAN        75 1.176
>>> swap short LITTLE_ENDIAN     31 0.496
>>> swap int BIG_ENDIAN          45 0.711
>>> swap int LITTLE_ENDIAN        8 0.125
>>> swap long BIG_ENDIAN         72 1.131
>>> swap long LITTLE_ENDIAN      17 0.277
>>>
>>> ----------Modified (with Patches 1 and 2)----------
>>> Method                   Millis Ratio
>>> swap char BIG_ENDIAN         44 1.000
>>> swap char LITTLE_ENDIAN      31 0.709
>>> swap short BIG_ENDIAN        44 1.004
>>> swap short LITTLE_ENDIAN     31 0.708
>>> swap int BIG_ENDIAN          18 0.423
>>> swap int LITTLE_ENDIAN        8 0.180
>>> swap long BIG_ENDIAN         24 0.544
>>> swap long LITTLE_ENDIAN      17 0.400
>>>
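>>> The speedups are substantial: with both patches, the BIG_ENDIAN times drop
>>> from 64 to 44 ms for char, 75 to 44 ms for short, 45 to 18 ms for int, and
>>> 72 to 24 ms for long, while the LITTLE_ENDIAN times are unchanged. The
>>> speedup for int/long is due to the use of the existing
>>> Integer/Long.reverseBytes intrinsics in DirectByteBuffer (Patch 2). The
>>> speedup for short/char requires both patches: the new
>>> Character/Short.reverseBytes intrinsics (Patch 1) and their use in
>>> DirectByteBuffer (Patch 2).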
>>>
>>> Anyone willing to review it (Patch 1)?
>>>
>>> Thanks,
>>> Hiroshi
>>>
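As a rough illustration of the non-native-endian operations measured above, the following Java sketch writes a char into a direct buffer in a chosen byte order and shows that a BIG_ENDIAN write stores the same bytes as a LITTLE_ENDIAN write of the byte-reversed value. The class and method names are hypothetical, and the sketch says nothing about how DirectByteBuffer implements the swap internally; it only demonstrates the observable behaviour that a reverseBytes-based implementation has to preserve.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class NonNativeOrderSketch {
    // Write one char at index 0 of a direct buffer using the given byte
    // order, then read back the two raw bytes in memory order.
    static byte[] putCharWithOrder(char value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2).order(order);
        buf.putChar(0, value);
        return new byte[] { buf.get(0), buf.get(1) };
    }

    public static void main(String[] args) {
        char value = (char) 0x1234;
        byte[] big = putCharWithOrder(value, ByteOrder.BIG_ENDIAN);
        // Swapping the bytes first and writing little-endian yields the
        // same memory layout as writing the original value big-endian.
        byte[] littleOfSwapped =
            putCharWithOrder(Character.reverseBytes(value), ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.equals(big, littleOfSwapped)); // prints true
        System.out.println("native order: " + ByteOrder.nativeOrder());
    }
}
```

On x86 the native order is LITTLE_ENDIAN, which is why only the BIG_ENDIAN rows in the tables need a byte swap.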



More information about the hotspot-compiler-dev mailing list