Short/Character.reverseBytes intrinsics

Ulf Zibis Ulf.Zibis at gmx.de
Wed Apr 14 11:29:20 PDT 2010


Ian, thanks for your explanation.

Where did you find this information about movbe?
I can't find any statement about it in the docs listed here:
http://developer.intel.com/products/processor/manuals/index.htm

-Ulf


On 14.04.2010 19:39, Ian Rogers wrote:
> Hi,
>
> by Atom I mean the Intel Atom netbook processor. No other processor
> provides movbe at this moment (AMD or Intel).
>
> Regards,
> Ian
>
> On 14 April 2010 06:22, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>    
>> Hi Roger,
>>
>> I'm not sure I understand correctly. Does 'Atom' refer to an AMD or Intel
>> architecture?
>>
>> I found movbe in the Intel instruction set.
>>
>> -Ulf
>>
>>
>> On 14.04.2010 01:57, Ian Rogers wrote:
>>      
>>> Hi Ulf,
>>>
>>> movbe is an Atom-only instruction.
>>>
>>> Regards,
>>> Ian
>>>
>>> On 13 April 2010 13:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>>>
>>>        
>>>> +1
>>>> I would like to see this enhancement.
>>>>
>>>> But we could do better on x86, since I guess those swap instructions
>>>> are usually accompanied by a move:
>>>>   0x00b95d79: bswap  %ebx               ;*invokevirtual putInt
>>>>   ...
>>>>   0x00b95d8d: mov    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>>>>
>>>> could be shorter:
>>>>   ...
>>>>   0x00b95d8b: movbe    %ebx,(%eax,%ecx,1)  ;*invokevirtual putInt
>>>>
>>>> On char/short there could be an additional win:
>>>> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>>>>   0x00b8965d: bswap  %edx
>>>>   0x00b8965f: shr    $0x10,%edx
>>>>   ...
>>>>   0x00b8966c: mov    %dx,(%eax)         ;*invokevirtual putChar
>>>>                                         ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>>>
>>>> ... but best would be:
>>>>   0x00b89667: movbe    %dx,(%eax)         ;*invokevirtual putChar
>>>>                                         ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>>>
>>>> The same applies to getInt and getChar/getShort.
>>>> I don't know about SPARC.
>>>>
>>>> -Ulf
>>>>
>>>>
>>>> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>>>>
>>>>          
>>>>> Hi there,
>>>>>
>>>>> I'd like to contribute this patch that implements the intrinsics for
>>>>> Short/Character.reverseBytes (in C2):
>>>>>
>>>>>    http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>>>>>   (Patch 1)
>>>>>
>>>>> (Thanks to Chuck for reviewing it and creating the webrev on my behalf.)
>>>>>
>>>>> This adds new siblings for the existing Integer/Long.reverseBytes
>>>>> intrinsics. Note: I did my best for the sparc implementation
>>>>> (sparc.ad) but haven't been able to build or test it (I don't have
>>>>> access to a sparc machine.)
>>>>>
>>>>> An impact of this patch can be seen in the microbenchmark
>>>>> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
>>>>> Martin) with an experimental patch that lets DirectByteBuffer use
>>>>> those intrinsics (instead of simple Java implementations) on
>>>>> non-native endian operations:
>>>>>
>>>>>    http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>>>>>        (Patch 2)
>>>>>
>>>>> This patch hasn't been checked in yet but is being worked on by Martin
>>>>> and Ulf.
>>>>>
>>>>> The numbers from my measurements on x86 32 bit follow. Note the
>>>>> numbers for BIG_ENDIAN.
>>>>>
>>>>> ----------Unmodified----------
>>>>> Method                   Millis Ratio
>>>>> swap char BIG_ENDIAN         64 1.000
>>>>> swap char LITTLE_ENDIAN      31 0.492
>>>>> swap short BIG_ENDIAN        75 1.176
>>>>> swap short LITTLE_ENDIAN     31 0.496
>>>>> swap int BIG_ENDIAN          45 0.711
>>>>> swap int LITTLE_ENDIAN        8 0.125
>>>>> swap long BIG_ENDIAN         72 1.131
>>>>> swap long LITTLE_ENDIAN      17 0.277
>>>>>
>>>>> ----------Modified (with Patches 1 and 2)----------
>>>>> Method                   Millis Ratio
>>>>> swap char BIG_ENDIAN         44 1.000
>>>>> swap char LITTLE_ENDIAN      31 0.709
>>>>> swap short BIG_ENDIAN        44 1.004
>>>>> swap short LITTLE_ENDIAN     31 0.708
>>>>> swap int BIG_ENDIAN          18 0.423
>>>>> swap int LITTLE_ENDIAN        8 0.180
>>>>> swap long BIG_ENDIAN         24 0.544
>>>>> swap long LITTLE_ENDIAN      17 0.400
>>>>>
>>>>> The speedups are clearly non-trivial. The speedup for int/long comes
>>>>> from using the existing Integer/Long.reverseBytes intrinsics in
>>>>> DirectByteBuffer (Patch 2). The speedup for short/char comes from
>>>>> using the new Character/Short.reverseBytes intrinsics (Patch 1) in
>>>>> DirectByteBuffer (Patch 2).
>>>>>
>>>>> Anyone willing to review it (Patch 1)?
>>>>>
>>>>> Thanks,
>>>>> Hiroshi
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>            
>>>>
>>>>          
>>>
>>>        
>>
>>      
>
>    
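
For reference, the char/short swap idiom quoted above can be checked
against the library methods that Patch 1 proposes to intrinsify. This is
only a minimal sketch, not code from either patch: the class name
SwapSketch and its helper methods are made up for illustration, and it
says nothing about which instructions C2 actually emits. The last helper
shows the byte layout a putChar has to produce for each byte order, which
is where the swap on non-native-endian buffers comes from.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Patch 1 or Patch 2.
class SwapSketch {

    // The idiom quoted in the thread: reverse all four bytes of the
    // widened int, then shift the two interesting bytes back down.
    static char swapViaInt(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // The library method for the same operation on char.
    static char swapDirect(char x) {
        return Character.reverseBytes(x);
    }

    // Same idiom for short; the sign-extension bytes end up in the low
    // half after the reverse and are discarded by the shift.
    static short swapShortViaInt(short x) {
        return (short) (Integer.reverseBytes(x) >>> 16);
    }

    // Writes c into a 2-byte direct buffer using the given byte order and
    // returns the byte stored at offset 0 as an unsigned value.
    static int firstByte(ByteOrder order, char c) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2).order(order);
        buf.putChar(0, c);
        return buf.get(0) & 0xFF;
    }

    public static void main(String[] args) {
        char c = (char) 0x1234;
        System.out.printf("idiom:   %04x%n", (int) swapViaInt(c));
        System.out.printf("library: %04x%n", (int) swapDirect(c));
        System.out.printf("BIG_ENDIAN first byte:    %02x%n",
                firstByte(ByteOrder.BIG_ENDIAN, c));
        System.out.printf("LITTLE_ENDIAN first byte: %02x%n",
                firstByte(ByteOrder.LITTLE_ENDIAN, c));
        System.out.println("native order: " + ByteOrder.nativeOrder());
    }
}
```

Whichever of the two orders differs from ByteOrder.nativeOrder() is the
one that needs the byte swap before the store.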