Short/Character.reverseBytes intrinsics
Chuck Rasbold
rasbold at google.com
Wed Apr 14 11:58:49 PDT 2010
See the document "Intel® 64 and IA-32 Architectures Software Developer's
Manual Volume 2A: Instruction Set Reference, A-M"
The section on movbe (currently page 732) indicates that the #UD exception
is raised if CPUID.01H:ECX.MOVBE[bit 22] = 0.
Additionally, the CPUID section (table 3-15, page 261) lists all the bit
numbers and specifically the MOVBE bit.
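
As a minimal sketch of the check the manual describes, the snippet below
tests bit 22 of the ECX value returned by CPUID leaf 01H. Java has no direct
access to CPUID, so the ECX value is passed in as a parameter; the class and
method names are hypothetical and the sample values are made up for
illustration, not read from real hardware.

```java
class MovbeCheck {
    // Bit position of the MOVBE feature flag in CPUID.01H:ECX, per the manual cited above.
    static final int MOVBE_BIT = 22;

    // Returns true if the given CPUID.01H:ECX value has the MOVBE bit set.
    // How the ECX value is obtained (native code, VM internals) is outside this sketch.
    static boolean supportsMovbe(int cpuid1Ecx) {
        return (cpuid1Ecx & (1 << MOVBE_BIT)) != 0;
    }

    public static void main(String[] args) {
        int withMovbe = 1 << MOVBE_BIT; // illustrative ECX value with the bit set
        int withoutMovbe = 0;           // illustrative ECX value with the bit clear
        System.out.println(supportsMovbe(withMovbe));
        System.out.println(supportsMovbe(withoutMovbe));
    }
}
```

A code generator would presumably only emit movbe when such a check
succeeds, and fall back to bswap plus mov otherwise.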
-- Chuck
On Wed, Apr 14, 2010 at 11:29 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> Ian, thanks for your explanation.
>
> Where did you find this information about movbe?
> I can't find any statement about this in the docs listed here:
> http://developer.intel.com/products/processor/manuals/index.htm
>
> -Ulf
>
>
> On 14.04.2010 19:39, Ian Rogers wrote:
>
>> Hi,
>>
>> by Atom I mean the Intel Atom netbook processor. No other processor
>> provides movbe at this moment (AMD or Intel).
>>
>> Regards,
>> Ian
>>
>> On 14 April 2010 06:22, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>>
>>
>>
>>> Hi Roger,
>>>
>>> not sure if I understand correctly. Is 'Atom' meant as an AMD/Intel
>>> architecture?
>>>
>>> I found movbe in Intel's instruction set.
>>>
>>> -Ulf
>>>
>>>
>>> On 14.04.2010 01:57, Ian Rogers wrote:
>>>
>>>
>>>> Hi Ulf,
>>>>
>>>> movbe is an Atom only instruction.
>>>>
>>>> Regards,
>>>> Ian
>>>>
>>>> On 13 April 2010 13:50, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>>>>
>>>>
>>>>
>>>>> +1
>>>>> I would like to see this enhancement.
>>>>>
>>>>> But we could do better on x86, as I guess those swap instructions would
>>>>> likely be accompanied by a move:
>>>>> 0x00b95d79: bswap %ebx ;*invokevirtual putInt
>>>>> ...
>>>>> 0x00b95d8d: mov %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>>>>>
>>>>> could be shorter:
>>>>> ...
>>>>> 0x00b95d8b: movbe %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>>>>>
>>>>> On char/short there could be an additional win:
>>>>> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>>>>> 0x00b8965d: bswap %edx
>>>>> 0x00b8965f: shr $0x10,%edx
>>>>> ...
>>>>> 0x00b8966c: mov %dx,(%eax) ;*invokevirtual putChar
>>>>> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>>>>
>>>>> ... but best would be:
>>>>> 0x00b89667: movbe %dx,(%eax) ;*invokevirtual putChar
>>>>> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>>>>>
>>>>> The same applies to getInt and getChar/getShort.
>>>>> I don't know about SPARC.
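
The swap(x) idiom quoted above can be checked against the library method
with a short sketch. It only shows that the shift-based form and
Character.reverseBytes agree on every char value; it says nothing about
which machine instructions the JIT emits. The class and method names are
hypothetical.

```java
class SwapSketch {
    // Byte swap of a char via the int swap plus a shift, as in the quoted swap(x).
    static char swapViaInt(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // Returns true if the shift-based swap matches Character.reverseBytes for every char.
    static boolean matchesForAllChars() {
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (swapViaInt(c) != Character.reverseBytes(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesForAllChars());
    }
}
```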
>>>>>
>>>>> -Ulf
>>>>>
>>>>>
>>>>> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I'd like to contribute this patch that implements the intrinsics for
>>>>>> Short/Character.reverseBytes (in C2):
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>>>>>> (Patch 1)
>>>>>>
>>>>>> (Thanks to Chuck for reviewing it and creating the webrev on my
>>>>>> behalf.)
>>>>>>
>>>>>> This adds new siblings for the existing Integer/Long.reverseBytes
>>>>>> intrinsics. Note: I did my best for the sparc implementation
>>>>>> (sparc.ad) but haven't been able to build or test it (I don't have
>>>>>> access to a sparc machine.)
>>>>>>
>>>>>> An impact of this patch can be seen in the microbenchmark
>>>>>> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
>>>>>> Martin) with an experimental patch that lets DirectByteBuffer use
>>>>>> those intrinsics (instead of simple Java implementations) on
>>>>>> non-native endian operations:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>>>>>> (Patch 2)
>>>>>>
>>>>>> This patch hasn't been checked in yet but is being worked on by Martin
>>>>>> and Ulf.
>>>>>>
>>>>>> The numbers from my measurements on x86 32 bit follow. Note the
>>>>>> numbers for BIG_ENDIAN.
>>>>>>
>>>>>> ----------Unmodified----------
>>>>>> Method                      Millis  Ratio
>>>>>> swap char BIG_ENDIAN            64  1.000
>>>>>> swap char LITTLE_ENDIAN         31  0.492
>>>>>> swap short BIG_ENDIAN           75  1.176
>>>>>> swap short LITTLE_ENDIAN        31  0.496
>>>>>> swap int BIG_ENDIAN             45  0.711
>>>>>> swap int LITTLE_ENDIAN           8  0.125
>>>>>> swap long BIG_ENDIAN            72  1.131
>>>>>> swap long LITTLE_ENDIAN         17  0.277
>>>>>>
>>>>>> ----------Modified (with Patches 1 and 2)----------
>>>>>> Method                      Millis  Ratio
>>>>>> swap char BIG_ENDIAN            44  1.000
>>>>>> swap char LITTLE_ENDIAN         31  0.709
>>>>>> swap short BIG_ENDIAN           44  1.004
>>>>>> swap short LITTLE_ENDIAN        31  0.708
>>>>>> swap int BIG_ENDIAN             18  0.423
>>>>>> swap int LITTLE_ENDIAN           8  0.180
>>>>>> swap long BIG_ENDIAN            24  0.544
>>>>>> swap long LITTLE_ENDIAN         17  0.400
>>>>>>
>>>>>> The BIG_ENDIAN speedups are clearly non-trivial (for example, char
>>>>>> drops from 64 to 44 ms and long from 72 to 24 ms). The speedup for
>>>>>> int/long is due to the use of the existing Integer/Long.reverseBytes
>>>>>> intrinsics in DirectByteBuffer (Patch 2). The speedup for short/char
>>>>>> needs both patches: the new Character/Short.reverseBytes intrinsics
>>>>>> (Patch 1) and their use in DirectByteBuffer (Patch 2).
>>>>>>
>>>>>> Anyone willing to review it (Patch 1)?
>>>>>>
>>>>>> Thanks,
>>>>>> Hiroshi