Short/Character.reverseBytes intrinsics

Ulf Zibis Ulf.Zibis at gmx.de
Thu Apr 15 02:46:44 PDT 2010


Many thanks, Chuck; that was not easy to find.

-Ulf


On 14.04.2010 20:58, Chuck Rasbold wrote:
> See the document "Intel® 64 and IA-32 Architectures Software 
> Developer's Manual Volume 2A: Instruction Set Reference, A-M"
>
>
>
>       The section on movbe (currently page 732) indicates that the #UD
>       exception is raised if CPUID.01H:ECX.MOVBE[bit 22] = 0.
>
>
> Additionally, the CPUID section (table 3-15, page 261) lists all the 
> bit numbers and specifically the MOVBE bit.
>
> -- Chuck
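
The feature test Chuck points to comes down to checking one bit. The sketch below is illustrative only: Java cannot execute CPUID itself, so the ECX value is assumed to have been obtained elsewhere (for example by native code), and the class and method names are hypothetical.

```java
class MovbeCheck {
    // Bit position of the MOVBE flag in ECX after CPUID with EAX = 01H,
    // as quoted from the Intel manual in this thread.
    static final int MOVBE_BIT = 22;

    // Returns true if the given ECX value reports MOVBE support.
    // The caller is assumed to supply ECX from a real CPUID query.
    static boolean supportsMovbe(int ecx) {
        return (ecx & (1 << MOVBE_BIT)) != 0;
    }

    public static void main(String[] args) {
        int ecxWithMovbe = 1 << MOVBE_BIT; // hypothetical ECX value, bit 22 set
        int ecxWithoutMovbe = 0;           // hypothetical ECX value, bit 22 clear
        System.out.println(supportsMovbe(ecxWithMovbe));    // true
        System.out.println(supportsMovbe(ecxWithoutMovbe)); // false
    }
}
```

A code generator would only emit movbe when such a check succeeds, since the instruction raises #UD otherwise.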
>
> On Wed, Apr 14, 2010 at 11:29 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>     Ian, thanks for your explanation.
>
>     Where did you find this information about movbe?
>     I can't find any statement about this in the docs, listed here:
>     http://developer.intel.com/products/processor/manuals/index.htm
>
>     -Ulf
>
>
>     On 14.04.2010 19:39, Ian Rogers wrote:
>
>         Hi,
>
>         By Atom I mean the Intel Atom netbook processor. No other
>         processor (AMD or Intel) provides movbe at this moment.
>
>         Regards,
>         Ian
>
>         On 14 April 2010 06:22, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
>             Hi Roger,
>
>             I'm not sure I understand correctly. Is 'Atom' meant as an
>             AMD/Intel architecture?
>
>             I found movbe in Intel's instruction set.
>
>             -Ulf
>
>
>             On 14.04.2010 01:57, Ian Rogers wrote:
>
>                 Hi Ulf,
>
>                 movbe is an Atom only instruction.
>
>                 Regards,
>                 Ian
>
>                 On 13 April 2010 13:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
>                     +1
>                     I would like to see this enhancement.
>
>                     But we could do better on x86, as I guess those
>                     swap instructions would likely be accompanied by
>                     a move:
>                      0x00b95d79: bswap  %ebx               ;*invokevirtual putInt
>                      ...
>                      0x00b95d8d: mov    %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>
>                     This could be shorter:
>                      ...
>                      0x00b95d8b: movbe  %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
>
>                     On char/short there could be an additional win:
>                     swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
>                      0x00b8965d: bswap  %edx
>                      0x00b8965f: shr    $0x10,%edx
>                      ...
>                      0x00b8966c: mov    %dx,(%eax)         ;*invokevirtual putChar
>                                                            ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
>                     ... but best would be:
>                      0x00b89667: movbe  %dx,(%eax)         ;*invokevirtual putChar
>                                                            ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
>                     The same applies to getInt and getChar/getShort.
>                     I don't know about SPARC.
>
>                     -Ulf
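
The char swap quoted above (reverse all four bytes of the int, then keep the upper half) can be checked against Character.reverseBytes at the Java level. This is a minimal sketch with hypothetical class and method names; it only demonstrates that the formulations agree and says nothing about which machine instructions the JIT actually emits.

```java
class SwapEquivalence {
    // Formulation from the message above: bswap on the widened value,
    // then shift the swapped bytes down from the upper half.
    static char swapViaInt(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    // Plain shift-and-mask swap of the two bytes, for comparison.
    static char swapViaShifts(char x) {
        return (char) (((x & 0xFF) << 8) | ((x >> 8) & 0xFF));
    }

    public static void main(String[] args) {
        // Exhaustively compare both formulations with the library method.
        for (int i = 0; i <= 0xFFFF; i++) {
            char c = (char) i;
            char expected = Character.reverseBytes(c);
            if (swapViaInt(c) != expected || swapViaShifts(c) != expected) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        System.out.println("all 65536 char values agree");
    }
}
```

A 16-bit movbe store would fold the swap and the store into one instruction, which is the saving suggested in the listing above.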
>
>
>                     On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>
>
>                         Hi there,
>
>                         I'd like to contribute this patch that
>                         implements the intrinsics for
>                         Short/Character.reverseBytes (in C2):
>
>                         http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
>                          (Patch 1)
>
>                         (Thanks to Chuck for reviewing it and creating
>                         the webrev on my behalf.)
>
>                         This adds new siblings for the existing
>                         Integer/Long.reverseBytes
>                         intrinsics. Note: I did my best for the sparc
>                         implementation (sparc.ad) but haven't been able
>                         to build or test it (I don't have access to a
>                         sparc machine).
>
>                         An impact of this patch can be seen in the
>                         microbenchmark
>                         jdk/test/java/nio/Buffer/SwapMicroBenchmark
>                         (which was written by
>                         Martin) with an experimental patch that lets
>                         DirectByteBuffer use
>                         those intrinsics (instead of simple Java
>                         implementations) on
>                         non-native endian operations:
>
>                         http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
>                               (Patch 2)
>
>                         This patch hasn't been checked in yet but is
>                         being worked on by Martin and Ulf.
>
>                         The numbers from my measurements on x86 32 bit
>                         follow. Note the
>                         numbers for BIG_ENDIAN.
>
>                         ----------Unmodified----------
>                         Method                   Millis Ratio
>                         swap char BIG_ENDIAN         64 1.000
>                         swap char LITTLE_ENDIAN      31 0.492
>                         swap short BIG_ENDIAN        75 1.176
>                         swap short LITTLE_ENDIAN     31 0.496
>                         swap int BIG_ENDIAN          45 0.711
>                         swap int LITTLE_ENDIAN        8 0.125
>                         swap long BIG_ENDIAN         72 1.131
>                         swap long LITTLE_ENDIAN      17 0.277
>
>                         ----------Modified (with Patches 1 and 2)----------
>                         Method                   Millis Ratio
>                         swap char BIG_ENDIAN         44 1.000
>                         swap char LITTLE_ENDIAN      31 0.709
>                         swap short BIG_ENDIAN        44 1.004
>                         swap short LITTLE_ENDIAN     31 0.708
>                         swap int BIG_ENDIAN          18 0.423
>                         swap int LITTLE_ENDIAN        8 0.180
>                         swap long BIG_ENDIAN         24 0.544
>                         swap long LITTLE_ENDIAN      17 0.400
>
>                         The speedups are clearly non-trivial. The
>                         speedup for int/long is due
>                         to the use of the existing
>                         Integer/Long.reverseBytes intrinsics in
>                         DirectByteBuffer (Patch 2). The speedup for
>                         short/char is due to the new
>                         Character/Short.reverseBytes intrinsics
>                         (Patch 1) being used in DirectByteBuffer
>                         (Patch 2).
>
>                         Anyone willing to review it (Patch 1)?
>
>                         Thanks,
>                         Hiroshi
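
The idea of routing non-native-endian buffer operations through reverseBytes can be sketched in plain Java. This is not the code from Patch 2; the class and method names are hypothetical, and the example only shows the swap-then-store pattern that the intrinsics are meant to speed up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NonNativePut {
    // Stores value at index so that its bytes appear in the wanted order.
    // When the wanted order differs from the buffer's configured order,
    // the value is byte-swapped first and then stored as-is.
    static void putShortInOrder(ByteBuffer buf, int index, short value, ByteOrder wanted) {
        short toStore = (wanted == buf.order()) ? value : Short.reverseBytes(value);
        buf.putShort(index, toStore);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(2).order(ByteOrder.nativeOrder());
        putShortInOrder(buf, 0, (short) 0x1234, ByteOrder.BIG_ENDIAN);
        // Big-endian layout puts the high byte first, whatever the native order.
        System.out.printf("%02x %02x%n", buf.get(0), buf.get(1)); // prints "12 34"
    }
}
```

With an intrinsic for Short.reverseBytes, the swap in the non-native branch compiles to a short instruction sequence rather than a chain of shifts and masks, which matches the BIG_ENDIAN rows improving in the table above while the LITTLE_ENDIAN rows stay the same.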
>
More information about the hotspot-compiler-dev mailing list