Short/Character.reverseBytes intrinsics
Ulf Zibis
Ulf.Zibis at gmx.de
Thu Apr 15 02:46:44 PDT 2010
Many thanks, Chuck. That was not easy to find.
-Ulf
On 14.04.2010 20:58, Chuck Rasbold wrote:
> See the document "Intel® 64 and IA-32 Architectures Software
> Developer's Manual Volume 2A: Instruction Set Reference, A-M"
>
>
>
> The section on movbe (currently page 732) indicates that the #UD
> exception is raised if CPUID.01H:ECX.MOVBE[bit 22] = 0.
>
>
> Additionally, the CPUID section (table 3-15, page 261) lists all the
> bit numbers and specifically the MOVBE bit.
>
> -- Chuck
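
The bit test Chuck describes can be sketched in Java as follows. This is only an illustration: Java has no direct access to the CPUID instruction, so the ECX value returned by CPUID leaf 01H is assumed to have been obtained elsewhere (for example inside the VM), and the class and method names are hypothetical.

```java
class MovbeBitSketch {
    // Bit position of the MOVBE feature flag in CPUID.01H:ECX, as quoted above.
    static final int MOVBE_BIT = 22;

    // cpuid01hEcx is a hypothetical input: the ECX register value after
    // executing CPUID with EAX = 01H, supplied by some native helper.
    static boolean hasMovbe(int cpuid01hEcx) {
        return ((cpuid01hEcx >>> MOVBE_BIT) & 1) != 0;
    }

    public static void main(String[] args) {
        System.out.println(hasMovbe(1 << MOVBE_BIT)); // flag set
        System.out.println(hasMovbe(0));              // flag clear
    }
}
```

If the flag is clear, executing movbe raises #UD, so a compiler would have to fall back to the bswap plus mov sequence.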
>
> On Wed, Apr 14, 2010 at 11:29 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
> Ian, thanks for your explanation.
>
> Where did you find this information about movbe?
> I can't find any statement about this in the docs listed here:
> http://developer.intel.com/products/processor/manuals/index.htm
>
> -Ulf
>
>
> On 14.04.2010 19:39, Ian Rogers wrote:
>
> Hi,
>
> By Atom I mean the Intel Atom netbook processor. No other processor
> (AMD or Intel) provides movbe at this moment.
>
> Regards,
> Ian
>
> On 14 April 2010 06:22, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
> Hi Roger,
>
> I'm not sure I understand correctly. Does 'Atom' refer to an
> AMD/Intel architecture?
>
> I found movbe in Intel's instruction set.
>
> -Ulf
>
>
> On 14.04.2010 01:57, Ian Rogers wrote:
>
> Hi Ulf,
>
> movbe is an Atom-only instruction.
>
> Regards,
> Ian
>
> On 13 April 2010 13:50, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>
>
> +1
> I would like to see this enhancement.
>
> But we could do better on x86, since I guess those swap
> instructions are usually accompanied by a move:
> 0x00b95d79: bswap %ebx
> ;*invokevirtual putInt
> ...
> 0x00b95d8d: mov %ebx,(%eax,%ecx,1)
> ;*invokevirtual putInt
>
> could be shorter:
> ...
> 0x00b95d8b: movbe %ebx,(%eax,%ecx,1)
> ;*invokevirtual putInt
>
> On char/short there could be an additional win:
> swap(x) { return (char)(Integer.reverseBytes(x) >>> 16); }:
> 0x00b8965d: bswap %edx
> 0x00b8965f: shr $0x10,%edx
> ...
> 0x00b8966c: mov %dx,(%eax)
> ;*invokevirtual putChar
> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
> ... but best would be:
> 0x00b89667: movbe %dx,(%eax)
> ;*invokevirtual putChar
> ; - java.nio.DirectByteBuffer::putChar@30 (line 482)
>
> The same applies to getInt and getChar/getShort.
> I don't know about SPARC.
>
> -Ulf
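
The char swap Ulf writes in shorthand can be checked against the library method with a small Java sketch. The class and method names here are hypothetical; only Integer.reverseBytes and Character.reverseBytes come from the JDK. It says nothing about which machine instructions the JIT emits, only that the two formulations compute the same value.

```java
class CharSwapSketch {
    // Ulf's formulation: reverse all four bytes of the widened int,
    // then keep the upper 16 bits, which now hold the swapped char.
    static char swapViaInt(char x) {
        return (char) (Integer.reverseBytes(x) >>> 16);
    }

    public static void main(String[] args) {
        // Exhaustively compare against Character.reverseBytes.
        for (int i = 0; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (swapViaInt(c) != Character.reverseBytes(c)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        // prints "1234 -> 3412"
        System.out.printf("%04x -> %04x%n", 0x1234, (int) swapViaInt((char) 0x1234));
    }
}
```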
>
>
> On 13.04.2010 21:11, Hiroshi Yamauchi wrote:
>
>
> Hi there,
>
> I'd like to contribute this patch that implements the intrinsics for
> Short/Character.reverseBytes (in C2):
>
> http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.01/
> (Patch 1)
>
> (Thanks to Chuck for reviewing it and creating the webrev on my behalf.)
>
> This adds new siblings for the existing Integer/Long.reverseBytes
> intrinsics. Note: I did my best for the sparc implementation (sparc.ad)
> but haven't been able to build or test it (I don't have access to a
> sparc machine.)
>
> An impact of this patch can be seen in the microbenchmark
> jdk/test/java/nio/Buffer/SwapMicroBenchmark (which was written by
> Martin) with an experimental patch that lets DirectByteBuffer use
> those intrinsics (instead of simple Java implementations) on
> non-native endian operations:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/nioBits.java/
> (Patch 2)
>
> This patch hasn't been checked in yet but is being worked on by Martin
> and Ulf.
>
> The numbers from my measurements on x86 32 bit follow. Note the
> numbers for BIG_ENDIAN.
>
> ----------Unmodified----------
> Method                      Millis  Ratio
> swap char BIG_ENDIAN            64  1.000
> swap char LITTLE_ENDIAN         31  0.492
> swap short BIG_ENDIAN           75  1.176
> swap short LITTLE_ENDIAN        31  0.496
> swap int BIG_ENDIAN             45  0.711
> swap int LITTLE_ENDIAN           8  0.125
> swap long BIG_ENDIAN            72  1.131
> swap long LITTLE_ENDIAN         17  0.277
>
> ----------Modified (with Patches 1 and 2)----------
> Method                      Millis  Ratio
> swap char BIG_ENDIAN            44  1.000
> swap char LITTLE_ENDIAN         31  0.709
> swap short BIG_ENDIAN           44  1.004
> swap short LITTLE_ENDIAN        31  0.708
> swap int BIG_ENDIAN             18  0.423
> swap int LITTLE_ENDIAN           8  0.180
> swap long BIG_ENDIAN            24  0.544
> swap long LITTLE_ENDIAN         17  0.400
>
> The speedups are clearly non-trivial. The speedup for int/long is due
> to the use of the existing Integer/Long.reverseBytes intrinsics in
> DirectByteBuffer (Patch 2). The speedup for short/char is due to the
> new Character/Short.reverseBytes intrinsics (Patch 1) being used in
> DirectByteBuffer (Patch 2).
>
> Anyone willing to review it (Patch 1)?
>
> Thanks,
> Hiroshi
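
The idea behind a non-native endian write, as Hiroshi describes it, can be sketched with the public ByteBuffer API. This is not the code from Patch 2; it is an illustration with hypothetical class and method names, and it fixes the "native" order to LITTLE_ENDIAN (as on x86) so that the result does not depend on the machine it runs on.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class NonNativeSwapSketch {
    // Big-endian write emulated by hand: swap the bytes first, then
    // store in little-endian order (the native order on x86).
    static byte[] putCharSwapped(char c) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2).order(ByteOrder.LITTLE_ENDIAN);
        buf.putChar(0, Character.reverseBytes(c));
        return new byte[] { buf.get(0), buf.get(1) };
    }

    // Reference: let the buffer perform the big-endian write itself.
    static byte[] putCharBigEndian(char c) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2).order(ByteOrder.BIG_ENDIAN);
        buf.putChar(0, c);
        return new byte[] { buf.get(0), buf.get(1) };
    }

    public static void main(String[] args) {
        char c = (char) 0x1234;
        // Both paths leave the bytes 0x12, 0x34 in memory.
        System.out.println(Arrays.equals(putCharSwapped(c), putCharBigEndian(c)));
    }
}
```

With the intrinsic in place, the Character.reverseBytes call on the first path can compile down to a single swap instruction instead of a sequence of shifts and masks.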
>
More information about the hotspot-compiler-dev
mailing list