Swap should be better done native?
Ulf Zibis
Ulf.Zibis at gmx.de
Thu Apr 1 12:26:05 UTC 2010
On 31.03.2010 22:54, Martin Buchholz wrote:
> On Wed, Mar 31, 2010 at 13:41, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>
>> You remember the UTF-8 twiddling:
>>
>> On 16.03.2010 22:51, Ulf Zibis wrote:
>>
>>> On 16.03.2010 21:57, Martin Buchholz wrote:
>>>
>>>> On Tue, Mar 16, 2010 at 12:48, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>>>>
>>>>>>> 8-bit shift + compare would allow HotSpot to compile to smart 1-byte
>>>>>>> immediate op-codes.
>>>>>>> In encodeBufferLoop() you could use putChar(), putInt() instead of put().
>>>>>>> Should perform better.
>>>>>>>
>>>>>>>
>>>>>> I'm not convinced. You would need to assemble bytes into an
>>>>>> int, and then break them apart into bytes on the other side?
>>>>>>
>>>>>>
>>>>> Some time ago, I disassembled such code. I could see that the int was
>>>>> copied directly to memory by one 32-bit move instruction.
>>>>> In case of using put(byte), I saw 4 8-bit move instructions.
>>>>>
>>>> Ulf, I'd like to understand this better.
>>>>
>>>> How are you generating the machine code
>>>> (pointer to docs?)?
>>>>
>>> I must prepare it. Takes some time.
>>>
>> Now you can see that putInt(int) is done at once, so it is faster than 4 put(byte) calls.
>>
> Leider versteh ich dass immer noch nicht. [Unfortunately, I still don't understand that.]
>
Thanks for some little German. :-D
> Ich weiss auch noch nicht [I also don't know yet]
> how to generate disassembled code.
>
Sorry about my flippancy/ignorance.
I started here: http://wikis.sun.com/display/HotSpotInternals/PrintAssembly
On Windows I use an hsdis-i386.dll compiled by Andreas Schösser and
provided by Volker Simonis <volker.simonis at gmail.com>.
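For anyone who wants to reproduce such listings, here is a minimal sketch. It assumes a HotSpot JVM with the hsdis disassembler plugin on its library path. The class HotLoop and the method fill() are hypothetical names, chosen only to keep a putInt() call hot enough for the JIT compiler to pick it up.

```java
import java.nio.ByteBuffer;

// Hypothetical harness (not from the original mail). Run it, for example, with
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly HotLoop
// The assembly is only printed if the hsdis plugin can be loaded by the JVM.
public class HotLoop {

    // Fills the buffer with ints 'rounds' times and returns the number of bytes written.
    static int fill(ByteBuffer dst, int rounds) {
        int written = 0;
        for (int i = 0; i < rounds; i++) {
            dst.clear();                    // reset position to 0, limit to capacity
            while (dst.remaining() >= 4) {
                dst.putInt(i);              // the call whose machine code we want to see
                written += 4;
            }
        }
        return written;
    }

    public static void main(String[] args) {
        ByteBuffer dst = ByteBuffer.allocateDirect(4096);
        // Enough iterations that the JIT compiler should compile fill().
        System.out.println(fill(dst, 20000));
    }
}
```

The printed listing is long; searching it for the method name (here fill) is usually the quickest way to find the relevant part.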
> How can 4 put(byte) be converted into one put(int)?
>
See the following code snippets ...
===================================================
Code snippet from EUC_TW$Encoder:
dst.put(SS2);
dst.put((byte)(0xa0 | p));
dst.put((byte)(db >> 8));
dst.put((byte)db);
becomes (124 bytes):
0x00b94ab2: mov %edx,%ebx ;*invokevirtual put
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ab4: mov 0x14(%ebx),%ecx ;*getfield position
    ; - java.nio.Buffer::nextPutIndex@1 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ab7: mov 0x18(%ebx),%edx ;*getfield limit
    ; - java.nio.Buffer::nextPutIndex@5 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94aba: cmp %edx,%ecx
0x00b94abc: jge 0x00b94b79 ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ac2: mov %ebx,0x14(%esp)
0x00b94ac6: mov %ebx,%esi
0x00b94ac8: mov 0x8(%esi),%ebp
0x00b94acb: mov 0xc(%esi),%edi
0x00b94ace: mov 0x8(%esp),%eax
0x00b94ad2: or $0xa0,%eax ;*ior
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@107 (line 267)
0x00b94ad8: mov %eax,0x8(%esp)
0x00b94adc: mov %ebp,%eax
0x00b94ade: mov %ecx,%ebx
0x00b94ae0: inc %ebx ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ae1: mov %ebx,0x14(%esi) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ae4: add %ecx,%eax
0x00b94ae6: movb $0x8e,(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@97 (line 266)
0x00b94ae9: cmp %edx,%ebx
0x00b94aeb: jge 0x00b94b8d ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@109 (line 267)
0x00b94af1: mov 0x8(%esp),%ebx
0x00b94af5: mov %bl,0x1(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@109 (line 267)
0x00b94af8: mov %ecx,%ebx
0x00b94afa: add $0x2,%ebx ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@109 (line 267)
0x00b94afd: mov %ebx,0x14(%esi) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@109 (line 267)
0x00b94b00: cmp %edx,%ebx
0x00b94b02: jge 0x00b94b9d ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@119 (line 268)
0x00b94b08: mov 0x18(%esp),%ebx
0x00b94b0c: mov %bl,0x2(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@119 (line 268)
0x00b94b0f: mov %ecx,%ebx
0x00b94b11: add $0x3,%ebx ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@119 (line 268)
0x00b94b14: mov %esi,%ebp
0x00b94b16: mov %ebx,0x14(%ebp) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@119 (line 268)
0x00b94b19: cmp %edx,%ebx
0x00b94b1b: jge 0x00b94ba9 ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@126 (line 269)
0x00b94b21: mov 0x4(%esp),%edx
0x00b94b25: mov %dl,0x3(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@126 (line 269)
0x00b94b28: add $0x4,%ecx
0x00b94b2b: mov %ecx,0x14(%ebp) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer0::encode@126 (line 269)
===================================================
Alternative 1 code snippet:
dst.putInt((SS2 << 24) | (0xa0 << 16) | (p << 16) | db);
becomes (63 bytes):
0x00b95d51: shl $0x10,%ebx
0x00b95d54: or %edi,%ebx
0x00b95d56: or $0x8ea00000,%ebx ;*ior
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@93 (line 265)
0x00b95d5c: cmp $0x282c61d8,%ebp ; {oop('java/nio/DirectByteBuffer')}
0x00b95d62: jne 0x00b95dcd ;*invokevirtual putInt
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d64: mov 0x18(%edx),%edi
0x00b95d67: mov 0x14(%edx),%ecx ;*getfield position
    ; - java.nio.Buffer::nextPutIndex@5 (line 518)
    ; - java.nio.DirectByteBuffer::putInt@4 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d6a: sub %ecx,%edi
0x00b95d6c: cmp $0x4,%edi
0x00b95d6f: jl 0x00b95de1 ;*if_icmpge
    ; - java.nio.Buffer::nextPutIndex@10 (line 518)
    ; - java.nio.DirectByteBuffer::putInt@4 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d71: movzbl 0x26(%edx),%eax
0x00b95d75: test %eax,%eax
0x00b95d77: jne 0x00b95d7b ;*ifeq
    ; - java.nio.DirectByteBuffer::putInt@17 (line 664)
    ; - java.nio.DirectByteBuffer::putInt@11 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d79: bswap %ebx ;*invokevirtual putInt
    ; - java.nio.DirectByteBuffer::putInt@30 (line 664)
    ; - java.nio.DirectByteBuffer::putInt@11 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d7b: mov %edx,%esi
0x00b95d7d: mov 0x8(%esi),%ebp
0x00b95d80: mov 0xc(%esi),%edi ;*getfield address
    ; - java.nio.DirectByteBuffer::ix@1 (line 225)
    ; - java.nio.DirectByteBuffer::putInt@7 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d83: mov %ecx,%eax
0x00b95d85: add $0x4,%eax
0x00b95d88: mov %eax,0x14(%edx) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@33 (line 521)
    ; - java.nio.DirectByteBuffer::putInt@4 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
0x00b95d8b: mov %ebp,%eax
0x00b95d8d: mov %ebx,(%eax,%ecx,1) ;*invokevirtual putInt
    ; - java.nio.DirectByteBuffer::putInt@30 (line 664)
    ; - java.nio.DirectByteBuffer::putInt@11 (line 676)
    ; - sun.nio.cs.ext.E_31_d_n_codeToBuffer1::encode@94 (line 265)
===================================================
On big-endian machines, the byte swap (bswap) is omitted as well.
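To make the equivalence concrete, here is a small sketch showing that the single putInt() of alternative 1 writes the same four bytes as the four put(byte) calls. It relies on ByteBuffer's default byte order being big-endian, which is also why a little-endian x86 needs the bswap. The class and method names (PutIntDemo, viaFourPuts, viaPutInt) and the sample values for p and db are my own assumptions; the packing is only valid for 0 <= db <= 0xffff and a small p.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; only the two write sequences come from the snippets above.
public class PutIntDemo {
    static final byte SS2 = (byte) 0x8e;

    // Four single-byte writes, as in the EUC_TW$Encoder snippet.
    static byte[] viaFourPuts(int p, int db) {
        ByteBuffer dst = ByteBuffer.allocateDirect(4);
        dst.put(SS2);
        dst.put((byte) (0xa0 | p));
        dst.put((byte) (db >> 8));
        dst.put((byte) db);
        return toArray(dst);
    }

    // One 32-bit write, as in alternative 1.
    // Assumes 0 <= db <= 0xffff and that p only sets low bits of the second byte.
    static byte[] viaPutInt(int p, int db) {
        ByteBuffer dst = ByteBuffer.allocateDirect(4); // default order is BIG_ENDIAN
        dst.putInt((SS2 << 24) | (0xa0 << 16) | (p << 16) | db);
        return toArray(dst);
    }

    // Copies the written bytes out of the buffer.
    static byte[] toArray(ByteBuffer b) {
        b.flip();
        byte[] out = new byte[b.remaining()];
        b.get(out);
        return out;
    }

    public static void main(String[] args) {
        int p = 2, db = 0xa1a1; // sample values, chosen arbitrarily
        System.out.println(Arrays.equals(viaFourPuts(p, db), viaPutInt(p, db))); // prints "true"
    }
}
```

If the buffer were switched to ByteOrder.LITTLE_ENDIAN, the two sequences would no longer agree, so the trick depends on the encoder never changing the order of dst.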
===================================================
Alternative 2 code snippet:
bb[0] = SS2;
bb[1] = (byte)(0xa0 | p);
bb[2] = (byte)(db >> 8);
bb[3] = (byte)db;
dst.put(bb, 0, 4);
becomes (149 bytes):
0x00b95f68: mov 0x30(%esp),%ebp ;*invokevirtual put
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f6c: mov %ebp,0x30(%esp)
0x00b95f70: add $0xfffffffc,%ecx
0x00b95f73: or $0x4,%ecx
0x00b95f76: test %ecx,%ecx
0x00b95f78: jl 0x00b96120 ;*ifge
    ; - java.nio.Buffer::checkBounds@13 (line 551)
    ; - java.nio.ByteBuffer::put@4 (line 803)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f7e: mov 0x14(%ebp),%ecx ;*getfield position
    ; - java.nio.Buffer::remaining@5 (line 383)
    ; - java.nio.ByteBuffer::put@9 (line 804)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f81: mov 0x18(%ebp),%ebx ;*getfield limit
    ; - java.nio.Buffer::remaining@1 (line 383)
    ; - java.nio.ByteBuffer::put@9 (line 804)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f84: mov %ebx,%ebp
0x00b95f86: sub %ecx,%ebp
0x00b95f88: cmp $0x4,%ebp
0x00b95f8b: jl 0x00b9612d ;*if_icmple
    ; - java.nio.ByteBuffer::put@12 (line 804)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f91: cmp %ebx,%ecx
0x00b95f93: jge 0x00b96139 ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95f99: mov %eax,0x18(%esp)
0x00b95f9d: mov %edx,0x14(%esp)
0x00b95fa1: mov 0x30(%esp),%esi
0x00b95fa5: mov 0x8(%esi),%ebp
0x00b95fa8: mov 0xc(%esi),%edi
0x00b95fab: mov %ecx,%edx
0x00b95fad: inc %edx ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fae: mov %edx,0x14(%esi) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fb1: mov %ebp,%eax
0x00b95fb3: add %ecx,%eax
0x00b95fb5: movb $0x8e,(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fb8: cmp %ebx,%edx
0x00b95fba: jge 0x00b96155 ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fc0: mov 0x8(%esp),%edx
0x00b95fc4: mov %dl,0x1(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fc7: mov %ecx,%ebp
0x00b95fc9: add $0x2,%ebp ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fcc: mov %ebp,0x14(%esi) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fcf: cmp %ebx,%ebp
0x00b95fd1: jge 0x00b96142 ;*if_icmplt
    ; - java.nio.Buffer::nextPutIndex@8 (line 512)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fd7: mov 0x18(%esp),%edx
0x00b95fdb: mov %dl,0x2(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fde: mov %ecx,%ebp
0x00b95fe0: add $0x3,%ebp ;*iadd
    ; - java.nio.Buffer::nextPutIndex@26 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fe3: mov %ebp,0x14(%esi) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fe6: cmp %ebx,%ebp
0x00b95fe8: jge 0x00b96161 ;*aload_0
    ; - java.nio.ByteBuffer::put@38 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95fee: mov 0x4(%esp),%ebx
0x00b95ff2: mov %bl,0x3(%eax) ;*invokevirtual putByte
    ; - java.nio.DirectByteBuffer::put@12 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
0x00b95ff5: add $0x4,%ecx
0x00b95ff8: mov %esi,%ebp
0x00b95ffa: mov %ecx,0x14(%ebp) ;*putfield position
    ; - java.nio.Buffer::nextPutIndex@27 (line 514)
    ; - java.nio.DirectByteBuffer::put@5 (line 271)
    ; - java.nio.ByteBuffer::put@43 (line 808)
    ; - java.nio.DirectByteBuffer::put@117 (line 349)
    ; - sun.nio.cs.ext.E_30_d_n_codeToBuffer01::encode@136 (line 272)
===================================================
I believe HotSpot could compile alternative 2 much better than it does
(enabling -XX:+DoEscapeAnalysis did not help here).
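One side note on the bounds checks visible in the listings: the four put(byte) calls each compare position against limit, while putInt() does a single "remaining >= 4" check. That also changes what happens when the buffer is nearly full, which matters if an encoder relies on partial writes. The following sketch illustrates this; OverflowDemo and its method names are hypothetical, and the 3-byte buffer is just an example.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the encoder under discussion.
public class OverflowDemo {

    // Tries four put(byte) calls into a buffer with 'room' bytes; returns the final position.
    static int positionAfterFourPuts(int room) {
        ByteBuffer dst = ByteBuffer.allocateDirect(room);
        try {
            for (int i = 0; i < 4; i++) {
                dst.put((byte) i);          // each call checks position < limit
            }
        } catch (BufferOverflowException e) {
            // bytes written before the failing call stay in the buffer
        }
        return dst.position();
    }

    // Tries one putInt() into a buffer with 'room' bytes; returns the final position.
    static int positionAfterPutInt(int room) {
        ByteBuffer dst = ByteBuffer.allocateDirect(room);
        try {
            dst.putInt(0x00010203);         // one check: at least 4 bytes remaining
        } catch (BufferOverflowException e) {
            // nothing was written
        }
        return dst.position();
    }

    public static void main(String[] args) {
        System.out.println(positionAfterFourPuts(3)); // prints "3"
        System.out.println(positionAfterPutInt(3));   // prints "0"
    }
}
```

So a putInt() based encoder would have to test dst.remaining() up front rather than depend on the partial write.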
-Ulf