Multiple copies of same code
Tom Rodriguez
Thomas.Rodriguez at Sun.COM
Mon Nov 30 11:09:36 PST 2009
On Nov 24, 2009, at 1:56 PM, Ulf Zibis wrote:
> I think it's not only the code size that matters, but also the performance lost to all these jumps.
I wasn't suggesting that duplication of code is always irrelevant, just that in the particular case of the exception entry points it would be impossible to measure an improvement from eliminating them: the overall exception path is so expensive that one short jump wouldn't be noticeable.
tom
>
> In the method code below, you see a 2-line finally block. Looking at the compile result, I can see that this block is repeated 6 times and takes up 1/3 of the whole assembly code for this method. Additionally, there are plenty of range-check and null-check blocks which also seem to be copy-and-pasted, so I guess removing the redundant blocks from this example would halve the code size.
>
> On the other hand, the 1-length int[] dp could be optimized to a normal int field, and pushing the 6 parameters onto the stack could be avoided, if method decode() were inlined. It isn't, because of the inline threshold, which sadly isn't frequency-related. Inlining would further improve performance.
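For readers unfamiliar with the one-element-array idiom mentioned above, here is a minimal sketch of the two styles being compared. The class and the helpers putViaArray and putViaField are hypothetical and only illustrate the calling convention; they are not the real decode(). Whether HotSpot actually removes the array allocation depends on inlining and escape analysis, which this sketch does not measure.

```java
class OutParamSketch {
    // Field variant suggested in the message above (hypothetical name).
    private int dp;

    // Style used in the quoted method: the callee advances the caller's
    // destination index through a one-element int array.
    static boolean putViaArray(char c, char[] da, int[] dp, int dl) {
        if (dp[0] == dl) {
            return false;            // destination full
        }
        da[dp[0]++] = c;
        return true;
    }

    // Same logic, but the index lives in an ordinary int field.
    boolean putViaField(char c, char[] da, int dl) {
        if (dp == dl) {
            return false;            // destination full
        }
        da[dp++] = c;
        return true;
    }

    int position() {
        return dp;
    }

    public static void main(String[] args) {
        char[] a = new char[2];
        int[] idx = new int[1];
        putViaArray('x', a, idx, a.length);

        char[] b = new char[2];
        OutParamSketch s = new OutParamSketch();
        s.putViaField('x', b, b.length);

        System.out.println(idx[0] + " " + s.position());
    }
}
```

Both variants write the same characters and end at the same index; the difference is only where the index is stored.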
>
>
>     private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
>
>         byte[] sa = src.array();
>         int sp = src.arrayOffset() + src.position();
>         int sl = sp + src.remaining();
>
>         char[] da = dst.array();
>         int[] dp = new int[1];
>         dp[0] = dst.arrayOffset() + dst.position();
>         int dl = dp[0] + dst.remaining();
>         try {
>             while (sp < sl) {
>                 CoderResult result;
>                 byte byte1 = sa[sp];
>                 if (byte1 >= 0) { // ASCII G0
>                     if (dp[0] == dl)
>                         return CoderResult.OVERFLOW;
>                     da[dp[0]++] = (char)(byte1 & 0xff);
>                     sp++;
>                 } else if (byte1 != SS2) { // Codeset 1 G1
>                     if (sp + 1 == sl)
>                         break;
>                     result = decode(byte1, sa[sp+1], 0, da, dp, dl);
>                     if (result != null)
>                         return result;
>                     sp += 2;
>                 } else { // Codeset 2 G2
>                     if (sp + 4 > sl)
>                         break;
>                     int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
>                     if (cnsPlane < 0)
>                         return CoderResult.malformedForLength(2);
>                     result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
>                     if (result != null)
>                         return result;
>                     sp += 4;
>                 }
>             }
>             return CoderResult.UNDERFLOW;
>         } finally {
>             src.position(sp - src.arrayOffset());
>             dst.position(dp[0] - dst.arrayOffset());
>         }
>     }
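A small sketch of why a two-line finally block can show up several times in compiled code. Current javac versions do not emit a shared subroutine for finally; they copy the finally body onto each way out of the try. decodeArrayLoop above has five return statements plus the exceptional exit, which would account for six copies, assuming that is where the count of 6 comes from. The class and method names below are hypothetical, and expanded() is a hand-written approximation of that copying, not actual compiler output.

```java
class FinallyExits {
    static int cleanupRuns = 0;

    // Source form: one finally block, three ways out of the try.
    static String withFinally(int mode) {
        try {
            if (mode == 0) {
                return "early";
            }
            if (mode == 1) {
                throw new IllegalStateException("boom");
            }
            return "normal";
        } finally {
            cleanupRuns++;
        }
    }

    // Hand-expanded approximation: the cleanup is copied onto each exit.
    // A real finally handler catches any Throwable, not just RuntimeException.
    static String expanded(int mode) {
        try {
            if (mode == 0) {
                cleanupRuns++;       // copy 1: early return
                return "early";
            }
            if (mode == 1) {
                throw new IllegalStateException("boom");
            }
            cleanupRuns++;           // copy 2: normal return
            return "normal";
        } catch (RuntimeException e) {
            cleanupRuns++;           // copy 3: exceptional exit
            throw e;
        }
    }

    public static void main(String[] args) {
        for (int mode = 0; mode < 3; mode++) {
            try { withFinally(mode); } catch (IllegalStateException e) { /* mode 1 */ }
            try { expanded(mode); } catch (IllegalStateException e) { /* mode 1 */ }
        }
        System.out.println("cleanup ran " + cleanupRuns + " times"); // 6
    }
}
```

Both methods run the cleanup exactly once per call; the expanded form just makes the per-exit copies visible in the source.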
>
>
> -Ulf
>
>
> On 22.11.2009 17:59, Chuck Rasbold wrote:
>> Sure. It would be great to merge redundant code paths. But I don't
>> think the cost/benefit ratio is worth it.
>>
>> In the case you cite, there would be a savings of 4 bytes per path
>> removed, which are projected to be very infrequent. In a JIT, you
>> have to spend your compilation budget wisely.
>>
>> It's not that it can't be done. There are just better places to spend time.
>>
>> On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>> In the output of PrintAssembly I frequently see:
>>
>> ...
>> ... # more than 10 recurrences
>> ...
>> 726 B108: # B114 <- B10 Freq: 9.99898e-006
>> 726 # exception oop is in EAX; no code emitted
>> 726 MOV ECX,EAX
>> 728 JMP,s B114
>> 728
>> 72a B109: # B114 <- B9 Freq: 9.99918e-006
>> 72a # exception oop is in EAX; no code emitted
>> 72a MOV ECX,EAX
>> 72c JMP,s B114
>> 72c
>> 72e B110: # B114 <- B6 Freq: 9.99938e-006
>> 72e # exception oop is in EAX; no code emitted
>> 72e MOV ECX,EAX
>> 730 JMP,s B114
>> 730
>> 732 B111: # B114 <- B4 Freq: 9.99959e-006
>> 732 # exception oop is in EAX; no code emitted
>> 732 MOV ECX,EAX
>> 734 JMP,s B114
>> 734
>> 736 B112: # B114 <- B3 Freq: 9.99979e-006
>> 736 # exception oop is in EAX; no code emitted
>> 736 MOV ECX,EAX
>> 738 JMP,s B114
>> 738
>> 73a B113: # B114 <- B2 Freq: 9.99999e-006
>> 73a # exception oop is in EAX; no code emitted
>> 73a MOV ECX,EAX
>> 73a
>> 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005
>>
>>
>> Wouldn't it be better to have:
>>
>> ...
>> ... # more than 10 recurrences
>> ...
>> 73a B108: # B114 <- B10 Freq: 9.99898e-006
>> 73a B109: # B114 <- B9 Freq: 9.99918e-006
>> 73a B110: # B114 <- B6 Freq: 9.99938e-006
>> 73a B111: # B114 <- B4 Freq: 9.99959e-006
>> 73a B112: # B114 <- B3 Freq: 9.99979e-006
>> 73a B113: # B114 <- B2 Freq: 9.99999e-006
>> 73a # exception oop is in EAX; no code emitted
>> 73a MOV ECX,EAX
>> 73a
>> 73c B114: # N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99 Freq: 7.11172e-005
>>
>>
>>
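The merge proposed in the original message can be pictured with a toy model. This is only a sketch: Block and mergeIdentical are hypothetical names, the blocks are plain strings rather than anything from HotSpot's IR, and the short jump is treated as a layout detail rather than part of the block body. Going by the addresses in the listings, the six blocks span 0x726 to 0x73c (22 bytes) before merging and 0x73a to 0x73c (2 bytes) after, which fits the 4 bytes per removed path mentioned in the reply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TailMergeSketch {
    // Hypothetical toy basic block: label, instruction text, single successor.
    static final class Block {
        final String label;
        final List<String> code;
        final String successor;

        Block(String label, List<String> code, String successor) {
            this.label = label;
            this.code = code;
            this.successor = successor;
        }
    }

    // Groups blocks whose code and successor are identical. Each group
    // could share one emitted copy of the code.
    static Map<String, List<String>> mergeIdentical(List<Block> blocks) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Block b : blocks) {
            String key = String.join(";", b.code) + " -> " + b.successor;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(b.label);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        for (int n = 108; n <= 113; n++) {
            blocks.add(new Block("B" + n, Arrays.asList("MOV ECX,EAX"), "B114"));
        }
        Map<String, List<String>> groups = mergeIdentical(blocks);
        System.out.println(groups.size() + " group(s): " + groups.values());
    }
}
```

In this model all six exception entry blocks fall into one group, which is the shape of the second listing.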