Multiple copies of same code

Tom Rodriguez Thomas.Rodriguez at Sun.COM
Mon Nov 30 11:09:36 PST 2009


On Nov 24, 2009, at 1:56 PM, Ulf Zibis wrote:

> I think it's not only the code size that matters, but also the performance lost to all these jumps.

I wasn't suggesting that duplication of code is always irrelevant, just that in the particular case of the exception entry points it would be impossible to measure an improvement from eliminating them, because the overall path is so expensive that one short jump wouldn't be noticeable.

tom

> 
> In the method code below, you see a 2-line finally block. Looking at the compiled result, I can see that this block is repeated 6 times and takes up 1/3 of the whole assembly code for this method. Additionally, there are plenty of range-check and null-check blocks which also seem to be copy-and-pasted, so I guess removing the redundant blocks from this example would roughly halve the code size.
> 
> On the other hand, the 1-length int[] dp could be optimized to a plain int field, and pushing the 6 parameters onto the stack could be avoided, if method decode() were inlined. It isn't, because of the inline threshold, which sadly isn't frequency-related. Inlining would additionally increase performance.
> 
> 
>         private CoderResult decodeArrayLoop(ByteBuffer src, CharBuffer dst) {
> 
>             byte[] sa = src.array();
>             int sp = src.arrayOffset() + src.position();
>             int sl = sp + src.remaining();
> 
>             char[] da = dst.array();
>             int [] dp = new int[1];
>             dp[0] = dst.arrayOffset() + dst.position();
>             int dl = dp[0] + dst.remaining();
>             try {
>                 while (sp < sl) {
>                     CoderResult result;
>                     byte byte1 = sa[sp];
>                     if (byte1 >= 0) {               // ASCII      G0
>                         if (dp[0] == dl)
>                             return CoderResult.OVERFLOW;
>                         da[dp[0]++] = (char)(byte1 & 0xff);
>                         sp++;
>                     } else if (byte1 != SS2) {      // Codeset 1  G1
>                         if (sp + 1 == sl)
>                             break;
>                         result = decode(byte1, sa[sp+1], 0, da, dp, dl);
>                         if (result != null)
>                             return result;
>                         sp += 2;
>                     } else {                        // Codeset 2  G2
>                         if (sp + 4 > sl)
>                             break;
>                         int cnsPlane = cnspToIndex[sa[sp+1] & 0xff];
>                         if (cnsPlane < 0)
>                             return CoderResult.malformedForLength(2);
>                         result = decode(sa[sp+2], sa[sp+3], cnsPlane, da, dp, dl);
>                         if (result != null)
>                             return result;
>                         sp += 4;
>                     }
>                 }
>                 return CoderResult.UNDERFLOW;
>             } finally {
>                 src.position(sp - src.arrayOffset());
>                 dst.position(dp[0] - dst.arrayOffset());
>             }
>         }
> 
> 
> -Ulf
> 
> 
> On 22.11.2009 17:59, Chuck Rasbold wrote:
>> Sure.  It would be great to merge redundant code paths.  But I don't 
>> think the cost/benefit ratio is worth it. 
>> 
>> In the case you cite, the savings would be 4 bytes per path removed, 
>> and those paths are projected to be very infrequent. In a JIT, you
>> have to spend your compilation budget wisely.
>> 
>> It's not that it can't be done. There are just better places to spend time. 
>> 
>> On Sat, Nov 21, 2009 at 5:54 AM, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>> In the output of PrintAssembly I frequently see:
>> 
>> ...
>> ...   # more than 10 recurrences
>> ...
>> 726   B108: #        B114 <- B10  Freq: 9.99898e-006
>> 726           # exception oop is in EAX; no code emitted
>> 726           MOV    ECX,EAX
>> 728           JMP,s  B114
>> 728
>> 72a   B109: #        B114 <- B9  Freq: 9.99918e-006
>> 72a           # exception oop is in EAX; no code emitted
>> 72a           MOV    ECX,EAX
>> 72c           JMP,s  B114
>> 72c
>> 72e   B110: #        B114 <- B6  Freq: 9.99938e-006
>> 72e           # exception oop is in EAX; no code emitted
>> 72e           MOV    ECX,EAX
>> 730           JMP,s  B114
>> 730
>> 732   B111: #        B114 <- B4  Freq: 9.99959e-006
>> 732           # exception oop is in EAX; no code emitted
>> 732           MOV    ECX,EAX
>> 734           JMP,s  B114
>> 734
>> 736   B112: #        B114 <- B3  Freq: 9.99979e-006
>> 736           # exception oop is in EAX; no code emitted
>> 736           MOV    ECX,EAX
>> 738           JMP,s  B114
>> 738
>> 73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>> 73a           # exception oop is in EAX; no code emitted
>> 73a           MOV    ECX,EAX
>> 73a
>> 73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>> 
>> 
>> Wouldn't it be better to have:
>> 
>> ...
>> ...   # more than 10 recurrences
>> ...
>> 73a   B108: #        B114 <- B10  Freq: 9.99898e-006
>> 73a   B109: #        B114 <- B9  Freq: 9.99918e-006
>> 73a   B110: #        B114 <- B6  Freq: 9.99938e-006
>> 73a   B111: #        B114 <- B4  Freq: 9.99959e-006
>> 73a   B112: #        B114 <- B3  Freq: 9.99979e-006
>> 73a   B113: #        B114 <- B2  Freq: 9.99999e-006
>> 73a           # exception oop is in EAX; no code emitted
>> 73a           MOV    ECX,EAX
>> 73a
>> 73c   B114: #        N1132 <- B79 B113 B112 B111 B110 B109 B108 B103 B102 B101 B100 B93 B92 B91 B90 B87 B86 B85 B84 B83 B82 B81 B80 B107 B106 B105 B104 B78 B77 B76 B75 B99  Freq: 7.11172e-005
>> 
>> 
>> 
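For reference, the repeated finally block in the quoted decodeArrayLoop() has a source-level cause as well: javac emits a copy of the finally body at every point where control leaves the try, and that method leaves it through four returns, the normal loop exit, and the exception handler. A minimal sketch of a single-exit variant follows. It assumes a hypothetical ASCII-only decoder (the class SingleExitDecoder and the method decodeAscii are illustrative names, not part of the code under discussion), and whether the JIT output actually shrinks would have to be checked with PrintAssembly.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

// Hypothetical, simplified stand-in for the quoted decodeArrayLoop():
// ASCII only, so the control flow stays easy to follow.
class SingleExitDecoder {

    static CoderResult decodeAscii(ByteBuffer src, CharBuffer dst) {
        byte[] sa = src.array();
        int sp = src.arrayOffset() + src.position();
        int sl = sp + src.remaining();

        char[] da = dst.array();
        int dp = dst.arrayOffset() + dst.position();
        int dl = dp + dst.remaining();

        // Every early exit records its result and breaks out of the loop
        // instead of returning from inside the try.
        CoderResult result = CoderResult.UNDERFLOW;
        try {
            while (sp < sl) {
                byte b = sa[sp];
                if (b < 0) {                 // not ASCII in this toy decoder
                    result = CoderResult.malformedForLength(1);
                    break;
                }
                if (dp == dl) {              // destination is full
                    result = CoderResult.OVERFLOW;
                    break;
                }
                da[dp++] = (char) b;
                sp++;
            }
        } finally {
            // Control leaves the try in two ways only: normal completion
            // and a thrown exception.
            src.position(sp - src.arrayOffset());
            dst.position(dp - dst.arrayOffset());
        }
        return result;
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.wrap(new byte[] {'a', 'b', 'c'});
        CharBuffer dst = CharBuffer.allocate(2);
        System.out.println(decodeAscii(src, dst));
        System.out.println(src.position() + " " + dst.position());
    }
}
```

The breaks stay inside the try, so they do not need their own copy of the finally body. The price is one extra local and a slightly less direct reading of the loop.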


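The int[] dp holder mentioned in the quoted message exists only so that decode() can report how far it advanced the destination index. As long as the JIT does not inline decode() and scalar-replace the array, the same effect can be had by hand if the helper returns the new index and uses negative values as error codes. This is a rough sketch under stated assumptions: the class HolderFreeDecode, its error constants, and the toy byte-pair mapping are all hypothetical and do not reflect the real charset tables.

```java
// Hypothetical sketch: pass the destination index by value and return the
// updated index, so no one-element int[] holder is needed.
class HolderFreeDecode {

    static final int UNMAPPABLE = -1;   // illustrative error codes
    static final int OVERFLOW = -2;

    // Returns the new destination index on success, or a negative code.
    static int decode(byte b1, byte b2, char[] da, int dp, int dl) {
        if (dp == dl) {
            return OVERFLOW;
        }
        // Toy mapping only; a real decoder would consult its lookup tables.
        int c = ((b1 & 0x7f) << 7) | (b2 & 0x7f);
        if (c == 0) {
            return UNMAPPABLE;
        }
        da[dp] = (char) c;
        return dp + 1;
    }

    public static void main(String[] args) {
        char[] da = new char[1];
        int dp = 0;
        int r = decode((byte) 0x81, (byte) 0x82, da, dp, da.length);
        if (r >= 0) {
            dp = r;                     // caller keeps dp in a plain local
        }
        System.out.println(dp + " " + (int) da[0]);
    }
}
```

The caller then maps a negative return value to the matching CoderResult. The trade-off is that the error channel and the index share one return value, which is less self-describing than returning a CoderResult.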

More information about the hotspot-compiler-dev mailing list