Optimize bytecode for combination enhanced-for and enums

Vitaly Davidovich vitalyd at gmail.com
Wed Mar 28 08:03:56 PDT 2012


Correct - I was pointing out the general case where such analysis would be
useful as well (a good amount of code makes defensive copies just in case, so
runtime analysis would be beneficial).
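
To make the defensive copy concrete, here is a minimal sketch of the behaviour discussed in this thread. The class name EnumValuesDemo and the constants of Edge are assumptions for illustration only; the one fact it relies on is that values() hands out a fresh clone of the constants array on every call.

```java
// Hypothetical demo class; the Edge constants are made up for illustration.
class EnumValuesDemo {
    enum Edge { TOP, RIGHT, BOTTOM, LEFT }

    // Cached once, as in the manual workaround quoted below; must never be mutated.
    private static final Edge[] EDGES = Edge.values();

    // Two calls to values() return two distinct array objects.
    static boolean valuesReturnsFreshArray() {
        return Edge.values() != Edge.values();
    }

    // Mutating a returned array does not leak into later calls,
    // which is the reason the copy is made in the first place.
    static boolean mutationIsIsolated() {
        Edge[] first = Edge.values();
        first[0] = Edge.LEFT;
        return Edge.values()[0] == Edge.TOP;
    }

    // Iterating the cached array allocates nothing per call.
    static int countViaCachedArray() {
        int n = 0;
        for (Edge e : EDGES) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(valuesReturnsFreshArray());
        System.out.println(mutationIsIsolated());
        System.out.println(countViaCachedArray());
    }
}
```

Whether the JIT actually elides the copy in the uncached loop depends on inlining and escape analysis, so this sketch only shows the semantics, not the performance.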

Sent from my phone
On Mar 28, 2012 11:00 AM, "Rémi Forax" <forax at univ-mlv.fr> wrote:

> On 03/28/2012 04:53 PM, Vitaly Davidovich wrote:
>
>>
>> clone() could be removed but the issue is whether the JVM does enough
>> analysis to see that - I think enough inlining has to occur such that
>> clone() is inlined right into bar() and then the JVM can see that it
>> doesn't escape and the array is not mutated.  Either the allocation/copy is
>> removed or at least becomes stack allocated.
>>
>> The wrinkle with clone() is that it's virtual, so if the inline cache is
>> exceeded and a regular vtable call is made, we lose this ability; that's a more
>> general case and shouldn't happen with things like bar().
>>
>>
> Technically here, the JIT can easily prove the exact type of the array
> (the receiver of clone()),
> so clone() behaves as if it had only one implementation and there is
> no need to create a guard
> (but I know that Hotspot doesn't do that, which is why Arrays.copyOf() is a
> little faster than clone()).
>
> Rémi
>
>  Sent from my phone
>>
>> On Mar 28, 2012 10:43 AM, "Roel Spilker" <r.spilker at gmail.com> wrote:
>>
>>    Rémi, are you sure the clone call can be removed? I agree that
>>    that would even be better for hotspot JVMs that use escape
>>    analysis. That said, apart from engineering effort, if a synthetic
>>    inner class is used you get the benefit immediately, even on VMs
>>    that cannot afford escape analysis.
>>
>>    Roel
>>
>>
>>    On Wed, Mar 28, 2012 at 4:13 PM, Rémi Forax <forax at univ-mlv.fr> wrote:
>>
>>        On 03/28/2012 03:39 PM, Roel Spilker wrote:
>>
>>            Hi all,
>>
>>            TL;DR: for (Edge edge : Edge.values()) copies an array
>>            every time it is executed. This can be prevented. Is that
>>            a good idea and how to proceed from here?
>>
>>            I'm new to this list, so please let me know if this is not
>>            the place to post this.
>>
>>            When I was profiling our application I noticed that we
>>            allocate quite some memory when iterating over enum
>>            values. The pattern used was:
>>
>>            class  Foo {
>>             void bar() {
>>               for (Edge e : Edge.values()) {
>>                 // do some work
>>               }
>>             }
>>            }
>>
>>            The call to Edge.values() obviously creates a new clone of
>>            the Edge[] containing all enum constants. So we've
>>            modified our code:
>>
>>            class  Foo {
>>             private static final Edge[] EDGES = Edge.values();
>>             void bar() {
>>               for (Edge e : EDGES) {
>>                 // do some work
>>               }
>>             }
>>            }
>>
>>            This solves our allocation problem, but it is not a nice
>>            solution. Since code in the enhanced-for has no way to
>>            modify the array this pattern can be done at compile-time.
>>            A synthetic inner class can be generated to keep the copy
>>            of the array. The desugared (pseudo) code would then look
>>            something like this:
>>
>>            /* synthetic */ class Foo$0 {
>>             static final Edge[] EDGES = Edge.values();
>>            }
>>
>>            class  Foo {
>>             void bar() {
>>               for (Edge e : Foo$0.EDGES) {
>>                 // do some work
>>               }
>>             }
>>            }
>>
>>            There is precedent for this kind of
>>            desugaring/optimization: When you use an enum-in-switch, a
>>            synthetic class is generated containing an int array for
>>            the ordinals.
>>
>>            I have a few questions:
>>            - Do you think this is a good optimization? The trade-off
>>            here is creating a copy every time the enhanced-for is
>>            used (could be in an inner loop) versus the overhead of
>>            loading an extra class.
>>            - Is there a better optimization possible/required?
>>            EnumSet uses
>>            SharedSecrets.getJavaLangAccess().getEnumConstantsShared(elementType),
>>            but that won't work in user code.
>>            - If it is a good idea, how do I proceed from here?
>>            Possibly create a JEP? How can I contribute? Who wants to
>>            support?
>>
>>            Roel Spilker
>>
>>
>>        Hi Roel,
>>        There are several issues with your template: EDGES is
>>        initialized too soon, i.e. even if you never call bar(), and
>>        stays around too long, i.e. even if you never call bar() more
>>        than once.
>>
>>        I think it's better to let the escape analysis pass done
>>        by the JIT inline
>>        values() and remove the call to clone().
>>
>>        Rémi
>>
>>
>>
>>
>>
>>
>
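
For reference, a hand-written stand-in for the synthetic holder class proposed in the quoted message might look like the sketch below. The names HolderDemo, EdgesHolder and holderInitCount are hypothetical, and the counter exists only to make the initialization timing observable. It relies on the standard rule that a class is initialized on first access to one of its non-constant static fields; it says nothing about what javac would actually generate.

```java
// Hypothetical sketch; not what javac emits, just the same shape written by hand.
class HolderDemo {
    enum Edge { TOP, RIGHT, BOTTOM, LEFT }

    // Counts how often the holder's static initializer has run (illustration only).
    static int holderInitCount = 0;

    // Stand-in for the proposed synthetic Foo$0: keeps one shared copy of the array.
    static final class EdgesHolder {
        static final Edge[] EDGES = Edge.values();
        static {
            holderInitCount++;
        }
    }

    // Iterates the shared array; the first call triggers EdgesHolder initialization.
    static int bar() {
        int n = 0;
        for (Edge e : EdgesHolder.EDGES) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(holderInitCount); // holder not yet initialized
        bar();
        bar();
        System.out.println(holderInitCount); // initialized exactly once
    }
}
```

This only shows when the array is created. Once created, the array stays reachable for as long as the holder class does, so the retention concern raised in the quoted reply is not addressed by this shape.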


More information about the compiler-dev mailing list