Optimize bytecode for combination enhanced-for and enums

Vitaly Davidovich vitalyd at gmail.com
Wed Mar 28 07:53:23 PDT 2012


clone() could be removed, but the issue is whether the JVM does enough
analysis to see that. I think enough inlining has to occur that clone()
is inlined right into bar(); only then can the JVM see that the array
doesn't escape and is not mutated. At that point either the allocation/copy
is removed or it at least becomes stack allocated.
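
For reference, a minimal sketch of the copy under discussion. This is an
illustration, not code from this thread: the EnumValuesCopyDemo class, the
Edge constants and the helper method names are all made up. It assumes a
javac-compiled enum, where values() returns a clone of a private array, so
every call hands back a distinct array unless the JIT manages to eliminate
the allocation:

```java
// Hypothetical demo class; none of these names come from the thread.
final class EnumValuesCopyDemo {
    // Stand-in for the Edge enum mentioned in the quoted mail (constants assumed).
    enum Edge { TOP, RIGHT, BOTTOM, LEFT }

    // Manual workaround from the quoted mail: clone once, reuse afterwards.
    private static final Edge[] EDGES = Edge.values();

    // True if two consecutive values() calls return different array objects.
    static boolean valuesReturnsFreshArray() {
        return Edge.values() != Edge.values();
    }

    // True if writing into one returned array leaves later calls untouched,
    // which is the reason values() has to copy in the first place.
    static boolean mutationIsIsolated() {
        Edge[] first = Edge.values();
        first[0] = Edge.LEFT;
        return Edge.values()[0] == Edge.TOP;
    }

    // Enhanced-for over values(): semantically one array clone per invocation.
    static int countViaValues() {
        int n = 0;
        for (Edge e : Edge.values()) {
            n++;
        }
        return n;
    }

    // Enhanced-for over the cached array: no clone per invocation.
    static int countViaCache() {
        int n = 0;
        for (Edge e : EDGES) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("fresh array per call: " + valuesReturnsFreshArray());
        System.out.println("mutation isolated:    " + mutationIsIsolated());
        System.out.println("constants visited:    " + countViaValues()
                + " via values(), " + countViaCache() + " via cache");
    }
}
```

Both loops visit the same constants; the only difference is that the first
one asks for a fresh array every time it runs.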

The wrinkle with clone() is that it's virtual, so if the inline cache is
exceeded and a regular vtable call is made, we lose this ability; that's a
more general case, though, and shouldn't happen with things like bar().

On Mar 28, 2012 10:43 AM, "Roel Spilker" <r.spilker at gmail.com> wrote:

> Rémi, are you sure the clone call can be removed? I agree that that would
> be even better for HotSpot JVMs that use escape analysis. That said, apart
> from engineering effort, if a synthetic inner class is used you get the
> benefit immediately, even on VMs that cannot afford escape analysis.
>
> Roel
>
>
> On Wed, Mar 28, 2012 at 4:13 PM, Rémi Forax <forax at univ-mlv.fr> wrote:
>
>> On 03/28/2012 03:39 PM, Roel Spilker wrote:
>>
>>> Hi all,
>>>
>>> TL;DR: for (Edge edge : Edge.values()) copies an array every time it is
>>> executed. This can be prevented. Is that a good idea and how to proceed
>>> from here?
>>>
>>> I'm new to this list, so please let me know if this is not the place to
>>> post this.
>>>
>>> When I was profiling our application I noticed that we allocate quite
>>> a lot of memory when iterating over enum values. The pattern used was:
>>>
>>> class Foo {
>>>  void bar() {
>>>    for (Edge e : Edge.values()) {
>>>      // do some work
>>>    }
>>>  }
>>> }
>>>
>>> The call to Edge.values() obviously creates a new clone of the Edge[]
>>> containing all enum constants. So we've modified our code:
>>>
>>> class Foo {
>>>  private static final Edge[] EDGES = Edge.values();
>>>  void bar() {
>>>    for (Edge e : EDGES) {
>>>      // do some work
>>>    }
>>>  }
>>> }
>>>
>>> This solves our allocation problem, but it is not a nice solution. Since
>>> code in the enhanced-for loop has no way to modify the array, this
>>> transformation can be done at compile time. A synthetic inner class can be
>>> generated to hold the copy of the array. The desugared (pseudo) code would
>>> then look something like this:
>>>
>>> /* synthetic */ class Foo$0 {
>>>  static final Edge[] EDGES = Edge.values();
>>> }
>>>
>>> class Foo {
>>>  void bar() {
>>>    for (Edge e : Foo$0.EDGES) {
>>>      // do some work
>>>    }
>>>  }
>>> }
>>>
>>> There is precedent for this kind of desugaring/optimization: when you
>>> use an enum-in-switch, a synthetic class is generated containing an int
>>> array for the ordinals.
>>>
>>> I have a few questions:
>>> - Do you think this is a good optimization? The trade-off here is
>>> creating a copy every time the enhanced-for is used (could be in an inner
>>> loop) versus the overhead of loading an extra class.
>>> - Is there a better optimization possible/required? EnumSet uses
>>> SharedSecrets.getJavaLangAccess().getEnumConstantsShared(elementType),
>>> but that won't work in user code.
>>> - If it is a good idea, how do I proceed from here? Possibly create a
>>> JEP? How can I contribute? Who would be willing to support it?
>>>
>>> Roel Spilker
>>>
>>>
>> Hi Roel,
>> There are several issues with your template: EDGES is initialized too
>> soon, i.e. even if you never call bar(), and it stays around too long,
>> i.e. even if you never call bar() more than once.
>>
>> I think it's better to let the JIT inline values() and let its escape
>> analysis pass remove the call to clone().
>>
>> Rémi

More information about the compiler-dev mailing list