RFR: JDK-8241798: Allow enums to have more constants

Brian Goetz brian.goetz at oracle.com
Mon Mar 30 18:10:32 UTC 2020


On the more general (and longer term) topic of "how far can we push
this", here are some constraints:

  - The spec requires that migrating between a class with static fields
and an enum be binary compatible. This essentially means that every
enum constant has to be a field.
  - static final fields have to be initialized from <clinit>, so we can't
use this "outlining" trick to break up this part of <clinit>.

For a trivial enum

     enum Foo { A }

we get the following classfile:

Constant pool:
    #1 = Fieldref           #4.#29         // Foo.$VALUES:[LFoo;
    #2 = Methodref          #30.#31        // "[LFoo;".clone:()Ljava/lang/Object;
    #3 = Class              #14            // "[LFoo;"
    #4 = Class              #32            // Foo
    #5 = Methodref          #10.#33        // java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
    #6 = Methodref          #10.#34        // java/lang/Enum."<init>":(Ljava/lang/String;I)V
    #7 = String             #11            // A
    #8 = Methodref          #4.#34         // Foo."<init>":(Ljava/lang/String;I)V
    #9 = Fieldref           #4.#35         // Foo.A:LFoo;
   #10 = Class              #36            // java/lang/Enum
   #11 = Utf8               A
   #12 = Utf8               LFoo;
   #13 = Utf8               $VALUES
   #14 = Utf8               [LFoo;
   #15 = Utf8               values
   #16 = Utf8               ()[LFoo;
   #17 = Utf8               Code
   #18 = Utf8               LineNumberTable
   #19 = Utf8               valueOf
   #20 = Utf8               (Ljava/lang/String;)LFoo;
   #21 = Utf8               <init>
   #22 = Utf8               (Ljava/lang/String;I)V
   #23 = Utf8               Signature
   #24 = Utf8               ()V
   #25 = Utf8               <clinit>
   #26 = Utf8               Ljava/lang/Enum<LFoo;>;
   #27 = Utf8               SourceFile
   #28 = Utf8               Foo.java
   #29 = NameAndType        #13:#14        // $VALUES:[LFoo;
   #30 = Class              #14            // "[LFoo;"
   #31 = NameAndType        #37:#38        // clone:()Ljava/lang/Object;
   #32 = Utf8               Foo
   #33 = NameAndType        #19:#39        // valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
   #34 = NameAndType        #21:#22        // "<init>":(Ljava/lang/String;I)V
   #35 = NameAndType        #11:#12        // A:LFoo;
   #36 = Utf8               java/lang/Enum
   #37 = Utf8               clone
   #38 = Utf8               ()Ljava/lang/Object;
   #39 = Utf8               (Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;

public static final Foo A;
     descriptor: LFoo;
     flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL, ACC_ENUM

At the very least, each constant requires a unique UTF8 for its name 
(#11 here for Foo.A.)  And that field has to be initialized, regardless 
of what we do with values():

          0: new           #4                  // class Foo
          3: dup
          4: ldc           #7                  // String A
          6: iconst_0
          7: invokespecial #8                  // Method "<init>":(Ljava/lang/String;I)V
         10: putstatic     #9                  // Field A:LFoo;

To support the putstatic, we need a constant-specific NameAndType (#35)
and Fieldref (#9).  So that's a minimum of 3 CP slots per enum
constant, putting a ceiling of about 21K constants if we want to
initialize the field via bytecode.  (We could do it reflectively, which
would reduce the number of CP slots but would cost way more in startup.)
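
As a rough check of the 21K figure, here is a minimal sketch of the
arithmetic. The class and method names are made up for illustration. It
assumes 65534 usable constant pool entries (the entry count is a 16-bit
value and index 0 is unused), and the figure of 35 fixed entries is only
read off the listing above (39 entries, minus #7, #9, #11 and #35), not
a general rule.

```java
public class EnumCpCeiling {
    // Assumption: constant_pool_count is a u2, so valid indices are 1..65534.
    static final int USABLE_ENTRIES = 65534;

    // Upper bound on enum constants given a fixed overhead and a per-constant cost.
    static int maxConstants(int usableEntries, int fixedEntries, int perConstant) {
        return (usableEntries - fixedEntries) / perConstant;
    }

    public static void main(String[] args) {
        // Utf8 name + NameAndType + Fieldref, ignoring all fixed entries.
        System.out.println(maxConstants(USABLE_ENTRIES, 0, 3));   // 21844
        // Same, but reserving the 35 entries the trivial Foo needs anyway.
        System.out.println(maxConstants(USABLE_ENTRIES, 35, 3));  // 21833
        // If the String entry used by ldc (#7 above) is counted as well.
        System.out.println(maxConstants(USABLE_ENTRIES, 0, 4));   // 16383
    }
}
```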

The other limit we are bumping into is the size limit of <clinit>.  Liam's
patch moves the values() array creation out of <clinit>, leaving only the
constant initialization; with ~13 bytes per field, the 64K method size
limit gives me about 5K enum constants. (Liam, what's the constraint
that brings you down to 4100?)  We can't "outline" the initialization of
the field because of the constraint that final static fields be set in
<clinit>.
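
At the source level, the shape being discussed looks roughly like the
sketch below. This is a hand-written stand-in for a two-constant enum,
not actual javac output; Color, VALUES and makeValues are hypothetical
names. The field assignments stay in the static initializer, and only
the array creation moves to a helper.

```java
// Hand-written stand-in for a two-constant enum; not what javac emits.
final class Color {
    public static final Color RED;
    public static final Color GREEN;
    private static final Color[] VALUES;

    private final String name;
    private final int ordinal;

    static {
        // Static final fields can only be assigned here, so this part cannot be outlined.
        RED = new Color("RED", 0);
        GREEN = new Color("GREEN", 1);
        // The array creation can live in a helper (hypothetical name).
        VALUES = makeValues();
    }

    private Color(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    // Builds the array outside the static initializer's own bytecode.
    private static Color[] makeValues() {
        return new Color[] { RED, GREEN };
    }

    // Returns a defensive copy, as described for javac in the quoted message.
    static Color[] values() {
        return VALUES.clone();
    }

    String name() { return name; }
    int ordinal() { return ordinal; }
}
```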

However, we can probably squeeze a little more out by another
refactoring, which would be to initialize the values array _first_ and
then initialize the fields from that:

    aload_1  // push array on the stack
    sipush #n  // push index on the stack
    aaload // fetch element
    putstatic Foo.A

This is 8 bytes, instead of 13, which means we can get more like 8K enum 
constants with these constraints.
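
The byte counts can be checked with a small sketch. The class name is
made up, the opcode sizes are the standard ones, and the estimate
ignores everything else that has to fit in the method. The third case is
my own assumption rather than something stated in this thread: once the
constant pool and the ordinals outgrow the short forms, ldc becomes
ldc_w and iconst_0 becomes sipush, which makes the direct sequence 16
bytes.

```java
public class ClinitBudget {
    // code_length of a method must be less than 65536 bytes.
    static final int MAX_CODE_BYTES = 65535;

    // new(3) dup(1) ldc(2) iconst_0(1) invokespecial(3) putstatic(3)
    static final int DIRECT = 3 + 1 + 2 + 1 + 3 + 3;
    // aload_1(1) sipush(3) aaload(1) putstatic(3)
    static final int FROM_ARRAY = 1 + 3 + 1 + 3;
    // Assumption: new(3) dup(1) ldc_w(3) sipush(3) invokespecial(3) putstatic(3)
    static final int DIRECT_WIDE = 3 + 1 + 3 + 3 + 3 + 3;

    // Upper bound on constants if the method held nothing but these sequences.
    static int maxConstants(int bytesPerConstant) {
        return MAX_CODE_BYTES / bytesPerConstant;
    }

    public static void main(String[] args) {
        System.out.println(DIRECT + " bytes: " + maxConstants(DIRECT));           // 13 bytes: 5041
        System.out.println(FROM_ARRAY + " bytes: " + maxConstants(FROM_ARRAY));   // 8 bytes: 8191
        System.out.println(DIRECT_WIDE + " bytes: " + maxConstants(DIRECT_WIDE)); // 16 bytes: 4095
    }
}
```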


On 3/28/2020 7:37 PM, Liam Miller-Cushon wrote:
> Please consider this change to allow enums to have ~4100 constants (up 
> from the current limit of ~2740), by moving the creation of the values 
> array out of the clinit to a helper method.
>
> It's fair to ask whether this is worth doing. Increasing the limit by 
> this amount doesn't really change the use-cases enums are suitable 
> for: they still can't represent arbitrarily large collections of 
> constants, and if you have an enum with >2740 constants you're 
> probably going to want >4100 eventually. On the other hand, I don't 
> see any obvious drawbacks to the change (in terms of complexity in the 
> compiler, performance impact, or complexity of generated code). And 
> other options for increasing the limit would require significantly 
> more difficult and longer-term changes (e.g. tackling the method and 
> constant pool size limits).
>
> Another question is whether this is the best code generation strategy. 
> Currently javac saves the values array to a field and returns copies 
> of it using clone(). The approach ecj uses skips the field, and just 
> re-creates the array every time values() is called. For now I'm 
> keeping the values field to minimize the performance impact of this 
> change, but the ecj approach would avoid that field and the helper method.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8241798
> webrev: http://cr.openjdk.java.net/~cushon/8241798/webrev.00/
