RFR: JDK-8241798: Allow enums to have more constants
Brian Goetz
brian.goetz at oracle.com
Mon Mar 30 18:10:32 UTC 2020
On the more general (and longer term) topic of "how far can we push
this", here are some constraints:
- The spec requires that migrating between a class with static fields
and an enum be binary compatible. This essentially means that every
enum constant has to be a field.
- static final fields have to be initialized from <clinit>, so we can't
use this "outlining" trick to break up this part of <clinit>.
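
To make the first constraint concrete, here is a minimal sketch in plain Java. The names FooEnum, FooClass and EnumFieldShapeDemo are hypothetical and exist only for this illustration. It shows that an enum constant and a hand-written constant look the same to a client: both are public static final fields, which is what migrating between the two forms relies on.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical enum, mirroring the trivial example discussed in this thread.
enum FooEnum { A }

// Hypothetical hand-written counterpart exposing the same constant as a static field.
final class FooClass {
    public static final FooClass A = new FooClass("A", 0);

    private final String name;
    private final int ordinal;

    private FooClass(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    String name() { return name; }
    int ordinal() { return ordinal; }
}

final class EnumFieldShapeDemo {
    // True if the named member is a public static final field of the given class.
    static boolean isPublicStaticFinalField(Class<?> type, String fieldName) {
        try {
            Field f = type.getField(fieldName);
            int m = f.getModifiers();
            return Modifier.isPublic(m) && Modifier.isStatic(m) && Modifier.isFinal(m);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPublicStaticFinalField(FooEnum.class, "A"));
        System.out.println(isPublicStaticFinalField(FooClass.class, "A"));
    }
}
```

Client code that reads FooEnum.A or FooClass.A compiles to a field access in both cases, so a code generation strategy that drops the per-constant field is not an option.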
For a trivial enum
enum Foo { A }
we get the following classfile:
Constant pool:
   #1 = Fieldref           #4.#29    // Foo.$VALUES:[LFoo;
   #2 = Methodref          #30.#31   // "[LFoo;".clone:()Ljava/lang/Object;
   #3 = Class              #14       // "[LFoo;"
   #4 = Class              #32       // Foo
   #5 = Methodref          #10.#33   // java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
   #6 = Methodref          #10.#34   // java/lang/Enum."<init>":(Ljava/lang/String;I)V
   #7 = String             #11       // A
   #8 = Methodref          #4.#34    // Foo."<init>":(Ljava/lang/String;I)V
   #9 = Fieldref           #4.#35    // Foo.A:LFoo;
  #10 = Class              #36       // java/lang/Enum
  #11 = Utf8               A
  #12 = Utf8               LFoo;
  #13 = Utf8               $VALUES
  #14 = Utf8               [LFoo;
  #15 = Utf8               values
  #16 = Utf8               ()[LFoo;
  #17 = Utf8               Code
  #18 = Utf8               LineNumberTable
  #19 = Utf8               valueOf
  #20 = Utf8               (Ljava/lang/String;)LFoo;
  #21 = Utf8               <init>
  #22 = Utf8               (Ljava/lang/String;I)V
  #23 = Utf8               Signature
  #24 = Utf8               ()V
  #25 = Utf8               <clinit>
  #26 = Utf8               Ljava/lang/Enum<LFoo;>;
  #27 = Utf8               SourceFile
  #28 = Utf8               Foo.java
  #29 = NameAndType        #13:#14   // $VALUES:[LFoo;
  #30 = Class              #14       // "[LFoo;"
  #31 = NameAndType        #37:#38   // clone:()Ljava/lang/Object;
  #32 = Utf8               Foo
  #33 = NameAndType        #19:#39   // valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
  #34 = NameAndType        #21:#22   // "<init>":(Ljava/lang/String;I)V
  #35 = NameAndType        #11:#12   // A:LFoo;
  #36 = Utf8               java/lang/Enum
  #37 = Utf8               clone
  #38 = Utf8               ()Ljava/lang/Object;
  #39 = Utf8               (Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;

  public static final Foo A;
    descriptor: LFoo;
    flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL, ACC_ENUM
At the very least, each constant requires a unique Utf8 entry for its
name (#11 here for Foo.A). That field also has to be initialized,
regardless of what we do with values():
     0: new           #4    // class Foo
     3: dup
     4: ldc           #7    // String A
     6: iconst_0
     7: invokespecial #8    // Method "<init>":(Ljava/lang/String;I)V
    10: putstatic     #9    // Field A:LFoo;
To support the putstatic, we need a constant-specific NameAndType (#35)
and Fieldref (#9). So that's a minimum of 3 CP slots per enum constant,
which puts a ceiling of about 21K constants (64K slots / 3) if we want to
initialize the field via bytecode. (We could do it reflectively, which
would reduce the number of CP slots but would cost way more in startup.)
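
The constant pool arithmetic can be written out as a small sketch. The class name ConstantPoolBudget and its method are hypothetical. The second call is an assumption layered on top of the figures above: it also counts the String entry that ldc uses (#7 in the dump) as a fourth per-constant slot, and treats the remaining 35 entries of the trivial Foo dump as fixed overhead.

```java
final class ConstantPoolBudget {
    // constant_pool_count is a u2, and valid indices run from 1 to count - 1,
    // so at most 65534 entries are usable.
    static final int MAX_POOL_ENTRIES = 65_535 - 1;

    // Upper bound on enum constants when each constant needs `slotsPerConstant`
    // pool entries and `fixedSlots` entries are shared by the whole class.
    static int maxConstants(int slotsPerConstant, int fixedSlots) {
        return (MAX_POOL_ENTRIES - fixedSlots) / slotsPerConstant;
    }

    public static void main(String[] args) {
        // Utf8 name + NameAndType + Fieldref, ignoring fixed overhead.
        System.out.println(maxConstants(3, 0));
        // Assumption: also count the String entry used by ldc, and take the
        // 35 non-constant-specific entries of the trivial Foo dump as overhead.
        System.out.println(maxConstants(4, 35));
    }
}
```

Under the three-slot minimum this lands just under 22K; if the name String is loaded with ldc, the bound is closer to 16K.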
The other limit we are bumping into is the size limit of <clinit>. Liam's
patch moves the values() array creation out of <clinit>, leaving only the
constant initialization; with ~13 bytes per field, the 64K method size
limit gives us about 5K enum constants. (Liam, what's the constraint
that brings you down to 4100?) We can't "outline" the initialization of
the fields because of the constraint that static final fields be set in
<clinit>.
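
As a rough check on these numbers, here is a sketch of the per-constant byte count. The class name ClinitBudget is hypothetical. The 13-byte figure follows directly from the instruction listing for Foo.A. The 16-byte figure is an assumption about large enums, not something confirmed in this thread: once the constant pool index no longer fits in one byte, ldc becomes ldc_w, and once the ordinal exceeds 127 it has to be pushed with sipush.

```java
final class ClinitBudget {
    // The Code attribute requires code_length to be less than 65536.
    static final int MAX_CODE_BYTES = 65_535;

    // new(3) + dup(1) + ldc(2) + iconst_0(1) + invokespecial(3) + putstatic(3)
    static final int SHORT_FORM = 3 + 1 + 2 + 1 + 3 + 3;

    // Assumed shape for most constants of a large enum:
    // new(3) + dup(1) + ldc_w(3) + sipush(3) + invokespecial(3) + putstatic(3)
    static final int WIDE_FORM = 3 + 1 + 3 + 3 + 3 + 3;

    // Upper bound on constants whose initialization fits in one method body.
    static int maxConstants(int bytesPerConstant) {
        return MAX_CODE_BYTES / bytesPerConstant;
    }

    public static void main(String[] args) {
        System.out.println(SHORT_FORM + " bytes -> " + maxConstants(SHORT_FORM));
        System.out.println(WIDE_FORM + " bytes -> " + maxConstants(WIDE_FORM));
    }
}
```

If the wide-form assumption holds, the bound drops from about 5K to just under 4100, which may be where the smaller figure comes from.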
However, we can probably squeeze a little more out by another
refactoring, which would be to initialize the values array _first_ and
then initialize the fields from that:
    aload_1              // push array on the stack
    sipush #n            // push index on the stack
    aaload               // fetch element
    putstatic Foo.A
This is 8 bytes, instead of 13, which means we can get more like 8K enum
constants with these constraints.
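
A source-level analogue of this refactoring, as a sketch only: Shade, makeValues and ValuesFirstBudget are hypothetical names, and a plain class stands in for the enum because the field initialization order of a real enum is fixed by the compiler. The array is built first in a helper method, which gets its own method size budget, and each constant field is then a single array load.

```java
// Hypothetical class standing in for an enum with two constants.
final class Shade {
    // Step 1: build the whole array first, outside the static initializer body.
    private static final Shade[] VALUES = makeValues();

    // Step 2: each constant field is initialized with one array load.
    public static final Shade LIGHT = VALUES[0];
    public static final Shade DARK = VALUES[1];

    private final String name;
    private final int ordinal;

    private Shade(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    private static Shade[] makeValues() {
        return new Shade[] { new Shade("LIGHT", 0), new Shade("DARK", 1) };
    }

    public static Shade[] values() { return VALUES.clone(); }
    public String name() { return name; }
    public int ordinal() { return ordinal; }
}

final class ValuesFirstBudget {
    // aload_1(1) + sipush(3) + aaload(1) + putstatic(3), as in the sequence above.
    static final int BYTES_PER_CONSTANT = 1 + 3 + 1 + 3;

    // code_length must be less than 65536.
    static int maxConstants() {
        return 65_535 / BYTES_PER_CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(Shade.LIGHT == Shade.values()[0]);
        System.out.println(maxConstants());
    }
}
```

Note that javac would compile VALUES[0] from source with a getstatic rather than aload_1; the 8-byte count assumes the compiler keeps the array in a local variable while it fills in the fields.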
On 3/28/2020 7:37 PM, Liam Miller-Cushon wrote:
> Please consider this change to allow enums to have ~4100 constants (up
> from the current limit of ~2740), by moving the creation of the values
> array out of the clinit to a helper method.
>
> It's fair to ask whether this is worth doing. Increasing the limit by
> this amount doesn't really change the use-cases enums are suitable
> for: they still can't represent arbitrarily large collections of
> constants, and if you have an enum with >2740 constants you're
> probably going to want >4100 eventually. On the other hand, I don't
> see any obvious drawbacks to the change (in terms of complexity in the
> compiler, performance impact, or complexity of generated code). And
> other options for increasing the limit would require significantly
> more difficult and longer-term changes (e.g. tackling the method and
> constant pool size limits).
>
> Another question is whether this is the best code generation strategy.
> Currently javac saves the values array to a field and returns copies
> of it using clone(). The approach ecj uses skips the field, and just
> re-creates the array every time values() is called. For now I'm
> keeping the values field to minimize the performance impact of this
> change, but the ecj approach would avoid that field and the helper method.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8241798
> webrev: http://cr.openjdk.java.net/~cushon/8241798/webrev.00/