RFR: JDK-8241798: Allow enums to have more constants
Remi Forax
forax at univ-mlv.fr
Mon Mar 30 19:34:34 UTC 2020
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Liam Miller-Cushon" <cushon at google.com>, "compiler-dev"
> <compiler-dev at openjdk.java.net>
> Sent: Monday, March 30, 2020 20:10:32
> Subject: Re: RFR: JDK-8241798: Allow enums to have more constants
> On the more general (and longer term) topic of "how far can we push this", here
> are some constraints:
> - The spec requires that migrating between a class with static fields and an
> enum be binary compatible. This essentially means that every enum constant
> has to be a field.
> - static final fields have to be initialized from <clinit>, so we can't use this
> "outlining" trick to break up this part of <clinit>
Not if JDK-8209964 is implemented, so this is only true in the short term.
> For a trivial enum
> enum Foo { A }
> we get the following classfile:
> Constant pool:
> #1 = Fieldref #4.#29 // Foo.$VALUES:[LFoo;
> #2 = Methodref #30.#31 // "[LFoo;".clone:()Ljava/lang/Object;
> #3 = Class #14 // "[LFoo;"
> #4 = Class #32 // Foo
> #5 = Methodref #10.#33 // java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
> #6 = Methodref #10.#34 // java/lang/Enum."<init>":(Ljava/lang/String;I)V
> #7 = String #11 // A
> #8 = Methodref #4.#34 // Foo."<init>":(Ljava/lang/String;I)V
> #9 = Fieldref #4.#35 // Foo.A:LFoo;
> #10 = Class #36 // java/lang/Enum
> #11 = Utf8 A
> #12 = Utf8 LFoo;
> #13 = Utf8 $VALUES
> #14 = Utf8 [LFoo;
> #15 = Utf8 values
> #16 = Utf8 ()[LFoo;
> #17 = Utf8 Code
> #18 = Utf8 LineNumberTable
> #19 = Utf8 valueOf
> #20 = Utf8 (Ljava/lang/String;)LFoo;
> #21 = Utf8 <init>
> #22 = Utf8 (Ljava/lang/String;I)V
> #23 = Utf8 Signature
> #24 = Utf8 ()V
> #25 = Utf8 <clinit>
> #26 = Utf8 Ljava/lang/Enum<LFoo;>;
> #27 = Utf8 SourceFile
> #28 = Utf8 Foo.java
> #29 = NameAndType #13:#14 // $VALUES:[LFoo;
> #30 = Class #14 // "[LFoo;"
> #31 = NameAndType #37:#38 // clone:()Ljava/lang/Object;
> #32 = Utf8 Foo
> #33 = NameAndType #19:#39 // valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
> #34 = NameAndType #21:#22 // "<init>":(Ljava/lang/String;I)V
> #35 = NameAndType #11:#12 // A:LFoo;
> #36 = Utf8 java/lang/Enum
> #37 = Utf8 clone
> #38 = Utf8 ()Ljava/lang/Object;
> #39 = Utf8 (Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
> public static final Foo A;
> descriptor: LFoo;
> flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL, ACC_ENUM
> At the very least, each constant requires a unique UTF8 for its name (#11 here
> for Foo.A.) And that field has to be initialized, regardless of what we do with
> values():
> 0: new #4 // class Foo
> 3: dup
> 4: ldc #7 // String A
> 6: iconst_0
> 7: invokespecial #8 // Method "<init>":(Ljava/lang/String;I)V
> 10: putstatic #9 // Field A:LFoo;
> To support the putstatic, we need a constant-specific NameAndType (#35) and
> Fieldref (#9). So that's a minimum of 3 CP slots per enum constant, putting a
> ceiling of about 21K if we want to initialize the field via bytecode. (We could
> do it reflectively which would reduce the number of CP slots but would cost way
> more in startup.)
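
A rough sketch of the constant-pool arithmetic in the paragraph above, written as Java. The class and method names are hypothetical, and the fixed overhead of 35 entries is an assumption read off the trivial Foo listing (39 entries, minus the 4 that mention A). That listing also carries a String entry (#7) for the ldc, so the per-constant slot count is left as a parameter rather than fixed at 3.

```java
// Hypothetical helper, not part of javac: estimates how many enum constants
// fit in one constant pool under the assumptions stated in the lead-in.
final class ConstantPoolEstimate {
    // constant_pool_count is a u2 and index 0 is unused, so at most 65534 entries.
    static final int USABLE_ENTRIES = 65534;

    // fixedOverhead: entries the class needs regardless of the number of constants.
    // slotsPerConstant: entries added for each enum constant.
    static int maxConstants(int fixedOverhead, int slotsPerConstant) {
        return (USABLE_ENTRIES - fixedOverhead) / slotsPerConstant;
    }

    public static void main(String[] args) {
        System.out.println(maxConstants(0, 3));   // ideal case, no overhead
        System.out.println(maxConstants(35, 3));  // overhead assumed from the Foo listing
        System.out.println(maxConstants(35, 4));  // also counting the String entry for ldc
    }
}
```

With 3 slots per constant the estimate lands a little under 22K, in line with the "about 21K" figure; counting a fourth slot per constant lowers it to roughly 16K.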
> The other limit we are bumping into is the size limit of <clinit>. Liam's patch
> moves the values() array creation out of <clinit>, leaving only the constant
> initialization; with ~13 bytes per field, the 64K method size limit gives me
> about 5K enum constants. (Liam, what's the constraint that brings you down to
> 4100?) We can't "outline" the initialization of the field because of the
> constraint that final static fields be set in <clinit>.
> However, we can probably squeeze a little more out by another refactoring, which
> would be to initialize the values array _first_ and then initialize the fields
> from that:
> aload_1 // push array on the stack
> sipush #n // push index on the stack
> aaload // fetch element
> putstatic Foo.A
> This is 8 bytes, instead of 13, which means we can get more like 8K enum
> constants with these constraints.
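
The byte counts above can be checked with a small sketch. The class and method names are hypothetical; the instruction sizes are the standard JVM encodings. The 16-byte "wide" variant is an assumption of mine: it supposes that for most constants of a large enum the name needs ldc_w (constant pool index above 255) and the ordinal needs sipush (value above 127). That would bring the count close to 4100, but the thread does not confirm that this is the constraint Liam hit.

```java
// Hypothetical helper, not part of javac: divides the maximum method size by
// the bytes needed to initialize one enum constant.
final class ClinitSizeEstimate {
    // The code_length of a method must be less than 65536.
    static final int MAX_CODE_BYTES = 65535;

    // new(3) + dup(1) + ldc(2) + iconst_0(1) + invokespecial(3) + putstatic(3)
    static final int DIRECT_SMALL = 3 + 1 + 2 + 1 + 3 + 3;
    // new(3) + dup(1) + ldc_w(3) + sipush(3) + invokespecial(3) + putstatic(3)
    static final int DIRECT_WIDE = 3 + 1 + 3 + 3 + 3 + 3;
    // aload_1(1) + sipush(3) + aaload(1) + putstatic(3)
    static final int FROM_ARRAY = 1 + 3 + 1 + 3;

    // Ignores everything else in the method (array setup, the final return).
    static int constantsThatFit(int bytesPerConstant) {
        return MAX_CODE_BYTES / bytesPerConstant;
    }

    public static void main(String[] args) {
        System.out.println(DIRECT_SMALL + " bytes: " + constantsThatFit(DIRECT_SMALL));
        System.out.println(DIRECT_WIDE + " bytes: " + constantsThatFit(DIRECT_WIDE));
        System.out.println(FROM_ARRAY + " bytes: " + constantsThatFit(FROM_ARRAY));
    }
}
```

These are upper bounds only, since the rest of the method body also counts against the limit.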
Yes.
Currently, doing an abstract execution of the <clinit> of an enum is quite simple.
I fear that the transformation you are proposing makes it harder for an AOT compiler to find the value of a constant without executing the <clinit> at compile time.
So it's a more disruptive change.
Rémi
> On 3/28/2020 7:37 PM, Liam Miller-Cushon wrote:
>> Please consider this change to allow enums to have ~4100 constants (up from the
>> current limit of ~2740), by moving the creation of the values array out of the
>> clinit to a helper method.
>> It's fair to ask whether this is worth doing. Increasing the limit by this
>> amount doesn't really change the use-cases enums are suitable for: they still
>> can't represent arbitrarily large collections of constants, and if you have an
>> enum with >2740 constants you're probably going to want >4100 eventually. On
>> the other hand, I don't see any obvious drawbacks to the change (in terms of
>> complexity in the compiler, performance impact, or complexity of generated
>> code). And other options for increasing the limit would require significantly
>> more difficult and longer-term changes (e.g. tackling the method and constant
>> pool size limits).
>> Another question is whether this is the best code generation strategy. Currently
>> javac saves the values array to a field and returns copies of it using clone().
>> The approach ecj uses skips the field, and just re-creates the array every time
>> values() is called. For now I'm keeping the values field to minimize the
>> performance impact of this change, but the ecj approach would avoid that field
>> and the helper method.
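
The two values() strategies described above can be illustrated with a hand-written class standing in for an enum. This is only a sketch: Shade, makeValues, valuesCached and valuesFresh are hypothetical names, and the code shows the shape of each approach in source form rather than what javac or ecj actually emit.

```java
// Hand-written stand-in for an enum; all names here are hypothetical.
final class Shade {
    static final Shade LIGHT = new Shade("LIGHT", 0);
    static final Shade DARK = new Shade("DARK", 1);

    // Cached-array approach: the array is built once, in a helper method
    // rather than inline in the static initializer.
    private static final Shade[] VALUES = makeValues();

    final String name;
    final int ordinal;

    private Shade(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    private static Shade[] makeValues() {
        return new Shade[] { LIGHT, DARK };
    }

    // Cached-array approach: each call returns a defensive copy of the field.
    static Shade[] valuesCached() {
        return VALUES.clone();
    }

    // Field-free approach: the array is rebuilt on every call.
    static Shade[] valuesFresh() {
        return new Shade[] { LIGHT, DARK };
    }

    public static void main(String[] args) {
        Shade[] copy = valuesCached();
        copy[0] = null; // mutating a returned array must not affect later calls
        System.out.println(valuesCached()[0].name);
        System.out.println(valuesFresh()[1].name);
    }
}
```

Both variants hand the caller a fresh array each time; the difference is whether the class keeps a cached array (and a helper to build it) or pays for reconstruction on every call.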
>> bug: https://bugs.openjdk.java.net/browse/JDK-8241798
>> webrev: http://cr.openjdk.java.net/~cushon/8241798/webrev.00/