condy and enums

John Rose john.r.rose at oracle.com
Fri Oct 13 22:59:08 UTC 2023


On 13 Oct 2023, at 12:55, Liam Miller-Cushon wrote:

> Thanks for the comments, that is a very neat idea!
>
> I was labouring under a misunderstanding of what the verifier constraints
> are. My understanding is that at the Java language level, each static final
> has to be definitely assigned at the end of the static initializer (JLS
> 8.3.1.2), but the VM has fewer restrictions. Is the only constraint that
> putstatic is only allowed to write to static final fields inside clinit,
> and there is no requirement that static finals are definitely assigned in
> the clinit, so using Unsafe to initialize the fields is fair game?

Yes, that is the case.

> I have a rough draft of your approach here:
> https://github.com/openjdk/jdk/pull/16191
>
> * It uses reflection and Unsafe.staticFieldOffset for the JVM method.
>
> * I took a short-cut and disabled generation of values() array. It would be
> possible to split the values() array across multiple methods, but that
> would still require additional constant pool entries, and as you described
> the constant pool size ends up being the limiting factor. I think the
> values() array could still be a good use of indy or condy.

Yes, condy would work.  The values method would then be generated like this:

```
ldc Condy{makeValuesArrayBuilder} : MethodHandle
invokevirtual {MethodHandle::invoke()ThisEnum[]}
areturn
```

Or, a java.lang.ClassValue could be used, and the Enum::values method,
defined just once, would root around in reflective stuff once to set up
the CV.  That’s a moral equivalent of condy, but “at a distance”.

Or, indy could be used quite directly:

```
invokedynamic  {makeValuesArrayBuilder} : ()ThisEnum[]
areturn
```

I do NOT recommend binding a condy constant to an array.

The reason is that such an array is not in a position to be optimized
by the JIT.  Arrays are by default mutable, and they are treated
as constant only in certain “stable” places such as in the body
of a list created by `List::of`.  I would accumulate and store
the original source of all clones of the values arrays in a list,
not an array, and make sure the list is immutable (marked stable
in its array state).
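
As a rough sketch of that shape (not the javac or JDK implementation): a
`ClassValue` holds one immutable list of constants per enum class, and each
call hands out a fresh array copied from that list. The names
`ValuesListSketch`, `Color`, `CONSTANTS` and `values(Class)` are hypothetical,
and `List.copyOf` is assumed to give the same kind of unmodifiable list as
`List::of`.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ValuesListSketch {
    enum Color { RED, GREEN, BLUE }   // hypothetical example enum

    // One immutable list per enum class, computed reflectively on first use.
    private static final ClassValue<List<Object>> CONSTANTS =
            new ClassValue<List<Object>>() {
        @Override
        protected List<Object> computeValue(Class<?> type) {
            List<Enum<?>> found = new ArrayList<>();
            try {
                for (Field f : type.getDeclaredFields()) {
                    if (f.isEnumConstant()) found.add((Enum<?>) f.get(null));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            // Index by ordinal() rather than trusting the reflective field order.
            Object[] byOrdinal = new Object[found.size()];
            for (Enum<?> e : found) byOrdinal[e.ordinal()] = e;
            return List.copyOf(Arrays.asList(byOrdinal));
        }
    };

    // Hypothetical stand-in for a shared values() helper: always a fresh array.
    @SuppressWarnings("unchecked")
    static <E extends Enum<E>> E[] values(Class<E> enumClass) {
        List<Object> master = CONSTANTS.get(enumClass);
        E[] fresh = (E[]) Array.newInstance(enumClass, master.size());
        return master.toArray(fresh);
    }
}
```

Callers may scribble on the returned array without affecting the master
list, which is the property the real `values()` has to preserve.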

>
> With that prototype, I was able to generate an enum with ~32K constants.
> Each enum name currently uses two constant pool entries (a Utf8 and String
> entry for the name). Perhaps there's a way to get that down to one entry,
> and get to ~65k enums.

Yes, there is.  Use reflection to infer the strings.  Don’t access them
from the CP via ldc (or as BSM arguments).
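
A minimal sketch of what inferring the strings could look like with core
reflection: the name comes from the `Field` object, so the code needs no
string constant per enum member. `ReflectedNamesSketch`, `Suit` and
`constantNames` are hypothetical names for illustration only.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ReflectedNamesSketch {
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }   // hypothetical example enum

    // Collect the names of the enum-constant fields; no per-name ldc is needed.
    static List<String> constantNames(Class<? extends Enum<?>> enumClass) {
        List<String> names = new ArrayList<>();
        for (Field f : enumClass.getDeclaredFields()) {
            if (f.isEnumConstant()) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```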

> I collected some preliminary performance data: with 4000 constants, cold
> start time is ~400ms with the new codegen and ~700ms without. With ~32K
> constants, startup takes a full ~17s. From some initial investigation, most
> of that time is spent in Class.getDeclaredFields and
> Unsafe.staticFieldOffset.

The problem with core reflection is that it is kind of bulky and slow.
I was going to suggest short-circuiting through an internal API that
uses MemberName but I see there is not one present.  Funny, I thought
I wrote it for JSR 292, but maybe it was GC-ed somehow.

So you may need a sequence of strings to drive the reflective queries,
if you want to avoid all the reflective overheads.  You could derive
a sequence of strings from a single string (or a small number of
max-length strings concatenated) that contains all the field names
in sequence.  That’s O(1) CP entries.  Then you could call the
MemberName query API for single-field queries, mimicking the way it
is used in the Lookup API to build field access method handles.
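
To illustrate the idea with public APIs only, here is a sketch that packs
the field names into one string and resolves each field with a single-field
`Lookup.findStaticGetter` query. This is a stand-in for the internal
MemberName route, not that route itself; it reads fields that are already
initialized rather than initializing them. `PackedNamesSketch`, `Planet`,
`PACKED_NAMES` and the comma separator are assumptions made for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.ArrayList;
import java.util.List;

class PackedNamesSketch {
    enum Planet { MERCURY, VENUS, EARTH }   // hypothetical example enum

    // All field names in a single string constant (hypothetical packing).
    static final String PACKED_NAMES = "MERCURY,VENUS,EARTH";

    // One single-field lookup per name; no bulk getDeclaredFields() call.
    static List<Object> resolveAll(MethodHandles.Lookup lookup,
                                   Class<?> enumClass, String packed) {
        List<Object> out = new ArrayList<>();
        for (String name : packed.split(",")) {
            try {
                MethodHandle getter =
                        lookup.findStaticGetter(enumClass, name, enumClass);
                out.add(getter.invoke());
            } catch (Throwable t) {
                throw new IllegalStateException("cannot resolve " + name, t);
            }
        }
        return List.copyOf(out);
    }

    static List<Object> demo() {
        return resolveAll(MethodHandles.lookup(), Planet.class, PACKED_NAMES);
    }
}
```

A single Utf8 constant is capped at 65535 bytes, which is why a large enum
would need a small number of such strings rather than exactly one.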

I’m going into these very internal details just to show what would
be required, as a JDK engineer, to build a properly performant
service method that could iterate the fields of an enum and
initialize them properly.

For prototypes, core reflect is just fine.  But it will not be as
fast as using the off-label internal APIs such as Unsafe and
MemberName.

> This has been fun and educational. It also adds complexity to javac, and
> may not be a clear improvement for non-pathological enums that don't have
> many thousands of entries.
>
> Do you think it could be worth pursuing?

Actually, no, not until there is a compelling use case for more than a few
thousand enum members (but, of course, fewer than 65K or so).

Like you, I found this fun and educational.  It’s a thought exercise
in using metaprogramming (method handles and lookups) instead of bytecode
churning.

There are better applications of metaprogramming, such as string concat,
lambda generation, and (I presume) pattern-switch code generation.

— John

> On Wed, Oct 11, 2023 at 6:08 PM John Rose <john.r.rose at oracle.com> wrote:
>
>> P.S. I trust it is clear how the single service method below would be used
>> in the <clinit> of each client enum. One other thing occurred to me:
>> Enums which have bootstrap entanglements with the MethodHandle class would
>> need special treatment. For that reason, it might be better to not pass the
>> argument MethodHandle enumMemberCreator but rather just call a private
>> static method of a fixed name and signature, within the same enum. That can
>> be easily done with method handles, if those are “on line”, but can also be
>> done with core reflection, which boots up sooner.
>>
>> On 11 Oct 2023, at 18:01, John Rose wrote:
>>
>> ```
>> public static void initializeEnumClass(Lookup enumClassLU,
>>                                        MethodHandle enumMemberCreator)
>>     throws Throwable {
>>   int ordinal = 0;
>>   if (!enumClassLU.hasPrivateAccess())
>>     throw new IllegalAccessException("private access required");
>>   Class<? extends Enum> ec =
>>       enumClassLU.lookupClass().asSubclass(Enum.class);
>>   for (Field f : ec.getDeclaredFields()) { // order significant here
>>     if (f.isEnumConstant()) {
>>       // invokeExact: the handle must have the exact type (Field,int)Object
>>       Object e = enumMemberCreator.invokeExact(f, ordinal++);
>>       // next stuff can be done more directly by Unsafe
>>       // (Field.set rejects static final fields, even after setAccessible)
>>       assert(f.get(null) == null); // caller resp.
>>       f.setAccessible(true);
>>       f.set(null, e);
>>     }
>>   }
>> }
>> ```
>>
>> The creation of the values array should be done in `<clinit>`, as well, or
>> as a condy (yes, that’s a good usage of condy!) and cloned as a fresh copy
>> for each call to `values()`. And it can be done reflectively as well. Just
>> iterate over all the fields and store them into the array. (Use the
>> `ordinal()` as the array index, or just assert that the fields are in the
>> correct order already.)
>>
>> With those two adjustments, to bind enums and build the values array
>> reflectively, your enum would be limited only by the maximum size of the
>> constant pool. That is, you could have up to about 65k enums (but not the
>> whole 2^16).
>>
>>