EnumData space optimization in j.l.Class (JEP-146)

Peter Levart peter.levart at gmail.com
Tue Dec 18 09:18:47 UTC 2012


On 12/17/2012 11:39 PM, Remi Forax wrote:
> On 12/17/2012 11:14 PM, Peter Levart wrote:
>> On 12/17/2012 10:26 PM, Mandy Chung wrote:
>>> On 12/17/12 7:36 AM, Peter Levart wrote:
>>>> Hi David and others,
>>>>
>>>> Here's a patch that eliminates one of two fields in 
>>>> java.lang.Class, related to caching enum constants:
>>>>
>>>> http://dl.dropbox.com/u/101777488/jdk8-tl/JEP-149.enum/webrev.01/index.html 
>>>>
>>>>
>>>> It does it by moving one field to a subclass of HashMap, which is 
>>>> referenced by a remaining field that serves two different 
>>>> purposes/stages of caching.
>>>>
>>>
>>> Your observation of merging the enumConstants and 
>>> enumConstantDirectory is a good one.   I see that caching of 
>>> enumConstantDirectory is important as it's used by EnumMap and 
>>> EnumSet whose performance is critical (specified with constant time 
>>> operations).  I'm unsure whether Class.getEnumConstants is 
>>> performance critical and worth the complexity of your proposed fix 
>>> (the enumData field of two types).  If a class has cached an 
>>> enumConstantDirectory, Class.getEnumConstants can return a clone of 
>>> its values().
>>>
>>> Does anyone know how Class.getEnumConstants is commonly used and 
>>> whether it needs to be performant?  I suspect it's more typical to 
>>> obtain the list of enum constants statically (calling 
>>> Enum.values()) than reflectively.
>> Hi Mandy,
>>
>> public Class.getEnumConstants() is a reflection mirror of 
>> SomeEnum.values(). It returns a defensive copy of the constants 
>> array. The primary place for Enum constants is in a private static 
>> final $VALUES field, generated by the compiler in each Enum subclass. 
>> But that, I think, is not part of the specification, so for internal 
>> usage (as far as I have managed to find out, only in the constructors 
>> of EnumSet and EnumMap) the package-private 
>> Class.getEnumConstantsShared() is used, which obtains a copy of the 
>> array by calling SomeEnum.values() and then caches it.
>>
>> The Class.enumConstantDirectory() on the other hand is an internal 
>> package-private method that returns a shared/cached Map<String, T>, 
>> which is used internally to implement SomeEnum.valueOf(String) and 
>> Enum.valueOf(Class, String) static methods.
>>
>> Both package-private methods must be fast.
>>
>> Regards, Peter
>
> for what it's worth, I'm the guy behind the patch for bug 6276988 (it 
> was before OpenJDK was set up, BTW),
>   http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6276988
> and as a bit of background, I needed that patch because I was 
> developing an Eclipse plugin that uses EnumSet to represent the 
> possible completion values.
> So to answer Mandy, this application needs really fast EnumSet 
> creation, and thus a really fast getEnumConstantsShared(), because the 
> EnumSets were created as the user types code.
Hi Rémi,

is 600M EnumSets / sec good enough for a fast typist?

>
> Also, Peter, in your getEnumConstantsShared(), while the first 
> instanceof is a cheap one, the second is not.
> I think I prefer either the status quo or to group all exotic fields 
> in a specific object and pay the indirection to that object but not 
> the instanceof checks.
This patch is on hold now, so no worries ;-) But I'd like to discuss 
its performance aspect further, because it's interesting. The same 
two-stage-caching approach could be used to squeeze the 3 fields that 
are left to cache annotations into one.

I tried a variant where the second check was "else if (enumData != 
null)" instead of "else if (enumData instanceof Enum[])", because that's 
the invariant. To my surprise, the micro-benchmark showed no 
difference. I also tried re-ordering the ifs with different variations 
of instanceof vs. != null checks. The presented variant seems to be 
the best (with the same performance as the variant where the second 
check is != null), and I have chosen it because it is easier to 
understand.
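
A minimal sketch of the single-field, two-stage idea discussed above, 
assuming the field holds either null, the shared constants array, or a 
HashMap subclass that also carries that array. This is not the code from 
the webrev: the class EnumCacheSketch, its methods constantsShared() and 
directory(), and the nested enum Digit are hypothetical names used only 
for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the enum-related caching in java.lang.Class. */
final class EnumCacheSketch<T extends Enum<T>> {

    /** Small enum used only by main() below. */
    enum Digit { ONE, TWO, THREE }

    /** Stage two: a name -> constant map that also carries the shared array. */
    static final class EnumData<E extends Enum<E>> extends HashMap<String, E> {
        final E[] constants;

        EnumData(E[] constants) {
            super(2 * constants.length);
            this.constants = constants;
            for (E c : constants) {
                put(c.name(), c);
            }
        }
    }

    private final Class<T> type;

    // Either null, a T[] (stage one) or an EnumData<T> (stage two).
    private volatile Object enumData;

    EnumCacheSketch(Class<T> type) {
        this.type = type;
    }

    @SuppressWarnings("unchecked")
    T[] constantsShared() {
        Object data = enumData;                  // single volatile read
        if (data instanceof EnumData) {
            return ((EnumData<T>) data).constants;
        } else if (data instanceof Enum[]) {     // "data != null" would also do
            return (T[]) data;
        }
        T[] constants = type.getEnumConstants(); // fresh copy of the constants
        enumData = constants;                    // racy but idempotent publish
        return constants;
    }

    @SuppressWarnings("unchecked")
    Map<String, T> directory() {
        Object data = enumData;
        if (data instanceof EnumData) {
            return (EnumData<T>) data;
        }
        // Upgrade from stage one (or from nothing) to stage two.
        EnumData<T> map = new EnumData<>(constantsShared());
        enumData = map;
        return map;
    }

    public static void main(String[] args) {
        EnumCacheSketch<Digit> cache = new EnumCacheSketch<>(Digit.class);
        Digit[] first = cache.constantsShared();
        Map<String, Digit> dir = cache.directory();
        System.out.println(dir.get("TWO") + " " + (first == cache.constantsShared()));
    }
}
```

In this sketch a concurrent caller can overwrite the map with the plain 
array again; that only costs a later re-computation, since both values 
are derived from the same constants.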

The surprising part is also the comparison of results in the first line 
of EnumSet.noneOf(Class) measurements that shows a little performance 
increase (10%) compared to the original code. The performance increase 
might be due to the fact that the original code executes two volatile 
reads of the same field (this can be improved) whereas the patch only 
does one volatile read. But the results also show that there's no decrease in 
speed if there's no indirection in spite of the fact that the code does 
at least one instanceof check no matter how the compiler manages to 
reorder it.
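
The remark about volatile reads can be illustrated with a small, 
hypothetical example. The class VolatileReadSketch and its methods are 
made up for this purpose and are not the JDK code; they only contrast a 
getter that touches the volatile field twice on the fast path with one 
that copies it into a local variable first.

```java
/** Hypothetical illustration of one vs. two volatile reads on the fast path. */
final class VolatileReadSketch {

    private volatile String[] cached;

    private static String[] compute() {
        return new String[] { "ONE", "TWO" };
    }

    /** The field is read in the null check and again in the return statement. */
    String[] twoReads() {
        if (cached == null) {      // volatile read #1
            cached = compute();
        }
        return cached;             // volatile read #2
    }

    /** The field is read once into a local; the local is used afterwards. */
    String[] oneRead() {
        String[] local = cached;   // the only volatile read on the fast path
        if (local == null) {
            cached = local = compute();
        }
        return local;
    }

    public static void main(String[] args) {
        VolatileReadSketch s = new VolatileReadSketch();
        System.out.println(s.oneRead() == s.twoReads());
    }
}
```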

I thought it might be fair to compare the performance in interpreted 
mode too. Perhaps the instanceof checks are not so fast in 
interpreted mode. So here it is:

*** Original JDK8 code

Executing: /home/peter/Apps64/jdk1.8.0-jdk8-tl/bin/java -Xmx4G -Xint -cp 
../out/production/test test.EnumTest reference

       EnumSet.noneOf(Class): 202088722 201933106 202181088 201664653 
201261067 201305820 202299656 201439518
      MyEnum.valueOf(String): 184241499 179307134 179289265 179092376 
179079423 178963942 179005498 178971665
       EnumSet.noneOf(Class): 201450845 201488075 201247787 201696777 
201546327 201431311 203371175 201693493


*** Patched code

Executing: /home/peter/Apps64/jdk1.8.0-jdk8-tl/bin/java -Xmx4G -Xint -cp 
../out/production/test -Xbootclasspath/p:../out/production/jdk test.EnumTest

       EnumSet.noneOf(Class): 211831223 210830127 211460330 210753843 
211091717 211055566 211582679 208887087
      MyEnum.valueOf(String): 191420289 186842232 186712513 186791632 
186765074 186766574 187003334 186756508
       EnumSet.noneOf(Class): 201084554 200767426 200900559 201425971 
201106758 201031393 201839016 201115387


I had to divide the number of loops by 1000. The performance aspects of 
the patch are clearly "lost in the sea of interpreter cycles"...
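
For readers who want to try something similar, here is a rough sketch of 
a timing loop with a configurable loop count. It is not the EnumTest 
source linked in the quoted message: the class EnumBenchSketch, its 
methods and its output format are assumptions, and the figures are 
assumed to be elapsed nanoseconds per run.

```java
import java.util.EnumSet;

/** Hypothetical timing harness; loop count is taken from the command line. */
final class EnumBenchSketch {

    enum MyEnum { ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN }

    /** Names of all constants, in declaration order. */
    static String[] names() {
        MyEnum[] values = MyEnum.values();
        String[] names = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            names[i] = values[i].name();
        }
        return names;
    }

    /** Elapsed nanoseconds for 'loops' calls of EnumSet.noneOf(Class). */
    static long timeNoneOf(int loops) {
        long sink = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            sink += EnumSet.noneOf(MyEnum.class).size();
        }
        long elapsed = System.nanoTime() - t0;
        if (sink != 0) throw new AssertionError("empty sets expected");
        return elapsed;
    }

    /** Elapsed nanoseconds for 'loops' rounds of valueOf over all names. */
    static long timeValueOf(int loops, String[] names) {
        long sink = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            for (String name : names) {
                sink += MyEnum.valueOf(name).ordinal();
            }
        }
        long elapsed = System.nanoTime() - t0;
        if (sink < 0) throw new AssertionError("keeps 'sink' alive");
        return elapsed;
    }

    public static void main(String[] args) {
        // Use a small default; pass a larger count for compiled-mode runs.
        int loops = args.length > 0 ? Integer.parseInt(args[0]) : 300_000;
        String[] names = names();
        StringBuilder noneOf = new StringBuilder(" EnumSet.noneOf(Class):");
        StringBuilder valueOf = new StringBuilder("MyEnum.valueOf(String):");
        for (int run = 0; run < 8; run++) {
            noneOf.append(' ').append(timeNoneOf(loops));
            valueOf.append(' ').append(timeValueOf(loops / 10, names));
        }
        System.out.println(noneOf);
        System.out.println(valueOf);
    }
}
```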

I think that grouping all exotic fields for different caching aspects 
into a specific object is not such a good idea, because it increases the 
chances that this object will be allocated, but not entirely populated. 
Tricks like the one presented do not increase the space used even in 
the worst-case scenarios.
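
To make the trade-off concrete, a holder-object variant might look 
roughly like the sketch below. The names HolderSketch and RareData and 
the choice of fields are hypothetical; the point is only that touching 
one cache allocates the holder with room for all of them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical "group the exotic fields" alternative. */
final class HolderSketch {

    /** All rarely used caches behind a single reference. */
    static final class RareData {
        volatile Object[] enumConstants;
        volatile Map<String, Object> enumConstantDirectory;
        volatile Map<Class<?>, Object> annotations;
    }

    private volatile RareData rare;

    /** Lazily allocates the holder; every access pays one extra indirection. */
    RareData rare() {
        RareData r = rare;
        if (r == null) {
            rare = r = new RareData();   // racy but harmless in this sketch
        }
        return r;
    }

    public static void main(String[] args) {
        HolderSketch holder = new HolderSketch();
        // Only the annotation cache is used...
        holder.rare().annotations = new HashMap<>();
        // ...yet the slots for the enum caches are allocated and stay null.
        System.out.println(holder.rare().enumConstants == null);
        System.out.println(holder.rare().enumConstantDirectory == null);
    }
}
```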

Regards, Peter

>
> cheers,
> Rémi
>
>>
>>>
>>> Mandy
>>>
>>>> These are the results of a micro-benchmark that exercises public 
>>>> API that uses the internal j.l.Class API regarding enum constants:
>>>>
>>>> enum MyEnum { ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, 
>>>> TEN }
>>>> EnumSet.noneOf(MyEnum.class): 300_000_000 loops
>>>> MyEnum.valueOf(String): 30_000_000 loops * 10 calls for different 
>>>> names
>>>>
>>>> ** Original JDK8 code
>>>>
>>>> Executing: /home/peter/Apps64/jdk1.8.0-jdk8-tl/bin/java -Xmx4G -cp 
>>>> ../out/production/test test.EnumTest reference
>>>>
>>>>       EnumSet.noneOf(Class): 351610312 340302968 339893333 
>>>> 339774384 339750612 339558414 339547022 339621595
>>>>      MyEnum.valueOf(String): 935153830 897188742 887541353 
>>>> 960839820 886119463 885818334 885827093 885752461
>>>>       EnumSet.noneOf(Class): 339552678 339469528 339513757 
>>>> 339451341 339512154 339511634 339664326 339793144
>>>>
>>>> ** patched java.lang.Class
>>>>
>>>> Executing: /home/peter/Apps64/jdk1.8.0-jdk8-tl/bin/java -Xmx4G -cp 
>>>> ../out/production/test -Xbootclasspath/p:../out/production/jdk 
>>>> test.EnumTest
>>>>
>>>>       EnumSet.noneOf(Class): 351724931 339286591 305082929 
>>>> 305042885 305058303 305044144 305073463 305049604
>>>>      MyEnum.valueOf(String): 955032718 908534137 891406394 
>>>> 891506147 891414312 893652469 891412757 891409294
>>>>       EnumSet.noneOf(Class): 414044087 406904161 406788898 
>>>> 406839824 406765274 406815728 407002576 406779162
>>>>
>>>> The slow-down of about 20% (last line) is presumably a consequence 
>>>> of an additional indirection to obtain the shared enum constants 
>>>> array when there is already a Map in place. It is still fast though 
>>>> (300M EnumSet instances / 0.4 s).
>>>>
>>>> Here's the source of the micro-benchmark:
>>>>
>>>> https://raw.github.com/plevart/jdk8-tl/JEP-149.enum/test/src/test/EnumTest.java 
>>>>
>>>>
>>>> I don't know what's more important in this case: a small space 
>>>> gain (8 or 4 bytes per j.l.Class instance) or a small performance 
>>>> gain (20%).
>>>>
>>>> Regards, Peter
>>>>
>>>
>>
>



