[13] RFR (M): 8223213: Implement fast class initialization checks on x86-64

David Holmes david.holmes at oracle.com
Tue May 21 22:34:59 UTC 2019


Hi Vladimir,

On 22/05/2019 3:33 am, Vladimir Ivanov wrote:
> Thanks for the feedback, David!
> 
> Updated webrev:
>    http://cr.openjdk.java.net/~vlivanov/8223213/webrev.01/
> 
> Some responses inline:
>> Forgot to mention that your new test doesn't look like it will play 
>> nicely when run with Graal enabled, so you may need to split up into 
>> different @test sections and add "@requires !vm.graal.enabled" to 
>> exclude graal.
> 
> What kind of problem when running with Graal do you have in mind?
> 
> I double-checked that the test passes with Graal enabled.

Your various @run lines are trying to execute in different compilation 
modes: -Xint, C1 only, C2 only, plus various permutations. If you take 
those command-lines and then add all the Graal flags to it then you are 
no longer testing any of the things you wanted to test. At best you test 
Graal 7 times - which is pointless. At worst you may find some 
variations will timeout when Graal is applied.

For that matter an Xcomp run will also negate your intentions.

Cheers,
David
-----

>>> I'll be very happy to see this go in - though I do wish we had more 
>>> platform coverage than just x86_64. Hopefully the other archs will 
>>> jump on-board with this as well.
> 
> Yes, fully agree with you. It should be pretty straightforward for 
> maintainers to mirror x86-specific changes for their architectures.
> 
>>> I was initially confused by the UseFastClassInitChecks flag as I 
>>> couldn't really see why you would want to turn it off (other than 
>>> perhaps during testing) but I see that it is really used (as you 
>>> explained to Vladimir K.) to exclude the new code for platforms which 
>>> have not implemented it. Though I'm still not sure that we shouldn't 
>>> have something to detect it being turned on at runtime on platforms 
>>> that don't support it (it will likely crash quickly but still ...). 
>>> Keep wondering if there is a better way to handle this aspect of the 
>>> change ...
> 
> I deliberately made the flag develop, so it's possible to change it from 
> command-line only in debug builds. I could introduce additional 
> platform-specific validation, but it doesn't look worth the effort for 
> such narrow case (and there are other develop flags which guard broken 
> functionality).
> 
>>> I can't comment on the actual interpreter and compiler changes - sorry.
> 
> No problem, I'll wait for more reviews from Runtime team.
> 
>>> This will need re-basing now that JDK-8219974 has been backed out.
> 
> Done.
> 
> Best regards,
> Vladimir Ivanov
> 
>>> On 2/05/2019 9:17 am, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>>
>>>> (It's a followup RFR on a earlier RFC [1].)
>>>>
>>>> Recent changes severely affected how static initializers are 
>>>> executed and for long-running initializers it manifested as a severe 
>>>> slowdown.
>>>> As an example, it led to a 3x slowdown on some Clojure applications
>>>> (JDK-8219233 [2]). The root cause is that until a class is fully 
>>>> initialized, every invocation of static method on it goes through 
>>>> method resolution.
>>>>
>>>> Proposed fix introduces fast class initialization barriers for C1, 
>>>> C2, and template interpreter on x86-64. I did some experiments with 
>>>> cross-platform approaches, but haven't got satisfactory results.
>>>>
>>>> On other platforms, behavior stays (mostly) intact. (I had to revert 
>>>> some changes introduced by JDK-8219492 [3], since the assumptions 
>>>> they rely on about accesses inside a class don't hold in all cases.)
>>>>
>>>> The barrier is as simple as:
>>>>     if (holder->is_not_initialized() &&
>>>>         !holder->is_reentrant_initialization(current_thread)) {
>>>>       // trigger call site re-resolution and block there
>>>>     }
>>>>
>>>> There are 3 places where barriers are added:
>>>>    * in template interpreter for invokestatic bytecode;
>>>>    * at nmethod verified entry point (for normal compilations);
>>>>    * c2i adapters;
>>>>
>>>> For template interperter, there's additional check added into 
>>>> TemplateTable::resolve_cache_and_index which calls into 
>>>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>>
>>>> In case of nmethods, the barrier is put before frame construction, 
>>>> so existing compiler runtime routines can be reused 
>>>> (SharedRuntime::get_handle_wrong_method_stub()).
>>>>
>>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers 
>>>> nmethod recompilation once the class is fully initialized.
>>>>
>>>> OSR compilations don't need a barrier.
>>>>
>>>> Correspondence between barriers and transitions they cover:
>>>>    (1) from interpreter (barrier on caller side)
>>>>         * all transitions: interpreter, compiled (i2c), native, aot, 
>>>> ...
>>>>
>>>>    (2) from compiled (barrier on callee side)
>>>>         to compiled, to native (barrier in native wrapper on entry)
>>>>
>>>>    (3) c2i bypasses both barriers (interpreter and compiled) and 
>>>> requires a dedicated barrier in c2i
>>>>
>>>>    (4) to Graal/AOT code:
>>>>          from interpreter: covered by interpreter barrier
>>>>          from compiled: call site patching is disabled, leading to 
>>>> repeated call site resolution until method holder is fully 
>>>> initialized (original behavior).
>>>>
>>>> Performance experiments with clojure [2] demonstrated that the fix 
>>>> almost completely recuperates the regression:
>>>>
>>>>    (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>>>>    (2) C1/C2 barriers only:                ~3,8s (~3x)
>>>>    (3) int/C1/C2 barriers:                 ~3,2s (-20%)
>>>> --------
>>>>    (4) barriers disabled for invokestatic  ~3,2s
>>>>
>>>> I deliberately tried to keep the patch backport-friendly for 
>>>> 8u/11u/12u and refrained from using newer features like nmethod 
>>>> barriers introduced recently. The fix can be refactored later 
>>>> specifically for 13 as a followup change.
>>>>
>>>> Testing: clojure startup, tier1-5
>>>>
>>>> Thanks!
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1] 
>>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html 
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492


More information about the hotspot-compiler-dev mailing list