[13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
David Holmes
david.holmes at oracle.com
Tue May 14 12:17:51 UTC 2019
Forgot to mention that your new test doesn't look like it will play
nicely when run with Graal enabled, so you may need to split it up into
different @test sections and add "@requires !vm.graal.enabled" to
exclude Graal.
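
Roughly, the split could look like the sketch below. The class name, the
summaries, and the flags on each @run line are placeholders of mine and are
not taken from the webrev; the only line that matters is the
"@requires !vm.graal.enabled" one. jtreg accepts several @test comment
blocks in a single file and runs each as its own test.

```java
/*
 * @test
 * @summary Placeholder: compiler-specific runs, skipped when Graal is enabled
 * @requires !vm.graal.enabled
 * @run main/othervm -XX:TieredStopAtLevel=1 ClinitSketch
 * @run main/othervm -XX:-TieredCompilation ClinitSketch
 */

/*
 * @test
 * @summary Placeholder: interpreter-only run, not excluded under Graal
 * @run main/othervm -Xint ClinitSketch
 */

// Declared without 'public' only so this sketch compiles under any file name;
// a real jtreg test would normally be a public class in ClinitSketch.java.
class ClinitSketch {
    static boolean initialized;

    static {
        // Runs once, before the first static method call on this class returns.
        initialized = true;
    }

    static boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        if (!isInitialized()) {
            throw new AssertionError("static initializer did not run");
        }
    }
}
```
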
David
On 14/05/2019 7:44 pm, David Holmes wrote:
> Hi Vladimir,
>
> I'll be very happy to see this go in - though I do wish we had more
> platform coverage than just x86_64. Hopefully the other archs will jump
> on board with this as well.
>
> I was initially confused by the UseFastClassInitChecks flag as I
> couldn't really see why you would want to turn it off (other than
> perhaps during testing) but I see that it is really used (as you
> explained to Vladimir K.) to exclude the new code for platforms which
> have not implemented it. Though I'm still not sure that we shouldn't
> have something to detect it being turned on at runtime on platforms that
> don't support it (it will likely crash quickly but still ...). Keep
> wondering if there is a better way to handle this aspect of the change ...
>
> I can't comment on the actual interpreter and compiler changes - sorry.
>
> This will need re-basing now that JDK-8219974 has been backed out.
>
> Thanks,
> David
>
> On 2/05/2019 9:17 am, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>
>> (It's a follow-up RFR to an earlier RFC [1].)
>>
>> Recent changes severely affected how static initializers are executed
>> and for long-running initializers it manifested as a severe slowdown.
>> As an example, it led to a 3x slowdown on some Clojure applications
>> (JDK-8219233 [2]). The root cause is that until a class is fully
>> initialized, every invocation of a static method on it goes through
>> method resolution.
>>
>> The proposed fix introduces fast class initialization barriers for C1,
>> C2, and the template interpreter on x86-64. I experimented with
>> cross-platform approaches, but didn't get satisfactory results.
>>
>> On other platforms, behavior stays (mostly) intact. (I had to revert
>> some changes introduced by JDK-8219492 [3], since the assumptions they
>> rely on about accesses inside a class don't hold in all cases.)
>>
>> The barrier is as simple as:
>>    if (holder->is_not_initialized() &&
>>        !holder->is_reentrant_initialization(current_thread)) {
>>      // trigger call site re-resolution and block there
>>    }
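
At the Java level, the behaviour this barrier has to preserve is the
initialization protocol of JLS 12.4.2: the thread running the static
initializer may call static methods on that class reentrantly, while any
other thread blocks until initialization finishes. A minimal sketch of that
behaviour, with class names invented for the illustration and a sleep
standing in for a long-running initializer (it says nothing about how the
patch itself is implemented):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Kept in a separate class so that touching these fields never initializes Holder.
class Signals {
    static final CountDownLatch initStarted = new CountDownLatch(1);
    static final AtomicBoolean initDone = new AtomicBoolean(false);
}

class Holder {
    static int helperCalls;

    static {
        Signals.initStarted.countDown();
        // Reentrant case: the initializing thread calls a static method on
        // the class it is initializing and must not block.
        for (int i = 0; i < 100; i++) {
            helper();
        }
        try {
            Thread.sleep(200); // stand-in for a long-running initializer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Signals.initDone.set(true);
    }

    static void helper() {
        helperCalls++;
    }

    static boolean sawInitDone() {
        return Signals.initDone.get();
    }
}

class ClinitBlockingDemo {
    // Returns what a second thread observes when it calls into Holder while
    // another thread may still be running Holder's static initializer.
    static boolean run() {
        Thread initializer = new Thread(() -> Holder.helper()); // triggers <clinit>
        initializer.start();
        try {
            Signals.initStarted.await();             // <clinit> has started
            boolean observed = Holder.sawInitDone(); // waits for <clinit> to finish
            initializer.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Called once in a fresh JVM, ClinitBlockingDemo.run() returns true: the
second thread's static call cannot complete before the initializer has
finished, whereas the 100 reentrant helper() calls made from inside the
initializer proceed without blocking.
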
>>
>> There are 3 places where barriers are added:
>>   * in the template interpreter, for the invokestatic bytecode;
>>   * at the nmethod verified entry point (for normal compilations);
>>   * in c2i adapters.
>>
>> For the template interpreter, there's an additional check added to
>> TemplateTable::resolve_cache_and_index, which calls into
>> InterpreterRuntime::resolve_from_cache when the fast-path checks fail.
>>
>> In the case of nmethods, the barrier is put before frame construction, so
>> existing compiler runtime routines can be reused
>> (SharedRuntime::get_handle_wrong_method_stub()).
>>
>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
>> nmethod recompilation once the class is fully initialized.
>>
>> OSR compilations don't need a barrier.
>>
>> Correspondence between barriers and transitions they cover:
>>    (1) from interpreter (barrier on caller side)
>>        * all transitions: interpreter, compiled (i2c), native, aot, ...
>>
>>    (2) from compiled (barrier on callee side)
>>        to compiled, to native (barrier in native wrapper on entry)
>>
>>    (3) c2i bypasses both barriers (interpreter and compiled) and
>>        requires a dedicated barrier in c2i
>>
>>    (4) to Graal/AOT code:
>>        from interpreter: covered by interpreter barrier
>>        from compiled: call site patching is disabled, leading to
>>        repeated call site resolution until method holder is fully
>>        initialized (original behavior).
>>
>> Performance experiments with Clojure [2] demonstrated that the fix
>> almost completely eliminates the regression:
>>
>>    (1) always reresolve (w/o the fix):      ~12,0s ( 1x)
>>    (2) C1/C2 barriers only:                  ~3,8s (~3x)
>>    (3) int/C1/C2 barriers:                   ~3,2s (-20%)
>>    --------
>>    (4) barriers disabled for invokestatic    ~3,2s
>>
>> I deliberately tried to keep the patch backport-friendly for
>> 8u/11u/12u and refrained from using newer features like nmethod
>> barriers introduced recently. The fix can be refactored later
>> specifically for 13 as a followup change.
>>
>> Testing: Clojure startup, tier1-5
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
>>
>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492