[13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue May 21 17:33:18 UTC 2019
Thanks for the feedback, David!
Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8223213/webrev.01/
Some responses inline:
> Forgot to mention that your new test doesn't look like it will play
> nicely when run with Graal enabled, so you may need to split up into
> different @test sections and add "@requires !vm.graal.enabled" to
> exclude graal.
What kind of problem when running with Graal do you have in mind?
I double-checked that the test passes with Graal enabled.
>> I'll be very happy to see this go in - though I do wish we had more
>> platform coverage than just x86_64. Hopefully the other archs will
>> jump on-board with this as well.
Yes, fully agree with you. It should be pretty straightforward for
maintainers to mirror x86-specific changes for their architectures.
>> I was initially confused by the UseFastClassInitChecks flag as I
>> couldn't really see why you would want to turn it off (other than
>> perhaps during testing) but I see that it is really used (as you
>> explained to Vladimir K.) to exclude the new code for platforms which
>> have not implemented it. Though I'm still not sure that we shouldn't
>> have something to detect it being turned on at runtime on platforms
>> that don't support it (it will likely crash quickly but still ...).
>> Keep wondering if there is a better way to handle this aspect of the
>> change ...
I deliberately made the flag develop, so it's possible to change it from
command-line only in debug builds. I could introduce additional
platform-specific validation, but it doesn't look worth the effort for
such narrow case (and there are other develop flags which guard broken
functionality).
>> I can't comment on the actual interpreter and compiler changes - sorry.
No problem, I'll wait for more reviews from Runtime team.
>> This will need re-basing now that JDK-8219974 has been backed out.
Done.
Best regards,
Vladimir Ivanov
>> On 2/05/2019 9:17 am, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>
>>> (It's a followup RFR on a earlier RFC [1].)
>>>
>>> Recent changes severely affected how static initializers are executed
>>> and for long-running initializers it manifested as a severe slowdown.
>>> As an example, it led to a 3x slowdown on some Clojure applications
>>> (JDK-8219233 [2]). The root cause is that until a class is fully
>>> initialized, every invocation of static method on it goes through
>>> method resolution.
>>>
>>> Proposed fix introduces fast class initialization barriers for C1,
>>> C2, and template interpreter on x86-64. I did some experiments with
>>> cross-platform approaches, but haven't got satisfactory results.
>>>
>>> On other platforms, behavior stays (mostly) intact. (I had to revert
>>> some changes introduced by JDK-8219492 [3], since the assumptions
>>> they rely on about accesses inside a class don't hold in all cases.)
>>>
>>> The barrier is as simple as:
>>> if (holder->is_not_initialized() &&
>>> !holder->is_reentrant_initialization(current_thread)) {
>>> // trigger call site re-resolution and block there
>>> }
>>>
>>> There are 3 places where barriers are added:
>>> * in template interpreter for invokestatic bytecode;
>>> * at nmethod verified entry point (for normal compilations);
>>> * c2i adapters;
>>>
>>> For template interperter, there's additional check added into
>>> TemplateTable::resolve_cache_and_index which calls into
>>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>
>>> In case of nmethods, the barrier is put before frame construction, so
>>> existing compiler runtime routines can be reused
>>> (SharedRuntime::get_handle_wrong_method_stub()).
>>>
>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
>>> nmethod recompilation once the class is fully initialized.
>>>
>>> OSR compilations don't need a barrier.
>>>
>>> Correspondence between barriers and transitions they cover:
>>> (1) from interpreter (barrier on caller side)
>>> * all transitions: interpreter, compiled (i2c), native, aot, ...
>>>
>>> (2) from compiled (barrier on callee side)
>>> to compiled, to native (barrier in native wrapper on entry)
>>>
>>> (3) c2i bypasses both barriers (interpreter and compiled) and
>>> requires a dedicated barrier in c2i
>>>
>>> (4) to Graal/AOT code:
>>> from interpreter: covered by interpreter barrier
>>> from compiled: call site patching is disabled, leading to
>>> repeated call site resolution until method holder is fully
>>> initialized (original behavior).
>>>
>>> Performance experiments with clojure [2] demonstrated that the fix
>>> almost completely recuperates the regression:
>>>
>>> (1) always reresolve (w/o the fix): ~12,0s ( 1x)
>>> (2) C1/C2 barriers only: ~3,8s (~3x)
>>> (3) int/C1/C2 barriers: ~3,2s (-20%)
>>> --------
>>> (4) barriers disabled for invokestatic ~3,2s
>>>
>>> I deliberately tried to keep the patch backport-friendly for
>>> 8u/11u/12u and refrained from using newer features like nmethod
>>> barriers introduced recently. The fix can be refactored later
>>> specifically for 13 as a followup change.
>>>
>>> Testing: clojure startup, tier1-5
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1]
>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492
More information about the hotspot-runtime-dev
mailing list