RFR: 8276901: Implement UseHeavyMonitors consistently [v10]
Martin Doerr
mdoerr at openjdk.java.net
Wed Dec 1 16:44:28 UTC 2021
On Wed, 1 Dec 2021 12:36:02 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> The flag UseHeavyMonitors seems to imply that it makes Hotspot always use inflated monitors, rather than stack locks. However, it is only implemented in the interpreter that way. When it calls into runtime, it would still happily stack-lock. Even worse, C1 uses another flag UseFastLocking to achieve something similar (with the same caveat that runtime would stack-lock anyway). C2 doesn't have any such mechanism at all.
>> I would like to experiment with disabling stack-locking, and thus, having this flag work as expected would seem very useful.
>>
>> The change removes the C1 flag UseFastLocking, and replaces its uses with equivalent (i.e. inverted) UseHeavyMonitors instead. I think it makes sense to make UseHeavyMonitors develop (I wouldn't want anybody to use this in production, not currently without this change, and not with this change). I also added a flag VerifyHeavyMonitors to be able to verify that stack-locking is really disabled. We can't currently verify this unconditionally (e.g. in debug builds) because all non-x86_64 platforms would need work.
>>
>> Testing:
>> - [x] tier1
>> - [x] tier2
>> - [x] tier3
>> - [ ] tier4
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Use heavy monitors in runtime only on supported architectures
PPC64 could be implemented like this:
diff --git a/src/hotspot/cpu/ppc/ppc.ad b/src/hotspot/cpu/ppc/ppc.ad
index 958059e1ca2..dc96bd15836 100644
--- a/src/hotspot/cpu/ppc/ppc.ad
+++ b/src/hotspot/cpu/ppc/ppc.ad
@@ -12132,7 +12132,7 @@ instruct partialSubtypeCheck(iRegPdst result, iRegP_N2P subklass, iRegP_N2P supe
instruct cmpFastLock(flagsReg crx, iRegPdst oop, iRegPdst box, iRegPdst tmp1, iRegPdst tmp2) %{
match(Set crx (FastLock oop box));
effect(TEMP tmp1, TEMP tmp2);
- predicate(!Compile::current()->use_rtm());
+ predicate(!Compile::current()->use_rtm() && !UseHeavyMonitors);
format %{ "FASTLOCK $oop, $box, $tmp1, $tmp2" %}
ins_encode %{
@@ -12149,7 +12149,7 @@ instruct cmpFastLock(flagsReg crx, iRegPdst oop, iRegPdst box, iRegPdst tmp1, iR
instruct cmpFastLock_tm(flagsReg crx, iRegPdst oop, rarg2RegP box, iRegPdst tmp1, iRegPdst tmp2, iRegPdst tmp3) %{
match(Set crx (FastLock oop box));
effect(TEMP tmp1, TEMP tmp2, TEMP tmp3, USE_KILL box);
- predicate(Compile::current()->use_rtm());
+ predicate(Compile::current()->use_rtm() && !UseHeavyMonitors);
format %{ "FASTLOCK $oop, $box, $tmp1, $tmp2, $tmp3 (TM)" %}
ins_encode %{
@@ -12165,6 +12165,18 @@ instruct cmpFastLock_tm(flagsReg crx, iRegPdst oop, rarg2RegP box, iRegPdst tmp1
ins_pipe(pipe_class_compare);
%}
+instruct cmpFastLock_hm(flagsReg crx, iRegPdst oop, rarg2RegP box) %{
+ match(Set crx (FastLock oop box));
+ predicate(UseHeavyMonitors);
+
+ format %{ "FASTLOCK $oop, $box (HM)" %}
+ ins_encode %{
+ // Set NE to indicate 'failure' -> take slow-path.
+ __ crandc($crx$$CondRegister, Assembler::equal, $crx$$CondRegister, Assembler::equal);
+ %}
+ ins_pipe(pipe_class_compare);
+%}
+
instruct cmpFastUnlock(flagsReg crx, iRegPdst oop, iRegPdst box, iRegPdst tmp1, iRegPdst tmp2, iRegPdst tmp3) %{
match(Set crx (FastUnlock oop box));
effect(TEMP tmp1, TEMP tmp2, TEMP tmp3);
diff --git a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
index a834fa1af36..bac8ef164f8 100644
--- a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
+++ b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
@@ -2014,8 +2014,10 @@ nmethod *SharedRuntime::generate_native_wrapper(MacroAssembler *masm,
// Try fastpath for locking.
// fast_lock kills r_temp_1, r_temp_2, r_temp_3.
- __ compiler_fast_lock_object(r_flag, r_oop, r_box, r_temp_1, r_temp_2, r_temp_3);
- __ beq(r_flag, locked);
+ if (!UseHeavyMonitors) {
+ __ compiler_fast_lock_object(r_flag, r_oop, r_box, r_temp_1, r_temp_2, r_temp_3);
+ __ beq(r_flag, locked);
+ }
// None of the above fast optimizations worked so we have to get into the
// slow case of monitor enter. Inline a special case of call_VM that
diff --git a/src/hotspot/share/runtime/synchronizer.cpp b/src/hotspot/share/runtime/synchronizer.cpp
index 4c5ea4a6e40..4f9c7c21a9b 100644
--- a/src/hotspot/share/runtime/synchronizer.cpp
+++ b/src/hotspot/share/runtime/synchronizer.cpp
@@ -418,7 +418,7 @@ void ObjectSynchronizer::handle_sync_on_value_based_class(Handle obj, JavaThread
}
static bool useHeavyMonitors() {
-#if defined(X86) || defined(AARCH64)
+#if defined(X86) || defined(AARCH64) || defined(PPC64)
return UseHeavyMonitors;
#else
return false;
I don't like hacking the regular assembler implementations. A better approach would be to change C2 so that it doesn't generate FastLockNodes at all, but that may be a bit cumbersome.
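For readers unfamiliar with the condition-register trick in cmpFastLock_hm, here is a minimal stand-alone model of it. This is only a sketch, not HotSpot code: CrField, crandc_bit, force_slow_path and beq_taken are hypothetical names made up for illustration. It assumes the usual PPC semantics of crandc (target bit = first source bit AND NOT second source bit). With the same EQ bit used for every operand, the result is always 0, so the following beq is never taken and the code falls through to the slow path.

```cpp
#include <cassert>

// Hypothetical model of one 4-bit PPC condition register field.
struct CrField {
  bool lt;
  bool gt;
  bool eq;
  bool so;
};

// Models crandc on single CR bits: result = ba AND (NOT bb).
constexpr bool crandc_bit(bool ba, bool bb) {
  return ba && !bb;
}

// Models the cmpFastLock_hm encoding: EQ = EQ AND (NOT EQ), which is always 0.
// A cleared EQ bit reads as 'NE', i.e. "fast lock failed".
constexpr CrField force_slow_path(CrField cr) {
  return CrField{cr.lt, cr.gt, crandc_bit(cr.eq, cr.eq), cr.so};
}

// Models beq: the branch to the 'locked' label is taken only if EQ is set.
constexpr bool beq_taken(CrField cr) {
  return cr.eq;
}

// Whatever EQ held before, the fast-path branch is not taken afterwards.
static_assert(!beq_taken(force_slow_path(CrField{false, false, true, false})),
              "EQ set on entry must be cleared");
static_assert(!beq_taken(force_slow_path(CrField{false, false, false, false})),
              "EQ clear on entry must stay clear");
```

The other three bits of the field are left untouched in this model, which matches the intent of only forcing the 'failure' result.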
-------------
PR: https://git.openjdk.java.net/jdk/pull/6320