Intermittent JRuby json issue related to tiered or G1
Aleksey Shipilev
shade at redhat.com
Wed Feb 17 14:50:44 UTC 2021
On 2/16/21 9:05 PM, Aleksey Shipilev wrote:
> On 2/16/21 9:00 PM, Charles Oliver Nutter wrote:
>> I am a bit confused about your JDK9 reference. If it was fixed in 9
>> why does it reliably reproduce in 15? Perhaps I am misunderstanding
>> the lineage of the fix you are referring to.
>
> I am saying that there are no direct JIRA hits that could explain why this is happening. The only
> hit I got is for fix already in JDK 9, so it should not happen again.
>
> I am (slowly) bisecting between JDK 15 and JDK 16 to see which fix directly or accidentally fixed
> it. Then we would know what we are dealing with.
This thing is really hairy. Reverse bisecting shows that this change:
https://bugs.openjdk.java.net/browse/JDK-8257847
...makes the failure in fastdebug much less likely. This explains why I did not see the failures in
JDK 16 and JDK 17 yesterday. I have managed to reliably crash a recent JDK by promoting the assert
in question into a guarantee:
diff --git a/src/hotspot/share/opto/ifnode.cpp b/src/hotspot/share/opto/ifnode.cpp
index 29624765324..467d8f19276 100644
--- a/src/hotspot/share/opto/ifnode.cpp
+++ b/src/hotspot/share/opto/ifnode.cpp
@@ -948,7 +948,9 @@ bool IfNode::fold_compares_helper(ProjNode* proj, ProjNode* success, ProjNode* f
assert((dom_bool->_test.is_less() && proj->_con) ||
(dom_bool->_test.is_greater() && !proj->_con), "incorrect test");
// this test was canonicalized
- assert(this_bool->_test.is_less() && !fail->_con, "incorrect test");
+ guarantee(this_bool->_test.is_less() && !fail->_con, "incorrect test: dom_bool.test=%d proj._con=%d this_bool.test=%d fail._con=%d",
+ dom_bool->_test._test, proj->_con,
+ this_bool->_test._test, fail->_con);
cond = (hi_test == BoolTest::le || hi_test == BoolTest::gt) ? BoolTest::gt : BoolTest::ge;
...which then fails with:
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (ifnode.cpp:955), pid=2438111, tid=2438182
# guarantee(this_bool->_test.is_less() && !fail->_con) failed: incorrect test: dom_bool.test=3 proj._con=1 this_bool.test=7 fail._con=1
#
# JRE version: OpenJDK Runtime Environment (17.0) (build 17-internal+0-adhoc.shade.jdk)
# Java VM: OpenJDK 64-Bit Server VM (17-internal+0-adhoc.shade.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x7fc3ee] IfNode::fold_compares_helper(ProjNode*, ProjNode*, ProjNode*, PhaseIterGVN*) [clone .part.0]+0x19e
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/shade/temp/jruby/jruby-issue-6554/core.2438111)
#
# An error report file with more information is saved as:
# /home/shade/temp/jruby/jruby-issue-6554/hs_err_pid2438111.log
#
# Compiler replay data is saved as:
# /home/shade/temp/jruby/jruby-issue-6554/replay_pid2438111.log
"this_bool.test=7" means the test is "GE". The downstream code does not expect this. It expects the
test to be canonicalized. This minimal patch bails out when it finds such a bad test:
diff --git a/src/hotspot/share/opto/ifnode.cpp b/src/hotspot/share/opto/ifnode.cpp
@@ -971,6 +973,9 @@ bool IfNode::fold_compares_helper(ProjNode* proj, ProjNode* success, ProjNode* f
lo = igvn->transform(new AddINode(lo, igvn->intcon(1)));
cond = BoolTest::ge;
}
+ } else {
+ // Safety: something is broken, break away.
+ return false;
}
} else {
const TypeInt* failtype = filtered_int_type(igvn, n, proj);
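
For reference, the numeric test values printed by the guarantee can be decoded with a small standalone sketch. The enum values below are my recollection of BoolTest::mask in opto/subnode.hpp, so treat them as an assumption and check them against your tree; the only value the message above confirms is that 7 is "GE". The helper names (test_name, guarantee_holds, replay_logged_values) are hypothetical and do not exist in HotSpot.

```cpp
#include <cassert>
#include <string>

// ASSUMPTION: mirrors the BoolTest::mask encoding in HotSpot's
// opto/subnode.hpp as I remember it; verify against the real sources.
enum Mask { eq = 0, gt = 1, lt = 3, ne = 4, le = 5, ge = 7 };

// Stand-ins for BoolTest::is_less() / BoolTest::is_greater().
inline bool is_less(int t)    { return t == lt || t == le; }
inline bool is_greater(int t) { return t == gt || t == ge; }

// Hypothetical helper: printable name for a raw test value from the log.
inline std::string test_name(int t) {
  switch (t) {
    case eq: return "EQ";
    case gt: return "GT";
    case lt: return "LT";
    case ne: return "NE";
    case le: return "LE";
    case ge: return "GE";
    default: return "?";
  }
}

// Re-evaluates the condition of the promoted guarantee:
//   this_bool->_test.is_less() && !fail->_con
inline bool guarantee_holds(int this_bool_test, int fail_con) {
  return is_less(this_bool_test) && !fail_con;
}

// Replays the values from the crash message:
//   dom_bool.test=3 proj._con=1 this_bool.test=7 fail._con=1
inline void replay_logged_values() {
  assert(is_less(3));              // dom_bool test is a "less" test
  assert(is_greater(7));           // this_bool test is a "greater" test
  assert(!guarantee_holds(7, 1));  // so the guarantee condition is false
}
```

Under that assumed encoding, dom_bool.test=3 reads as "LT", and this_bool.test=7 fails the is_less() half of the condition, which matches the crash.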
I think I'll submit two issues: one that makes fold_compares_helper more defensive, as in the
patch above (this would be backportable), and a follow-up that addresses the actual
problem (why we have an uncanonicalized test).
--
Thanks,
-Aleksey