Assertion failure on PPC64 after 8200545: Improve filter for enqueued deferred cards
Thomas Schatzl
thomas.schatzl at oracle.com
Fri May 24 14:54:08 UTC 2019
Hi,
On Fri, 2019-05-24 at 13:48 +0000, Doerr, Martin wrote:
> Hi Thomas,
>
> > The only way I could imagine such an error would be if the compiler
> > did something weird with writing the fields of the heap attribute
> > table? I.e. it loads a machine word (containing four of those
> > RegionAttr entries), modifies just one of the bytes, and writes
> > back the whole word. I.e. then some concurrent reader might see
> > inconsistent values that flip back and forth.
> > I really doubt this is the case though. Particularly I assume that
> > at least on ppc64/linux you also use gcc.
>
> Correct. It has been built by gcc 7.3.1.
> AIX version has been built by clang IBM XL C/C++ for AIX, Version
> 16.1.0.2.
>
> I wonder why only SPARC needs a workaround.
> If Solaris Studio compilers perform such kind of optimizations, other
> compilers may do that too.
The SPARC workaround has a different background: SPARC does not like
(small) structs.
See
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025790.html
which refers to
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2014-December/011557.html
Some fiddling on godbolt with a heavily shortened variant of that code
did not show strange code generation with the available compilers
(gcc 7.3.1 was not among them). I do not know whether this
transformation would be allowed (if it is an optimization at all), but
I could imagine so...
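To make the suspected failure mode concrete, here is a minimal sketch that replays the interleaving deterministically on a single thread. It is not HotSpot code: the packed 64-bit word, the 16-bit entry size and all names (Word, Entry, get_entry, with_entry, replay_lost_update) are assumptions made up for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: four 16-bit attribute entries packed into one 64-bit word.
using Word = std::uint64_t;
using Entry = std::uint16_t;

// Read entry `idx` (0..3) out of a packed word.
inline Entry get_entry(Word w, unsigned idx) {
  assert(idx < 4);
  return static_cast<Entry>(w >> (idx * 16));
}

// Return a copy of `w` with entry `idx` (0..3) replaced by `value`.
inline Word with_entry(Word w, unsigned idx, Entry value) {
  assert(idx < 4);
  const Word mask = Word{0xFFFF} << (idx * 16);
  return (w & ~mask) | (Word{value} << (idx * 16));
}

// Replays the suspected interleaving step by step:
//   1. writer A loads the whole word,
//   2. writer B updates entry 1 in memory,
//   3. writer A stores its stale copy back with only entry 0 changed.
// Returns the final memory contents.
inline Word replay_lost_update() {
  Word memory = 0;
  Word stale = memory;                     // step 1
  memory = with_entry(memory, 1, 0xBEEF);  // step 2
  memory = with_entry(stale, 0, 0x1234);   // step 3
  return memory;
}
```

After replay_lost_update(), entry 0 holds writer A's value, but entry 1 is back at its old value: writer B's update is lost. A concurrent reader of entry 1 could therefore see the value flip forth and back, which is the kind of inconsistency described above.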
I can prepare a change for you to test that accesses the fields
"atomically" (e.g. via our Atomic::load/store methods, or by making
them word-sized), but I cannot really verify that it fixes the problem
because, as far as I can tell, we have never reproduced this case in
our CI. Would that help you?
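As a rough idea of what such a change could look like, here is a sketch that uses std::atomic as a stand-in for our Atomic::load/store methods. The Attr struct, its field names (kind, flag), the AttrTable class and the table size are all hypothetical and not taken from the actual code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical two-byte attribute; field names are illustrative only.
struct Attr {
  std::uint8_t kind;
  std::uint8_t flag;
};

// Table whose entries are each read and written as one indivisible 16-bit unit.
class AttrTable {
  static constexpr std::size_t kSize = 8;  // arbitrary size for the sketch
  std::atomic<std::uint16_t> _entries[kSize];

  static std::uint16_t pack(Attr a) {
    return static_cast<std::uint16_t>(a.kind | (a.flag << 8));
  }
  static Attr unpack(std::uint16_t v) {
    return Attr{static_cast<std::uint8_t>(v & 0xFF),
                static_cast<std::uint8_t>(v >> 8)};
  }

public:
  AttrTable() {
    for (auto& e : _entries) e.store(0, std::memory_order_relaxed);
  }

  // Each access touches exactly one entry; the compiler may not widen it
  // into a read-modify-write of the neighbouring entries.
  void set(std::size_t idx, Attr a) {
    assert(idx < kSize);
    _entries[idx].store(pack(a), std::memory_order_relaxed);
  }
  Attr get(std::size_t idx) const {
    assert(idx < kSize);
    return unpack(_entries[idx].load(std::memory_order_relaxed));
  }
};
```

Note that relaxed accesses only make each entry indivisible. They impose no ordering relative to other memory operations, so if the real cause is a memory ordering problem, this alone would not be expected to fix it.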
Next week I intend to dedicate some time to going through the code
again, this time with memory ordering problems in mind.
We could back out this change, but that would at least make
JDK-8213108, which is currently out for review, a bit more complicated.
And if it is a memory visibility problem, I fear that, as explained in
an earlier email, a backout would just make the assert go away without
fixing the underlying issue...
Thanks,
Thomas