RFR: 8225716: G1 GC: Undefined behaviour in G1BlockOffsetTablePart::block_at_or_preceding
Thomas Schatzl
thomas.schatzl at oracle.com
Fri Jun 14 12:06:22 UTC 2019
Hi,
On Thu, 2019-06-13 at 16:21 +0100, Andrew Haley wrote:
> The Block Offset Table is accessed racily in many places. This is UB,
> but I suspect the authors of G1 assumed that the UB was "benign". In
Not diluting your point (it is a bug/UB), I think the code simply
assumes volatile access to the offset table given your description
below.
> fact it is not, and there is a narrow race condition which causes G1
> to crash at times of heavy stress.
>
> The bug has only ever been observed on 86-32, but it isn't a
> target-specific bug. It's probably not specific to G1 either.
The other generational collector use similar code that apparently also
assume volatile access in
BlockOffsetArrayNonContigSpace::block_start_unsafe. While this is not
issues for Serial obviously, it may be for Parallel/CMS and needs to be
looked at too.
>
> The problem occurs when the BOT is updated by one thread while it's
> being read by another. The BOT isn't volatile, so a C++ compiler is
> entitled to transform this code:
>
[...]
> But if someone is modifying _offset_array then this code is no longer
> correct. We first read the offset, see that we have to slide back N
> cards, and then we read the (changed) offset and slide back the wrong
> (and huge) nuber of cards, and a crash is the inevitable result. It's
> a matter of luck that this hasn't happened before now.
Great analysis.
>
> Making the BOT volatile would probably fix this, but IMO it makes
> more sense to make the access explicit:
>
In addition to Kim's comments I would suggest to do both to make the
intent clear looking at the variable too.
> u_char G1BlockOffsetTable::offset_array(size_t index) const {
> check_index(index, "index out of range");
> return Atomic::load(&_offset_array[index]);
> }
>
> void set_offset_array_raw(size_t index, u_char offset) {
> Atomic::store(offset, &_offset_array[index]);
> }
>
> http://cr.openjdk.java.net/~aph/8225716/
>
Btw, please push the final fix into the jdk/jdk13 repos (I assume we
will manage reviewing this in time ;)). In any case this is serious
enough to get it into the product asap.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list