RFR: 8229169: False failure of GenericTaskQueue::pop_local on architectures with weak memory model

Kim Barrett kim.barrett at oracle.com
Wed Aug 7 02:17:21 UTC 2019


> On Aug 6, 2019, at 7:03 AM, Jie Fu <fujie at loongson.cn> wrote:
> 
> Hi Martin,
> 
> Thanks for your review and valuable comments.
> 
> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.01/
> 
> Please see comments inline.
> 
> On 2019/8/6 下午5:29, Doerr, Martin wrote
>> From your description, OrderAccess ::loadload() seems to be the appropriate barrier.
>> It should be used on all platforms because it contains a compiler barrier for TSO platforms which prevent compilers from reordering the load accesses.
> Done. Thanks.
>> 
>> We should also check that the writer of _age._top uses at least a release (or storestore) barrier in your scenario.
> The writer calls GenericTaskQueue<E, F, N>::pop_global to update _age._top with _age.cmpxchg(newAge, oldAge) [1] .
> I think it already contains the release (or storestore) barrier semantic.
> What do you think?
>> I wonder why I've never seen this issue on PPC64. The test "TestHumongousClassLoader" seems to work stable. But could be that we just never hit this corner case by chance.
> Here is my reproducer to debug this issue: http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
> And this is my patch used to catch the corner case: http://cr.openjdk.java.net/~jiefu/8229169/gc-debug.diff
> HTH.
> 
> Any comments?
> 
> Thanks a lot.
> Best regards,
> Jie
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l222

The additional loadload barrier looks good.  I can sponsor this change.

I'd like the comment to be explicit that the barrier is to prevent
reordering the two reads of _age.  Perhaps note that if size == 0 then
_age cannot change after the read used for the size check.

I think a possible alternative to adding the barrier would be to
ensure the two uses of _age really are using the same data, as they
appear to be (incorrectly) assuming.  That is, read _age once and use
the result of that read in both places, e.g. something like

  Age curAge = _age.get();  // read once for consistent value below
  idx_t tp = curAge.top();
  ...
  } else {
    return pop_local_slow(localBot, curAge);
  }

I think I prefer the additional loadload barrier though. Maybe wait a
couple of days to see if anyone else wants to chime in though; these
sorts of things can be hard to review. I'll ask if anyone else here
has time to take a look.

Martin, I assume you also saw JDK-8229020, also from Jie?  I'm
somewhat surprised neither of these has been previously reported.
Maybe these Loongson CPUs are more aggressively re-ordering reads than
are other platforms?




More information about the hotspot-gc-dev mailing list