RFR: Re-do streamlining of read-barriers in Access API, and fix call paths that might lead to read-barriers via oop_iterate()

Fri Jul 6 12:04:24 UTC 2018

On 06.07.2018 at 12:39, Aleksey Shipilev wrote:
> On 07/05/2018 08:43 PM, Roman Kennke wrote:
>> The previous patch to use SBS::resolve_forwarded() directly from Access
>> API impl failed because some fricking code paths get to read-barriers
>> via oop_iterate() and make full-GC fail because fwd-ptr is temporarily
>> pointing to nirvana.
>>
>> This fixes all those code paths to avoid read-barriers. Some use
>> explicit *_raw() accessor variants now, and metadata is now accessed via
>> raw call wholesale.
>>
>> The non-Shenandoah stuff will have to be upstreamed soon. Want to bake
>> it a little more in Shenandoah though, who knows, maybe we find more
>> such off-the-rails code paths?
>>
>> http://cr.openjdk.java.net/~rkennke/fix-rbs/webrev.00/
> 
> OK for sh/jdk.


Thanks. Testing turned up another code path that requires raw access
(and needs upstreaming). Doing barrier'd access here is obviously bogus:

diff --git a/src/hotspot/share/oops/instanceRefKlass.inline.hpp b/src/hotspot/share/oops/instanceRefKlass.inline.hpp
--- a/src/hotspot/share/oops/instanceRefKlass.inline.hpp
+++ b/src/hotspot/share/oops/instanceRefKlass.inline.hpp
@@ -184,9 +184,9 @@

   log_develop_trace(gc, ref)("InstanceRefKlass %s for obj " PTR_FORMAT, s, p2i(obj));
   log_develop_trace(gc, ref)("     referent_addr/* " PTR_FORMAT " / " PTR_FORMAT,
-      p2i(referent_addr), p2i((oop)HeapAccess<ON_UNKNOWN_OOP_REF | AS_NO_KEEPALIVE>::oop_load_at(obj, java_lang_ref_Reference::referent_offset)));
+      p2i(referent_addr), p2i((oop)RawAccess<>::oop_load(referent_addr)));
   log_develop_trace(gc, ref)("     discovered_addr/* " PTR_FORMAT " / " PTR_FORMAT,
-      p2i(discovered_addr), p2i((oop)HeapAccess<AS_NO_KEEPALIVE>::oop_load(discovered_addr)));
+      p2i(discovered_addr), p2i((oop)RawAccess<>::oop_load(discovered_addr)));
 }
 #endif


Ok to include that in the patch?
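
For anyone following along, here is a rough toy model of why the raw
load is the safe choice in such paths. This is only a sketch and not
HotSpot code: ToyObj, barriered_load_referent() and raw_load() are
made-up names, and the single fwd field merely stands in for the
Shenandoah forwarding pointer. The point is that a barrier'd load first
dereferences the forwarding pointer of the holder, so it breaks whenever
that pointer is temporarily invalid, while a raw load only touches the
field address it is given.

```cpp
#include <cassert>

// Toy model, NOT HotSpot code: every object carries a forwarding pointer
// that normally points to the object itself or to its copy.
struct ToyObj {
    ToyObj* fwd;       // hypothetical forwarding pointer
    ToyObj* referent;  // a reference field, in the spirit of Reference.referent
    int     payload;
};

// Barrier'd load (hypothetical): resolve the holder through its forwarding
// pointer first, then read the field. Only valid while fwd is a valid pointer.
inline ToyObj* barriered_load_referent(ToyObj* holder) {
    ToyObj* resolved = holder->fwd;  // the "read barrier" step
    return resolved->referent;
}

// Raw load (hypothetical): read the field at the given address and never
// look at any forwarding pointer.
inline ToyObj* raw_load(ToyObj* const* field_addr) {
    return *field_addr;
}

// Stand-in for a GC phase that temporarily leaves fwd unusable.
inline void invalidate_fwd(ToyObj* obj) {
    obj->fwd = nullptr;
}
```

With a valid fwd both loads return the same referent. After
invalidate_fwd(), only raw_load(&holder.referent) is still safe to call,
which is the situation the tracing code above can run into.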


>> Testing: tier3_gc_shenandoah
> 
> This seems to improve Serial:
>   before: 19535.551 ± 225.129  ops/s
>    after: 20120.271 ± 103.193  ops/s

Very nice. So we finally found a benchmark that is performance-sensitive
to runtime barriers? I always used to argue that runtime barriers would
likely not show up anywhere...

Cheers, Roman