RFR(S): 8153267: nmethod's exception cache not multi-thread safe
Doerr, Martin
martin.doerr at sap.com
Wed Apr 6 13:24:54 UTC 2016
Hi Jamsheed and all,
thanks for your explanation.
About Case 1:
I basically agree with that reading _next==NULL or _count==0 only leads to false negatives and is not critical.
Yes, we could live with a few more false negatives on weak memory model platforms (even though this is not my preferred design).
About Case 2:
What I'm missing on the reader's side of the _count field is something which prevents processors from speculatively loading the contents of the ExceptionCache.
In ExceptionCache::test_address, the _count only affects the control flow.
PPC and ARM processors can predict branches which depend on the _count field and load speculatively from the pc and handler fields (which may be stale data!).
Due to out-of-order execution of the loads, it can actually happen, that the new _count value is observed, but stale data is read from pc and handler fields.
I guess it is highly unlikely that we will ever observe this, but there's no guarantee.
I think my concern about using non-volatile fields for the _exception_cache is also still valid.
Nothing prevents C++ Compilers from loading the pointer twice from memory. They may expect to get the pointer to the same instance both times but actually get two different ones.
For example, this may lead to the situation that handler_for_exception_and_pc uses one ExceptionCache instance for calling the match function and another one (du to reload of non-volatile field) for calling next().
May other people would like to comment on this lengthy discussion as well?
Best regards,
Martin
From: Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
Sent: Mittwoch, 6. April 2016 13:54
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(S): 8153267: nmethod's exception cache not multi-thread safe
Hi Martin,
On 4/6/2016 2:49 PM, Doerr, Martin wrote:
Hi Jamsheed,
here are the cases of add_handler_for_exception_and_pc we should talk about:
Case 1: A new ExceptionCache instance needs to get added.
The storestore barrier you have added is used in the constructor of the ExceptionCache and it releases the most critical fields of it. I think this is what you explained in [1] in your email below.
The new values of _count and _next fields are written afterwards and hence not covered by this release barrier. Readers of the _exception_cache may read _count==0 or _next==NULL.
One could argue that this is not critical, but I guess this was not intended?
At least the _exception_cache field needs to be volatile to prevent optimizers from breaking anything. This is always needed for fields which are accessed concurrently by multiple threads without locks (as the readers do).
I think releasing the completely initialized ExceptionCache instance is a much cleaner design.
Having count < actual entries, or having _next = null is OK (as there is always (locked)slow path to check again).
Quoting comment from read path.
// We never grab a lock to read the exception cache, so we may
// have false negatives. This is okay, as it can only happen during
// the first few exception lookups for a given nmethod.
Weak memory platforms may have a few more false negatives. but isn't that OK ?
This helps us, as we can remove volatile from picture, and actually good for read paths.
Case 2: An existing ExceptionCache instance gets a new entry.
In this case your storestore barrier is good to release all updated fields. However, we need to consider the readers, too. The _count field needs to be volatile and the load must acquire. Otherwise, stale data may get read by processors which perform loads on speculative paths.
storestore mem barrier handles this, as count <= no of real entries. and there is always locked slow path to check again.
As said before, there may be a few more false negatives in weak memory platforms than strong ones.
Best Regards,
Jamsheed
I have added the acquire barrier for the _count field here:
http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.01/
Does this answer your questions or is anything still unclear?
Best regards,
Martin
From: Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
Sent: Mittwoch, 6. April 2016 10:11
To: Doerr, Martin <martin.doerr at sap.com><mailto:martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(S): 8153267: nmethod's exception cache not multi-thread safe
Thanks for the reply. trying to understand stuffs.
void nmethod::add_handler_for_exception_and_pc(Handle exception, address pc, address handler) {
// There are potential race conditions during exception cache updates, so we
// must own the ExceptionCache_lock before doing ANY modifications. Because
// we don't lock during reads, it is possible to have several threads attempt
// to update the cache with the same data. We need to check for already inserted
// copies of the current data before adding it.
MutexLocker ml(ExceptionCache_lock);
ExceptionCache* target_entry = exception_cache_entry_for_exception(exception);
if (target_entry == NULL || !target_entry->add_address_and_handler(pc,handler)) {
target_entry = new ExceptionCache(exception,pc,handler);
add_exception_cache_entry(target_entry);
}
}
[1]there is a storestore mem barrier before count is updated in add_address_and_handler
this ensure exception pc and handler address are updated before count is incremented and Exception cache entry is updated at ( nm->_exception_cache or in the list ec->_next ).
address nmethod::handler_for_exception_and_pc(Handle exception, address pc) {
// We never grab a lock to read the exception cache, so we may
// have false negatives. This is okay, as it can only happen during
// the first few exception lookups for a given nmethod.
ExceptionCache* ec = exception_cache();
while (ec != NULL) {
address ret_val;
if ((ret_val = ec->match(exception,pc)) != NULL) {
return ret_val;
}
ec = ec->next();
}
return NULL;
}
and in read logic. we first check ec entry is available (non null check) before proceeding further.
if ec is non null and ec_type,excpetion pc, and handler are available by[1]. though count can be reordered and not updated with new value.
this fixes the issue. why you think it doesn't?
Best Regards,
Jamsheed
On 4/5/2016 3:40 PM, Doerr, Martin wrote:
Hi Jamsheed,
thanks for pointing me to it. Interesting that you have found such a problem so shortly before me :)
My webrev addresses some aspects which are not covered by your fix:
- add_handler_for_exception_and_pc adds a new ExceptionCache instance in the other case. They need to get released as well.
- The readers of the _exception_cache field are not safe, yet. As Andrew Haley pointed out, optimizers may modify load accesses for non-volatile fields.
So I think my change is still needed.
And after taking a closer look at your change, I think the _count field which is addressed by your fix needs to be volatile as well. I can incorporate that in my change if you like.
Would you agree?
Best regards,
Martin
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Jamsheed C m
Sent: Montag, 4. April 2016 08:14
To: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(S): 8153267: nmethod's exception cache not multi-thread safe
Hi Martin,
"nmethod's exception cache not multi-thread safe" bug is fixed in b107
bug id: https://bugs.openjdk.java.net/browse/JDK-8143897
fix changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f918c20107d9
discussion link: http://openjdk.5641.n7.nabble.com/RFR-XS-8143897-Weblogic12medrec-assert-handler-address-SharedRuntime-compute-compiled-exc-handler-nme-td255611.html
Best Regards,
Jamsheed
On 4/1/2016 6:07 PM, Doerr, Martin wrote:
Hello everyone,
we have found a concurrency problem with the nmethod's exception cache. Readers of the cache may read stale data on weak memory platforms.
The writers of the cache are synchronized by locks, but there may be concurrent readers: The compiler runtimes use nmethod::handler_for_exception_and_pc to access the cache without locking.
Therefore, the nmethod's field _exception_cache needs to be volatile and adding new entries must be done by releasing stores. (Loading seems to be fine without acquire because there's an address dependency from the load of the cache to the usage of its contents which is sufficient to ensure ordering on all openjdk platforms.)
I also added a minor cleanup: I changed nmethod::is_alive to read the volatile field _state only once. It is certainly undesired to force the compiler to load it from memory twice.
Webrev is here:
http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.00/
Please review. I will also need a sponsor.
Best regards,
Martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160406/ef0e2353/attachment-0001.html>
More information about the hotspot-compiler-dev
mailing list