RFR(S): 8153267: nmethod's exception cache not multi-thread safe
Jamsheed C m
jamsheed.c.m at oracle.com
Thu Apr 7 08:41:34 UTC 2016
Hi Martin,
one comment:
the count increment update should use release store + atomic update.
ref:
https://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming
Best Regards
Jamsheed
On 4/6/2016 6:54 PM, Doerr, Martin wrote:
>
> Hi Jamsheed and all,
>
> thanks for your explanation.
>
> About Case 1:
>
> I basically agree with that reading _next==NULL or _count==0 only
> leads to false negatives and is not critical.
>
> Yes, we could live with a few more false negatives on weak memory
> model platforms (even though this is not my preferred design).
>
> About Case 2:
>
> What I’m missing on the reader’s side of the _count field is something
> which prevents processors from speculatively loading the contents of
> the ExceptionCache.
>
> In ExceptionCache::test_address, the _count only affects the control flow.
>
> PPC and ARM processors can predict branches which depend on the _count
> field and load speculatively from the pc and handler fields (which may
> be stale data!).
>
> Due to out-of-order execution of the loads, it can actually happen,
> that the new _count value is observed, but stale data is read from pc
> and handler fields.
>
> I guess it is highly unlikely that we will ever observe this, but
> there’s no guarantee.
>
> I think my concern about using non-volatile fields for the
> _exception_cache is also still valid.
>
> Nothing prevents C++ Compilers from loading the pointer twice from
> memory. They may expect to get the pointer to the same instance both
> times but actually get two different ones.
>
> For example, this may lead to the situation that
> handler_for_exception_and_pc uses one ExceptionCache instance for
> calling the match function and another one (du to reload of
> non-volatile field) for calling next().
>
> May other people would like to comment on this lengthy discussion as well?
>
> Best regards,
>
> Martin
>
> *From:*Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
> *Sent:* Mittwoch, 6. April 2016 13:54
> *To:* Doerr, Martin <martin.doerr at sap.com>;
> hotspot-compiler-dev at openjdk.java.net
> *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not
> multi-thread safe
>
> Hi Martin,
>
> On 4/6/2016 2:49 PM, Doerr, Martin wrote:
>
> Hi Jamsheed,
>
> here are the cases of add_handler_for_exception_and_pc we should
> talk about:
>
> Case 1: A new ExceptionCache instance needs to get added.
>
> The storestore barrier you have added is used in the constructor
> of the ExceptionCache and it releases the most critical fields of
> it. I think this is what you explained in [1] in your email below.
>
> The new values of _count and _next fields are written afterwards
> and hence not covered by this release barrier. Readers of the
> _exception_cache may read _count==0 or _next==NULL.
>
> One could argue that this is not critical, but I guess this was
> not intended?
>
> At least the _exception_cache field needs to be volatile to
> prevent optimizers from breaking anything. This is always needed
> for fields which are accessed concurrently by multiple threads
> without locks (as the readers do).
>
> I think releasing the completely initialized ExceptionCache
> instance is a much cleaner design.
>
> Having count < actual entries, or having _next = null is OK (as there
> is always (locked)slow path to check again).
> Quoting comment from read path.
>
> // We never grab a lock to read the exception cache, so we may
> // have false negatives. This is okay, as it can only happen during
> // the first few exception lookups for a given nmethod.
>
> Weak memory platforms may have a few more false negatives. but isn't
> that OK ?
> This helps us, as we can remove volatile from picture, and actually
> good for read paths.
>
>
> Case 2: An existing ExceptionCache instance gets a new entry.
>
> In this case your storestore barrier is good to release all
> updated fields. However, we need to consider the readers, too. The
> _count field needs to be volatile and the load must acquire.
> Otherwise, stale data may get read by processors which perform
> loads on speculative paths.
>
> storestore mem barrier handles this, as count <= no of real entries.
> and there is always locked slow path to check again.
> As said before, there may be a few more false negatives in weak memory
> platforms than strong ones.
>
> Best Regards,
> Jamsheed
>
>
> I have added the acquire barrier for the _count field here:
>
> http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.01/
> <http://cr.openjdk.java.net/%7Emdoerr/8153267_exception_cache/webrev.01/>
>
> Does this answer your questions or is anything still unclear?
>
> Best regards,
>
> Martin
>
> *From:*Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
> *Sent:* Mittwoch, 6. April 2016 10:11
> *To:* Doerr, Martin <martin.doerr at sap.com>
> <mailto:martin.doerr at sap.com>;
> hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not
> multi-thread safe
>
> Thanks for the reply. trying to understand stuffs.
>
>
>
> void nmethod::add_handler_for_exception_and_pc(Handle
> exception, address pc, address handler) {
> // There are potential race conditions during exception
> cache updates, so we
> // must own the ExceptionCache_lock before doing ANY
> modifications. Because
> // we don't lock during reads, it is possible to have
> several threads attempt
> // to update the cache with the same data. We need to check
> for already inserted
> // copies of the current data before adding it.
>
> MutexLocker ml(ExceptionCache_lock);
> ExceptionCache* target_entry =
> exception_cache_entry_for_exception(exception);
>
> if (target_entry == NULL ||
> !target_entry->add_address_and_handler(pc,handler)) {
> target_entry = new ExceptionCache(exception,pc,handler);
> add_exception_cache_entry(target_entry);
> }
> }
>
>
> [1]there is a storestore mem barrier before count is updated in
> add_address_and_handler
> this ensure exception pc and handler address are updated before
> count is incremented and Exception cache entry is updated at (
> nm->_exception_cache or in the list ec->_next ).
>
>
>
> address nmethod::handler_for_exception_and_pc(Handle
> exception, address pc) {
> // We never grab a lock to read the exception cache, so we may
> // have false negatives. This is okay, as it can only happen
> during
> // the first few exception lookups for a given nmethod.
> ExceptionCache* ec = exception_cache();
> while (ec != NULL) {
> address ret_val;
> if ((ret_val = ec->match(exception,pc)) != NULL) {
> return ret_val;
> }
> ec = ec->next();
> }
> return NULL;
> }
>
>
> and in read logic. we first check ec entry is available (non null
> check) before proceeding further.
> if ec is non null and ec_type,excpetion pc, and handler are
> available by[1]. though count can be reordered and not updated
> with new value.
>
> this fixes the issue. why you think it doesn't?
>
> Best Regards,
> Jamsheed
>
>
> On 4/5/2016 3:40 PM, Doerr, Martin wrote:
>
> Hi Jamsheed,
>
> thanks for pointing me to it. Interesting that you have found
> such a problem so shortly before me J
>
> My webrev addresses some aspects which are not covered by your
> fix:
>
> -add_handler_for_exception_and_pc adds a new ExceptionCache
> instance in the other case. They need to get released as well.
>
> -The readers of the _exception_cache field are not safe, yet.
> As Andrew Haley pointed out, optimizers may modify load
> accesses for non-volatile fields.
>
> So I think my change is still needed.
>
> And after taking a closer look at your change, I think the
> _count field which is addressed by your fix needs to be
> volatile as well. I can incorporate that in my change if you like.
>
> Would you agree?
>
> Best regards,
>
> Martin
>
> *From:*hotspot-compiler-dev
> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] *On
> Behalf Of *Jamsheed C m
> *Sent:* Montag, 4. April 2016 08:14
> *To:* hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not
> multi-thread safe
>
> Hi Martin,
>
> "nmethod's exception cache not multi-thread safe" bug is
> fixed in b107
> bug id: https://bugs.openjdk.java.net/browse/JDK-8143897
> fix changeset:
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f918c20107d9
> discussion link:
> http://openjdk.5641.n7.nabble.com/RFR-XS-8143897-Weblogic12medrec-assert-handler-address-SharedRuntime-compute-compiled-exc-handler-nme-td255611.html
>
> Best Regards,
> Jamsheed
>
> On 4/1/2016 6:07 PM, Doerr, Martin wrote:
>
> Hello everyone,
>
> we have found a concurrency problem with the nmethod’s
> exception cache. Readers of the cache may read stale data
> on weak memory platforms.
>
> The writers of the cache are synchronized by locks, but
> there may be concurrent readers: The compiler runtimes use
> nmethod::handler_for_exception_and_pc to access the cache
> without locking.
>
> Therefore, the nmethod's field _exception_cache needs to
> be volatile and adding new entries must be done by
> releasing stores. (Loading seems to be fine without
> acquire because there's an address dependency from the
> load of the cache to the usage of its contents which is
> sufficient to ensure ordering on all openjdk platforms.)
>
> I also added a minor cleanup: I changed nmethod::is_alive
> to read the volatile field _state only once. It is
> certainly undesired to force the compiler to load it from
> memory twice.
>
> Webrev is here:
>
> http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.00/
> <http://cr.openjdk.java.net/%7Emdoerr/8153267_exception_cache/webrev.00/>
>
> Please review. I will also need a sponsor.
>
> Best regards,
>
> Martin
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160407/846ec5f5/attachment-0001.html>
More information about the hotspot-compiler-dev
mailing list