RFR(S): 8153267: nmethod's exception cache not multi-thread safe

Jamsheed C m jamsheed.c.m at oracle.com
Thu Apr 7 08:41:34 UTC 2016


Hi Martin,

one comment:

the count increment update should use release store + atomic update.
ref: 
https://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming

Best Regards
Jamsheed

On 4/6/2016 6:54 PM, Doerr, Martin wrote:
>
> Hi Jamsheed and all,
>
> thanks for your explanation.
>
> About Case 1:
>
> I basically agree with that reading _next==NULL or _count==0 only 
> leads to false negatives and is not critical.
>
> Yes, we could live with a few more false negatives on weak memory 
> model platforms (even though this is not my preferred design).
>
> About Case 2:
>
> What I’m missing on the reader’s side of the _count field is something 
> which prevents processors from speculatively loading the contents of 
> the ExceptionCache.
>
> In ExceptionCache::test_address, the _count only affects the control flow.
>
> PPC and ARM processors can predict branches which depend on the _count 
> field and load speculatively from the pc and handler fields (which may 
> be stale data!).
>
> Due to out-of-order execution of the loads, it can actually happen, 
> that the new _count value is observed, but stale data is read from pc 
> and handler fields.
>
> I guess it is highly unlikely that we will ever observe this, but 
> there’s no guarantee.
>
> I think my concern about using non-volatile fields for the 
> _exception_cache is also still valid.
>
> Nothing prevents C++ Compilers from loading the pointer twice from 
> memory. They may expect to get the pointer to the same instance both 
> times but actually get two different ones.
>
> For example, this may lead to the situation that 
> handler_for_exception_and_pc uses one ExceptionCache instance for 
> calling the match function and another one (du to reload of 
> non-volatile field) for calling next().
>
> May other people would like to comment on this lengthy discussion as well?
>
> Best regards,
>
> Martin
>
> *From:*Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
> *Sent:* Mittwoch, 6. April 2016 13:54
> *To:* Doerr, Martin <martin.doerr at sap.com>; 
> hotspot-compiler-dev at openjdk.java.net
> *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not 
> multi-thread safe
>
> Hi Martin,
>
> On 4/6/2016 2:49 PM, Doerr, Martin wrote:
>
>     Hi Jamsheed,
>
>     here are the cases of add_handler_for_exception_and_pc we should
>     talk about:
>
>     Case 1: A new ExceptionCache instance needs to get added.
>
>     The storestore barrier you have added is used in the constructor
>     of the ExceptionCache and it releases the most critical fields of
>     it. I think this is what you explained in [1] in your email below.
>
>     The new values of _count and _next fields are written afterwards
>     and hence not covered by this release barrier. Readers of the
>     _exception_cache may read _count==0 or _next==NULL.
>
>     One could argue that this is not critical, but I guess this was
>     not intended?
>
>     At least the _exception_cache field needs to be volatile to
>     prevent optimizers from breaking anything. This is always needed
>     for fields which are accessed concurrently by multiple threads
>     without locks (as the readers do).
>
>     I think releasing the completely initialized ExceptionCache
>     instance is a much cleaner design.
>
> Having count < actual entries, or having _next = null is OK (as there 
> is always (locked)slow path to check again).
> Quoting comment from read path.
>
>       // We never grab a lock to read the exception cache, so we may
>       // have false negatives. This is okay, as it can only happen during
>       // the first few exception lookups for a given nmethod.
>
> Weak memory platforms may have a few more false negatives. but isn't 
> that OK ?
> This helps us, as we can remove volatile from picture, and actually 
> good for read paths.
>
>
>     Case 2: An existing ExceptionCache instance gets a new entry.
>
>     In this case your storestore barrier is good to release all
>     updated fields. However, we need to consider the readers, too. The
>     _count field needs to be volatile and the load must acquire.
>     Otherwise, stale data may get read by processors which perform
>     loads on speculative paths.
>
> storestore mem barrier handles this,  as count <= no of real entries. 
> and there is always locked slow path to check again.
> As said before, there may be a few more false negatives in weak memory 
> platforms than strong ones.
>
> Best Regards,
> Jamsheed
>
>
>     I have added the acquire barrier for the _count field here:
>
>     http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.01/
>     <http://cr.openjdk.java.net/%7Emdoerr/8153267_exception_cache/webrev.01/>
>
>     Does this answer your questions or is anything still unclear?
>
>     Best regards,
>
>     Martin
>
>     *From:*Jamsheed C m [mailto:jamsheed.c.m at oracle.com]
>     *Sent:* Mittwoch, 6. April 2016 10:11
>     *To:* Doerr, Martin <martin.doerr at sap.com>
>     <mailto:martin.doerr at sap.com>;
>     hotspot-compiler-dev at openjdk.java.net
>     <mailto:hotspot-compiler-dev at openjdk.java.net>
>     *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not
>     multi-thread safe
>
>     Thanks for the reply. trying to understand stuffs.
>
>
>
>         void nmethod::add_handler_for_exception_and_pc(Handle
>         exception, address pc, address handler) {
>           // There are potential race conditions during exception
>         cache updates, so we
>           // must own the ExceptionCache_lock before doing ANY
>         modifications. Because
>           // we don't lock during reads, it is possible to have
>         several threads attempt
>           // to update the cache with the same data. We need to check
>         for already inserted
>           // copies of the current data before adding it.
>
>           MutexLocker ml(ExceptionCache_lock);
>           ExceptionCache* target_entry =
>         exception_cache_entry_for_exception(exception);
>
>           if (target_entry == NULL ||
>         !target_entry->add_address_and_handler(pc,handler)) {
>             target_entry = new ExceptionCache(exception,pc,handler);
>             add_exception_cache_entry(target_entry);
>           }
>         }
>
>
>     [1]there is a storestore mem barrier before count is updated in 
>     add_address_and_handler
>     this ensure exception pc and handler address are updated before
>     count is incremented  and  Exception cache entry is updated at (
>     nm->_exception_cache or in the list ec->_next ).
>
>
>
>         address nmethod::handler_for_exception_and_pc(Handle
>         exception, address pc) {
>           // We never grab a lock to read the exception cache, so we may
>           // have false negatives. This is okay, as it can only happen
>         during
>           // the first few exception lookups for a given nmethod.
>           ExceptionCache* ec = exception_cache();
>           while (ec != NULL) {
>             address ret_val;
>             if ((ret_val = ec->match(exception,pc)) != NULL) {
>               return ret_val;
>             }
>             ec = ec->next();
>           }
>           return NULL;
>         }
>
>
>     and in read logic. we first check ec entry is available (non null
>     check) before proceeding further.
>     if ec is non null and ec_type,excpetion pc, and handler are
>     available by[1]. though count can be reordered and not updated
>     with new value.
>
>     this fixes the issue. why you think it doesn't?
>
>     Best Regards,
>     Jamsheed
>
>
>     On 4/5/2016 3:40 PM, Doerr, Martin wrote:
>
>         Hi Jamsheed,
>
>         thanks for pointing me to it. Interesting that you have found
>         such a problem so shortly before me J
>
>         My webrev addresses some aspects which are not covered by your
>         fix:
>
>         -add_handler_for_exception_and_pc adds a new ExceptionCache
>         instance in the other case. They need to get released as well.
>
>         -The readers of the _exception_cache field are not safe, yet.
>         As Andrew Haley pointed out, optimizers may modify load
>         accesses for non-volatile fields.
>
>         So I think my change is still needed.
>
>         And after taking a closer look at your change, I think the
>         _count field which is addressed by your fix needs to be
>         volatile as well. I can incorporate that in my change if you like.
>
>         Would you agree?
>
>         Best regards,
>
>         Martin
>
>         *From:*hotspot-compiler-dev
>         [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] *On
>         Behalf Of *Jamsheed C m
>         *Sent:* Montag, 4. April 2016 08:14
>         *To:* hotspot-compiler-dev at openjdk.java.net
>         <mailto:hotspot-compiler-dev at openjdk.java.net>
>         *Subject:* Re: RFR(S): 8153267: nmethod's exception cache not
>         multi-thread safe
>
>         Hi Martin,
>
>         "nmethod's exception cache not multi-thread safe"  bug is
>         fixed in b107
>         bug id: https://bugs.openjdk.java.net/browse/JDK-8143897
>         fix changeset:
>         http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/f918c20107d9
>         discussion link:
>         http://openjdk.5641.n7.nabble.com/RFR-XS-8143897-Weblogic12medrec-assert-handler-address-SharedRuntime-compute-compiled-exc-handler-nme-td255611.html
>
>         Best Regards,
>         Jamsheed
>
>         On 4/1/2016 6:07 PM, Doerr, Martin wrote:
>
>             Hello everyone,
>
>             we have found a concurrency problem with the nmethod’s
>             exception cache. Readers of the cache may read stale data
>             on weak memory platforms.
>
>             The writers of the cache are synchronized by locks, but
>             there may be concurrent readers: The compiler runtimes use
>             nmethod::handler_for_exception_and_pc to access the cache
>             without locking.
>
>             Therefore, the nmethod's field _exception_cache needs to
>             be volatile and adding new entries must be done by
>             releasing stores. (Loading seems to be fine without
>             acquire because there's an address dependency from the
>             load of the cache to the usage of its contents which is
>             sufficient to ensure ordering on all openjdk platforms.)
>
>             I also added a minor cleanup: I changed nmethod::is_alive
>             to read the volatile field _state only once. It is
>             certainly undesired to force the compiler to load it from
>             memory twice.
>
>             Webrev is here:
>
>             http://cr.openjdk.java.net/~mdoerr/8153267_exception_cache/webrev.00/
>             <http://cr.openjdk.java.net/%7Emdoerr/8153267_exception_cache/webrev.00/>
>
>             Please review. I will also need a sponsor.
>
>             Best regards,
>
>             Martin
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160407/846ec5f5/attachment-0001.html>


More information about the hotspot-compiler-dev mailing list