RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

David Holmes david.holmes at oracle.com
Tue Oct 4 12:16:33 UTC 2016


On 4/10/2016 8:22 PM, Hiroshi H Horii wrote:
> Dear David,
>
> Thank you for your comments. You are correct. In the previous webrev, a
> caller (in copy_and_push_safe_barrier) may use new_obj's fields
> unsafely. Very sorry.
>
> I changed the log format in copy_and_push_safe_barrier not to use fields
> of new_obj. Could you review this again?
> http://cr.openjdk.java.net/~horii/8154736/webrev.02/

src/share/vm/gc/parallel/psPromotionManager.inline.hpp

274       new_obj = NULL;
285     new_obj = NULL;

Sorry but you are losing me here. You've gone from simply removing 
barriers on the cmpxchg to changing the functionality of the methods 
that use the cmpxchg - instead of return the forwardee() you are now 
returning NULL! ??

David
-----

> The callers of PSPromotionManager::copy_to_survivor_space are here.
>   PSPromotionManager::copy_and_push_safe_barrier
>   PSScavengeFromKlassClosure::do_oop
>
> I confirmed any fields of new_obj is not used in the two methods in this
> webrev.
>
> In addition, I reduced passing a constant literal "forwarding" in
> copy_and_push_safe_barrier and added some guards before logging in
> PSPromotionManager::copy_to_survivor_space as follows.
>
>   if (log_develop_is_enabled(Trace, gc, scavenge)) {
>    log_develop_trace(gc, scavenge)(...);
>  }
>
> If copy_to_survivor_space should not return new_obj if its fields are
> unsafe, I would like to change the return type of copy_to_survivor_space
> to "void" (or allow copy_to_survivor_space to return NULL).
>
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
>
> David Holmes <david.holmes at oracle.com> wrote on 10/04/2016 16:32:35:
>
>> From: David Holmes <david.holmes at oracle.com>
>> To: Hiroshi H Horii/Japan/IBM at IBMJP, Carsten Varming <varming at gmail.com>
>> Cc: hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>,
>> "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>,
>> "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-
>> dev at openjdk.java.net>, Michihiro Horie/Japan/IBM at IBMJP, "ppc-aix-
>> port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>,
>> Thomas Schatzl <thomas.schatzl at oracle.com>, Tim Ellison
>> <Tim_Ellison at uk.ibm.com>
>> Date: 10/04/2016 16:33
>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> copy_to_survivor for ppc64
>>
>> On 4/10/2016 12:15 AM, Hiroshi H Horii wrote:
>> > Dear Carsten,
>> >
>> > Thank you for your correction. And very sorry about my easy mistakes...
>> > I created webrev again.
> http://cr.openjdk.java.net/~horii/8154736/webrev.01/
>> > I believe, all of the unsafe usages of new_obj, which has been pointed
>> > in this thread, is fixed with this webrev.
>>
>> I still am uneasy about this. If it is not safe to access the fields of
>> new_obj in the tracing statements but we return new_obj to the caller,
>> then it may not be safe for the caller to access the fields of new_obj!
>>
>> That aside:
>>
>> src/share/vm/gc/parallel/psPromotionManager.inline.hpp
>>
>>   293   if (o->is_forwarded()) {
>>   294     new_obj = o->forwardee();
>>   295     // fields in new_obj may not be synchronized.
>>   296     if (log_develop_is_enabled(Trace, gc, scavenge) &&
>> o->is_forwarded()) {
>>
>> Why the second check of o->is_forwarded() ?
>>
>> 297       log_develop_trace(gc, scavenge)("{%s %s " PTR_FORMAT " -> "
>> PTR_FORMAT "}",
>>   298                         "forwarding",
>>
>> Why are you passing "forwarding" as an argument for the first %s instead
>> of just expressing it directly? I see this is a copy'n'paste from the
>> existing code - and I'm guessing at one point there was a conditional
>> around that. I think it should be fixed.
>>
>> Thanks,
>> David
>>
>> > Dear all,
>> >
>> > Can I ask a review of this webrev and give thoughts and comments again?
>> >
>> > Regards,
>> > Hiroshi
>> > -----------------------
>> > Hiroshi Horii, Ph.D.
>> > IBM Research - Tokyo
>> >
>> >
>> > Carsten Varming <varming at gmail.com> wrote on 10/03/2016 12:55:25:
>> >
>> >> From: Carsten Varming <varming at gmail.com>
>> >> To: Hiroshi H Horii/Japan/IBM at IBMJP
>> >> Cc: Thomas Schatzl <thomas.schatzl at oracle.com>, David Holmes
>> >> <david.holmes at oracle.com>, hotspot-compiler-dev <hotspot-compiler-
>> >> dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-
>> >> gc-dev at openjdk.java.net>, "hotspot-runtime-dev at openjdk.java.net"
>> >> <hotspot-runtime-dev at openjdk.java.net>, Michihiro Horie/Japan/
>> >> IBM at IBMJP, "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-
>> >> dev at openjdk.java.net>, Tim Ellison <Tim_Ellison at uk.ibm.com>
>> >> Date: 10/03/2016 12:56
>> >> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >> copy_to_survivor for ppc64
>> >>
>> >> Dear Hiroshi,
>> >>
>> >> It looks like  psPromotionManager.cpp:509 contains a logging
>> >> statement that could read data from an oop forwarded by another thread.
>> >>
>> >> I don't see how your new logging
>> >> in PSPromotionManager::copy_and_push_safe_barrier can be safe. In
>> >> the two new statements you read data from new_obj, but in both cases
>> >> it is possible that another thread still haven't written the data in
>> >> new_obj (new_obj->klass() reads new_obj->_metadata).
>> >>
>> >> Carsten
>> >>
>> >> On Sun, Oct 2, 2016 at 10:46 AM, Hiroshi H Horii <HORII at jp.ibm.com>
> wrote:
>> >> Hi, Thomas, and David,
>> >>
>> >> Thank you for your comments.
>> >>
>> >> > I think Hiroshi thinks that since the work stealing itself does a CAS
>> >> > with barrier after obtaining "new_obj" in the other thread, it should
>> >> > be safe (for other threads consuming an object on the task queue).
>> >>
>> >> Thank you. What Thomas thankfully explain is that I wanted to
>> >> mention why relaxed CAS is available for copy_to_survivor.
>> >>
>> >> > I also do not think it is safe as is - for example, at least
>> >> > PSPromotionManager::copy_and_push_safe_barrier() reads data from the
>> >> > returned new_obj (in another log message :)) regardless of failure.
>> >> >
>> >> > That method also reads the forwardee if forwarded, and then again
> uses
>> >> > object information in that same log message. A quick look did not
> show
>> >> > other issues, but don't count this as a review.
>> >>
>> >> Thank you for your comments.
>> >>
>> >> As Carsten suggested, I guess, size may not be necessary for logging
>> >> when CAS is failed (the size will be logged by the other thread that
>> >> successfully operates the CAS). By reducing printing a size of
>> >> new_obj, relaxing CAS for forwarding pointers becomes safe, I believe.
>> >>
>> >> In my understanding, PSPromotionManager::copy_and_push_safe_barrier
>> >> () updates a card table for new_obj. However, this new_obj will not
>> >> be used fro card tables in the same GC as a root of GC because all
>> >> of entries in card tables were registered as tasks before any calls
>> >> of copy_and_push_safe_barrier.
>> >>
>> >> I created a new webrev that reduces print formats when CAS is
>> >> failed. Could you review this and give comments on it?
>> >> http://cr.openjdk.java.net/~horii/8154736/webrev.00/
>> >>
>> >> Regards,
>> >> Hiroshi
>> >> -----------------------
>> >> Hiroshi Horii, Ph.D.
>> >> IBM Research - Tokyo
>> >>
>> >>
>> >> Thomas Schatzl <thomas.schatzl at oracle.com> wrote on 09/30/2016
> 21:02:31:
>> >>
>> >> > From: Thomas Schatzl <thomas.schatzl at oracle.com>
>> >> > To: David Holmes <david.holmes at oracle.com>, Hiroshi H
>> > Horii/Japan/IBM at IBMJP
>> >> > Cc: hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>,
>> >> > Tim Ellison <Tim_Ellison at uk.ibm.com>, Michihiro Horie/Japan/
>> >> > IBM at IBMJP, "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-
>> >> > dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-
>> >> > gc-dev at openjdk.java.net>, "hotspot-runtime-dev at openjdk.java.net"
>> >> > <hotspot-runtime-dev at openjdk.java.net>
>> >> > Date: 09/30/2016 21:04
>> >> > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >> > copy_to_survivor for ppc64
>> >> >
>> >> > Hi,
>> >> >
>> >> > On Fri, 2016-09-30 at 21:12 +1000, David Holmes wrote:
>> >> > > On 30/09/2016 8:17 PM, Hiroshi H Horii wrote:
>> >> > > >
>> >> > > > Dear David, and Dan,
>> >> > > >
>> >> > > > Thank you for your comments.
>> >> > > >
>> >> > > > >
>> >> > > > > In
>> >> > > > > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:
>> >> > > > > 266 the log line reads data from the forwardee even when
> the CAS
>> >> > > > > fails. I believe those reads will be unsafe without barriers
>> >> > > > > after
>> >> > > > > the copy of the content of the object.
>> >> > > > >
> hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:28
>> >> > > > > 8
>> >> > > > > same problem as in line 266
>> >> > > > Can we use o->size() or new_obj_size instead of new_obj->size()?
>> >> >
>> >> > They are not equivalent. Parallel GC and other collectors creatively
>> >> > reuse the "length" field of objArrays to indicate progress in the
>> >> > scanning them during GC.
>> >> >
>> >> > new_obj_size is the result of a call to o->size() (and the
> compiler may
>> >> > redo computations at any point), so has the same issue.
>> >> >
>> >> > > > > If you feel that the use of new_obj->size() is potentially
> unsafe
>> >> > > > > then
>> >> > > > > the fact we return new_obj means that any use of new_obj by the
>> >> > > > > caller
>> >> > > > > may also potentially be unsafe.
>> >> > > > In my understanding, while copying objects to a survivor
> space, if
>> >> > > > a thread creates a new_obj and sets a pointer with CAS, the other
>> >> > > > threads can touch the new_obj after the thread calls
>> >> > > > push_contents(new_obj) (Line: 239). In push_contents,
>> >> > > > OrderAccess::release_store is called before pushing the
> object as a
>> >> > > > task into a deque of workstealing (taskqueue.inline.hpp). If the
>> >> > > > other thread reads the task, all of copy for new_obj is safe.
>> >> > > I'm not familiar with the larger picture of the GC protocols here,
>> >> > > but just looking at this code fragment in isolation if the CAS
> fails
>> >> > > we read o->forwardee() to set new_obj. That in itself is fine
> because
>> >> > > we're reading the field that we were testing with the CAS. But we
>> >> > > could then deference new_obj before the thread that won the CAS
> calls
>> >> > > push_contents; and even if it is after push_contents we have
> not done
>> >> > > an acquire to pair with the release-store in push_contents.
>> >> >
>> >> > I think Hiroshi thinks that since the work stealing itself does a CAS
>> >> > with barrier after obtaining "new_obj" in the other thread, it should
>> >> > be safe (for other threads consuming an object on the task queue).
>> >> >
>> >> > > So I'm really not seeing how we can use a barrier-less CAS here.
>> >> >
>> >> > I also do not think it is safe as is - for example, at least
>> >> > PSPromotionManager::copy_and_push_safe_barrier() reads data from the
>> >> > returned new_obj (in another log message :)) regardless of failure.
>> >> >
>> >> > That method also reads the forwardee if forwarded, and then again
> uses
>> >> > object information in that same log message. A quick look did not
> show
>> >> > other issues, but don't count this as a review.
>> >> >
>> >> > Thanks,
>> >> >   Thomas
>> >> >
>>
>



More information about the hotspot-gc-dev mailing list