Request for review (M): 7016112 CMS: crash during promotion testing
Bengt Rutisson
bengt.rutisson at oracle.com
Thu Jun 23 05:49:22 UTC 2011
Coleen,
Thanks for the review!
On 2011-06-23 03:46, Coleen Phillimore wrote:
> http://cr.openjdk.java.net/~brutisso/7016112/webrev.01/src/share/vm/utilities/quicksort.cpp.html <http://cr.openjdk.java.net/%7Ebrutisso/7016112/webrev.01/src/share/vm/utilities/quicksort.cpp.html>
>
> 46 bool aIsOdd = a % 2 == 1;
> 47 bool bIsOdd = b % 2 == 1;
> Can you add () around the bit operations? Since the order precedence
> of these operators is always surprising (at least to me).
Yes, I'll do that.
> I like the way you set the pattern for the internal testing
> framework. Thanks for getting this started.
Thanks, it was really useful to have the unit tests when I made this fix.
> When we move these methodOops out of permgen, we might want to go back
> to stdlib::quicksort. It would be less code for us to maintain.
> Otherwise, this looks pretty straightforward and I like this fix.
I agree that we should get back to stdlib::quicksort when it is safe to
use it again.
Thanks,
Bengt
>
> Thanks,
> Coleen
>
>
> On 6/21/2011 5:32 PM, Bengt Rutisson wrote:
>>
>> Hi again,
>>
>> For completeness. Here is the graph for sorting maximum length arrays
>> on Linux x64 (run on my laptop). These runs show that my
>> implementation takes twice as long as the stdlib version. I am not
>> happy about that, but I don't know how much effort it is worth to
>> optimize for this case.
>>
>> Bengt
>>
>>
>> On 2011-06-21 14:50, Bengt Rutisson wrote:
>>>
>>> Hi Runtime and GC,
>>>
>>> Sending this review request to both groups. I fixed a GC bug, but
>>> the changes are in runtime code.
>>>
>>> The bug that I fixed is this one:
>>> 7016112 CMS: crash during promotion testing
>>> http://monaco.sfbay.sun.com/detail.jsf?cr=7016112
>>>
>>> And here is the webrev:
>>> http://cr.openjdk.java.net/~brutisso/7016112/webrev.01/
>>>
>>> The investigation to find the root of the crashes reported in
>>> 7016112 was quite lengthy. It is not that easy to read the CR and
>>> figure out what is going on, so here is some background:
>>>
>>> When we load classes we store references to the methods of a class
>>> in an object array. When we have loaded all methods we sort the
>>> object array to allow binary search within the array. To do this
>>> sort we use stdlib::qsort(), which is the standard library quicksort
>>> implementation.
>>>
>>> If we are using CMS we might be doing concurrent marking while we
>>> are sorting the object array. The object array can be found by the
>>> concurrent marking code and it may start iterating over the array
>>> while we are sorting it. The problem is that on Windows the
>>> stdlib::qsort() is not implemented to do atomic updates of elements
>>> in the array that it sorts. Instead it does a byte-by-byte swap when
>>> it needs to swap two values. That is an easy way to implement
>>> different element widths, but it means that at some point in time
>>> one element may contain a few bytes from the element above or below.
>>> If this happens at the same time as the marking code is trying to
>>> read that element, we will be marking some random address and not
>>> the method that was supposed to be marked.
>>>
>>> On Solaris and Linux the stdlib::qsort() implementations try to swap
>>> as wide data types as possible so this issue should not occur there.
>>> On the other hand we have no guarantees that this will always be how
>>> stdlib::qsort() is implemented.
>>>
>>> After some discussions about different ways of solving this we came
>>> to the conclusion that the simplest way is to implement our own
>>> quicksort that operates on the correct pointer width (oop or
>>> narrowOop).
>>>
>>> So, this is what I have done to fix this bug.
>>>
>>> Also, it is likely that this problem will go away when the perm gen
>>> removal project is finished. Right now it looks like we will not be
>>> tracing and marking methods at all after that change.
>>>
>>> * Questions *
>>>
>>> - Should we keep the bubble sort that is done before calling
>>> quicksort in methodOopDesc::sort_methods() ?
>>>
>>> - Should we keep the idempotent option or should we try to always
>>> use idempotent sorting (see performance test below)?
>>>
>>> - What is the best way to handle unit tests? I added a flag called
>>> ExecuteInternalVMTests to run unit tests. This is in line with the
>>> existing ErrorHandlerTest flag. My thought is that we can use this
>>> same flag for other unit tests than just the quicksort tests. Would
>>> be good if we could get these tests executed by JPRT as well. I
>>> simply run these with "java -XX:+ExecuteInternalVMTests -version".
>>>
>>>
>>> * Testing *
>>>
>>> Did the obvious testing: Ran JPRT with the changes in the webrev and
>>> ran the failing nsk test from the bug
>>> (nsk/sysdict/vm/stress/jck12a/sysdictj12a008) repeatedly for sevaral
>>> days without failing.
>>>
>>> I created some unit tests for the quicksort implementation and they
>>> all pass.
>>>
>>> I also made a build that sorts both with my own quicksort and with
>>> stdlib::qsort and then compares that the arrays have the same sort
>>> order. Ran this through JPRT and it seems to work on all platforms.
>>> That run also included the unit tests. If anybody wants to see how
>>> this testing was done, there is a separate webrev for that build.
>>> The interesting code is in methodOop.cpp:
>>>
>>> http://cr.openjdk.java.net/~brutisso/7016112/webrev-verify-sorting/
>>>
>>> * Performance *
>>>
>>> I am a bit unsure how to get any relevant performance data for this
>>> change. What I have done so far is to create a class that has 65535
>>> methods (which is the maximum - the length of the method array is
>>> u2) and I have measured how long it takes to sort this method array.
>>> The methods have random names but every hundredth method has 4
>>> overloaded version. This makes sure that there are some duplicates
>>> in the array.
>>>
>>> For now I have run this on my Windows x64 work station with 4 cpus
>>> and on a Solaris Sparc machine with 2 cpus (uname says: SunOS
>>> sthsparc24 5.10 Generic_139555-08 sun4us sparc FJSV,GPUZC-M Solaris).
>>>
>>> I am attaching graphs for the results. The Y-axis has time in nano
>>> seconds. Judging by this my quicksort is a bit faster on Windows and
>>> a bit slower on Solaris. But the interesting thing is that the
>>> idempotent version is faster than the default behavior on Windows
>>> and on par with the default on Solaris. I assume that this is due to
>>> the fact that some stores can be avoided if we don't do swap of
>>> duplicates. This means that (at least on Windows) swap is more
>>> expensive than compare. If this is true we should probably remove
>>> the special treatment of idempotent.
>>>
>>> I could run this on more machines, but I am not sure how relevant
>>> this type of data is. Most method arrays will be much shorter and
>>> compared to reading the class from a file the sort will be in the
>>> noise.
>>>
>>> Long email...I hope I covered most of the issues here. Let me know
>>> if you have any questions.
>>>
>>> Thanks,
>>> Bengt
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20110623/3d18fd0c/attachment.htm>
More information about the hotspot-gc-dev
mailing list