Code Review request for new MallocMaxBytes option and fixed test (7030610, 7037122, 7123945)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Mar 12 19:28:51 PDT 2013
On 3/12/13 7:40 PM, Coleen Phillmore wrote:
>
> On 3/12/2013 9:20 PM, Daniel D. Daugherty wrote:
>> I think Ron's off the air for the evening, but I'm doing gatekeeping
>> duty so I'll chime in to keep the flow going...
>
> I'm waiting for a checkin job... so I can merge and checkin again...
Yes, I saw you migrate your job from JPRT-hotspotwest to JPRT-east
and I see that it failed on JPRT-east... Are you planning to try
a rerun?
>>
>>
>> On 3/12/13 6:59 PM, Coleen Phillmore wrote:
>>>
>>> Hi, I had a look at this and I thought Atomic operations didn't work
>>> for jlong on all platforms?
>>
>> I've heard that too and then I looked at Atomic::add(jlong) and realized
>> that it stays in a loop until it works. In any case, Ron modeled his
>> logic after the JVM stat counter stuff so it is just as reliable as
>> those counters. And no, he couldn't use the JVM stat counter stuff
>> because it is not available in product bits and I insisted that his
>> fix had to work for product bits. In any case, the counter is used
>> for a one-way trip to memory exhaustion so absolute accuracy is not
>> required (like the JVM stat counters).
>
> I just grepped the source and Atomic::load() has Unimplemented() in
> some platforms (not open). It is not supported everywhere.
Are these 32-bit or 64-bit platforms? If these are 32-bit platforms,
then there isn't an issue because size_t is 32-bits so the atomic
jlong operations aren't used.
> I don't think this should be a product flag if it's inaccurate. As
> a testing aid it's fine, but shouldn't be introduced in product.
So you're saying it is preferable for the memory exhaustion tests to not
be run in product because they kill the machines. Sorry, I prefer the
tests to run in all the build flavors.
>>
>>> I'm also concerned with performance of yet more atomic operations in
>>> os::malloc() calls. Can you run refworkload and performance tests
>>> to verify that these do not slow down?
>>
>> The new atomic counter only comes into play when the MallocMaxBytes
>> option is non-zero so I don't see why you think this will slow things
>> down. We're talking one test and branch.
>>
>>> I'd be a lot happier if this was a develop flag
>>
>> Develop flags are not available in product bits and the point of adding
>> the MallocMaxBytes option is so that we can run these memory exhaustion
>> tests reasonably even with product bits. The longest run for the revised
>> test in JPRT was just over 30 seconds on a Solaris SPARC machine. One of
>> the open test bugs talks about the test swamping the machine for tens of
>> minutes and sometimes being unkillable. Which is why JTREG had problems
>> cleaning up after the test...
>
> Ok. I can see why this is product then. It would be good to bound
> these OOM tests somehow. This is an interesting solution.
Ron thought you would like it. Was it you or Karen that asked for
the temporary malloc ceiling from Ron's last fix to become a
permanent addition? I'll have to check my archive...
>>
>>> and the code 589-600 was a function check_malloc_bytes() and 604-612
>>> was add_malloc_bytes().
>>
>> Lastly, you want to change the in-line diagnostic code into a function
>> call and you're worried about slowing things down?
>
> Clean code is worth a call instruction.
You can't argue both sides of the street. :-)
You were complaining about the new atomic ops above that were
guarded by a flag check so later asking for a call instruction
is way more expensive.
>>
>>
>>> It's hard to see the ::malloc() there now!
>> It was hard to the ::malloc() in there before and yes, this
>> fix makes it worse. However, how often is anyone in this code
>> besides Zhengyu and now Ron?
>
> So the excuse for messy code is that not many people see it?
Messy code? Really, do you really want to go down that rat hole
in this particular file? I'm not at all sure why you're going
after this change so hard, but it is definitely interesting to
observe.
I'll let Ron jump in on this thread tomorrow. This is his code
so he gets to make the call. He might tell me I'm all wet and
agree with everything you say. Who knows...
Dan
>
> Thanks,
> Coleen
>>
>> Dan
>>
>>
>>>
>>> Coleen
>>>
>>> On 3/12/2013 8:22 PM, Ron Durbin wrote:
>>>> David
>>>>
>>>> Dan has provided a link. So you can access the webrev and review
>>>> the changes.
>>>>
>>>> Ron
>>>>
>>>> -----Original Message-----
>>>> From: Daniel D. Daugherty
>>>> Sent: Tuesday, March 12, 2013 8:43 AM
>>>> To: hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: Code Review request for new MallocMaxBytes option and
>>>> fixed test (7030610, 7037122, 7123945)
>>>>
>>>> David,
>>>>
>>>> Thanks for spotting the URL hiccup. I missed it during my review.
>>>>
>>>> There is something wrong with Ron's access to cr.openjdk.java.net
>>>> so I'm going to host his webrev for this review:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/for_rdurbin/7030610-webrev/0-hsx25/
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 3/11/13 8:54 PM, David Holmes wrote:
>>>>> Ron,
>>>>>
>>>>> On 12/03/2013 10:01 AM, Ron Durbin wrote:
>>>>>> All
>>>>>>
>>>>>> I have a code review readyfor:
>>>>>>
>>>>>> WEBREV http://javaweb.us.oracle.com/~rdurbin/7030610-webrev/0-hsx25/
>>>>> Please make available an open webrev on cr.openjdk.java.net
>>>>>
>>>>> Thank you.
>>>>> David
>>>>>
>>>>>> 7030610 runtime/6878713/Test6878713.sh fails, Error. failed to clean
>>>>>> up files after test.
>>>>>>
>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7030610
>>>>>>
>>>>>> https://jbs.oracle.com/bugs/browse/JDK-7030610
>>>>>>
>>>>>> 7037122 runtime/6878713/Test6878713.sh is written incorrectly
>>>>>>
>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7037122
>>>>>>
>>>>>> https://jbs.oracle.com/bugs/browse/JDK-7037122
>>>>>>
>>>>>> 7123945 runtime/6878713/Test6878713.sh require about 2G of
>>>>>>
>>>>>> _http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7123945_
>>>>>>
>>>>>> https://jbs.oracle.com/bugs/browse/JDK-7123945
>>>>>>
>>>>>> Summary: Add new diagnostic option -XX:MallocMaxBytes=NNN and fix
>>>>>>
>>>>>> Test6878713.sh. Test6878713.sh is a security regression test with
>>>>>> four different failure modes.Three aredocumented in the defects
>>>>>> above
>>>>>> and the forth
>>>>>>
>>>>>> was found duringbug triage. The root cause of these failures can be
>>>>>>
>>>>>> traced back to two issues:Inconsistent test conditionsetup and error
>>>>>> handling.
>>>>>>
>>>>>> All four defect manifestations are fixed by this set of changes.
>>>>>>
>>>>>> New code built and test executed via JPRT test job; Linux ARM and
>>>>>> Linux PPC version built, but not tested because ARM and PPC are
>>>>>> currently build-only platforms in JPRT.
>>>>>>
>>>>>> As always, comments and questions are welcome.
>>>>>>
>>>>>> Ron
>>>>>>
>>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list