Code Review request for new MallocMaxBytes option and fixed test (7030610, 7037122, 7123945)

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Mar 12 18:20:48 PDT 2013


I think Ron's off the air for the evening, but I'm doing gatekeeping
duty so I'll chime in to keep the flow going...


On 3/12/13 6:59 PM, Coleen Phillmore wrote:
>
> Hi, I had a look at this and I thought Atomic operations didn't work 
> for jlong on all platforms?

I've heard that too and then I looked at Atomic::add(jlong) and realized
that it stays in a loop until it works. In any case, Ron modeled his
logic after the JVM stat counter stuff so it is just as reliable as
those counters. And no, he couldn't use the JVM stat counter stuff
because it is not available in product bits and I insisted that his
fix had to work for product bits. In any case, the counter is used
for a one-way trip to memory exhaustion so absolute accuracy is not
required (like the JVM stat counters).

> I'm also concerned with performance of yet more atomic operations in 
> os::malloc() calls.   Can you run refworkload and performance tests to 
> verify that these do not slow down?

The new atomic counter only comes into play when the MallocMaxBytes
option is non-zero so I don't see why you think this will slow things
down. We're talking one test and branch.

>   I'd be a lot happier if this was a develop flag

Develop flags are not available in product bits and the point of adding
the MallocMaxBytes option is so that we can run these memory exhaustion
tests reasonably even with product bits. The longest run for the revised
test in JPRT was just over 30 seconds on a Solaris SPARC machine. One of
the open test bugs talks about the test swamping the machine for tens of
minutes and sometimes being unkillable. Which is why JTREG had problems
cleaning up after the test...

> and the code 589-600 was a function check_malloc_bytes() and 604-612 
> was add_malloc_bytes().

Lastly, you want to change the in-line diagnostic code into a function
call and you're worried about slowing things down?


> It's hard to see the ::malloc() there now!
It was hard to the ::malloc() in there before and yes, this
fix makes it worse. However, how often is anyone in this code
besides Zhengyu and now Ron?

Dan


>
> Coleen
>
> On 3/12/2013 8:22 PM, Ron Durbin wrote:
>> David
>>
>> Dan has provided a link. So you can access the webrev and review the 
>> changes.
>>
>> Ron
>>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Tuesday, March 12, 2013 8:43 AM
>> To: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: Code Review request for new MallocMaxBytes option and 
>> fixed test (7030610, 7037122, 7123945)
>>
>> David,
>>
>> Thanks for spotting the URL hiccup. I missed it during my review.
>>
>> There is something wrong with Ron's access to cr.openjdk.java.net so 
>> I'm going to host his webrev for this review:
>>
>> http://cr.openjdk.java.net/~dcubed/for_rdurbin/7030610-webrev/0-hsx25/
>>
>> Dan
>>
>>
>> On 3/11/13 8:54 PM, David Holmes wrote:
>>> Ron,
>>>
>>> On 12/03/2013 10:01 AM, Ron Durbin wrote:
>>>> All
>>>>
>>>> I have a code review readyfor:
>>>>
>>>> WEBREV http://javaweb.us.oracle.com/~rdurbin/7030610-webrev/0-hsx25/
>>> Please make available an open webrev on cr.openjdk.java.net
>>>
>>> Thank you.
>>> David
>>>
>>>> 7030610 runtime/6878713/Test6878713.sh fails, Error. failed to clean
>>>> up files after test.
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7030610
>>>>
>>>> https://jbs.oracle.com/bugs/browse/JDK-7030610
>>>>
>>>> 7037122 runtime/6878713/Test6878713.sh is written incorrectly
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7037122
>>>>
>>>> https://jbs.oracle.com/bugs/browse/JDK-7037122
>>>>
>>>> 7123945 runtime/6878713/Test6878713.sh require about 2G of
>>>>
>>>> _http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7123945_
>>>>
>>>> https://jbs.oracle.com/bugs/browse/JDK-7123945
>>>>
>>>> Summary: Add new diagnostic option -XX:MallocMaxBytes=NNN and fix
>>>>
>>>> Test6878713.sh. Test6878713.sh is a security regression test with
>>>> four different failure modes.Three aredocumented in the defects above
>>>> and the forth
>>>>
>>>> was found duringbug triage. The root cause of these failures can be
>>>>
>>>> traced back to two issues:Inconsistent test conditionsetup and error
>>>> handling.
>>>>
>>>> All four defect manifestations are fixed by this set of changes.
>>>>
>>>> New code built and test executed via JPRT test job; Linux ARM and
>>>> Linux PPC version built, but not tested because ARM and PPC are
>>>> currently build-only platforms in JPRT.
>>>>
>>>> As always, comments and questions are welcome.
>>>>
>>>> Ron
>>>>
>>>
>



More information about the hotspot-runtime-dev mailing list