Round 3 RFR (S) 6424123: JVM crashes on failed 'strdup' call.
Zhengyu Gu
zhengyu.gu at oracle.com
Mon Jul 28 18:55:50 UTC 2014
Thanks for the review.
Yes, I will change strdup() to os::strdup() in os::strdup_check_oom().
Thanks,
-Zhengyu
On 7/28/2014 2:50 PM, Christian Tornqvist wrote:
> I agree with Coleen, this change is good. You should change the strdup call inside os::strdup_check_oom to os::strdup to make it clear that we're not calling the runtime strdup.
>
> Thanks,
> Christian
>
> -----Original Message-----
> From: hotspot-runtime-dev
> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Coleen
> Phillimore
> Sent: Monday, July 28, 2014 1:10 PM
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: Round 3 RFR (S) 6424123: JVM crashes on failed 'strdup' call.
>
>
> Zhengyu, see below.
>
> On 7/16/14, 4:57 PM, Coleen Phillimore wrote:
>> On 7/11/14, 9:53 AM, Zhengyu Gu wrote:
>>> Hi David,
>>>
>>> Thanks for the review. Please see comments inline.
>>>
>>> On 7/11/2014 2:33 AM, David Holmes wrote:
>>>> Hi Zhengyu,
>>>>
>>>> A number of comments ...
>>>>
>>>> First an aside ... based on the recent header file changes for
>>>> .iniline.hpp file I would think these pairs:
>>>>
>>>> +#include "memory/allocation.hpp"
>>>> +#include "memory/allocation.inline.hpp"
>>>>
>>>> would reduce to
>>>>
>>>> +#include "memory/allocation.inline.hpp"
>>>>
>>>> if allocation.inline.hpp included allocation.hpp as it apparently
>>>> should.
>>>>
>>> I will fix them.
>>>
>>>> ---
>>>>
>>>> In the vm_version_* changes are you sure os::malloc will be ready to
>>>> work very early in the initialization sequence of the VM? os::malloc
>>>> calls out to a lot of other code especially in debug mode - eg: will
>>>> check_heap() work, GuardedMemory, MemTracker itself?
>>>>
>>>> Similarly with the use of os::strdup_check_oom - will it actually
>>>> manage to terminate cleanly if OOME happens?
>>>>
>>>> Same comments for arguments.cpp changes. It may be too early to call
>>>> os::strdup_check_oom.
>>>>
>>> The earliest os::malloc() calls are from initialization of static
>>> objects, ahead of vm_version_* initialization, so I don't think it is
>>> an issue.
>>>
>>>> ---
>>>>
>>>> src/share/vm/compiler/compilerOracle.cpp
>>>>
>>>> It's not obvious that your change is correct. If
>>>> MethodMatcher::match is called several times might you not want to
>>>> duplicate the string on each call? That said it isn't obvious that
>>>> these strings need copying at all - we'd need to trace their
>>>> complete lifecycle.
>>>>
>>>> Similar comments for NamedCounter in runtime.cpp and vmNode in
>>>> fprofiler.cpp
>>>>
>>>> Having to add all those destructors seems messy and it isn't evident
>>>> that it is necessary.
>>> I moved 'strdup' from caller to the class, I think it makes the
>>> ownership more explicit. I can not tell if the string needs to be
>>> duplicated, or vmNode, NamedCounter ever to be released. If so, it
>>> looks to me like memory leaks.
>> I agree with Zhengyu. I think this is better than doing strdup at
>> all of the call sites also. It looks like these NamedCounters do leak
>> the string here:
>>
>> void FastLockNode::create_rtm_lock_counter(JVMState* state) { #if
>> INCLUDE_RTM_OPT
>> Compile* C = Compile::current();
>> if (C->profile_rtm() || (PrintPreciseRTMLockingStatistics &&
>> C->use_rtm())) {
>> RTMLockingNamedCounter* rlnc = (RTMLockingNamedCounter*)
>> OptoRuntime::new_named_counter(state,
>> NamedCounter::RTMLockingCounter);
>> _rtm_counters = rlnc->counters();
>> if (UseRTMForStackLocks) {
>> rlnc = (RTMLockingNamedCounter*)
>> OptoRuntime::new_named_counter(state,
>> NamedCounter::RTMLockingCounter); <= never deallocates duplicated
>> string in this function
>> _stack_rtm_counters = rlnc->counters();
>> }
>> }
>> #endif
>> }
>>
>> I don't this is for Zhenyu to fix though. I think Zhenyu has provided
>> a destructor to use to deallocate the string properly.
> Can you file a bug for this if you agree with my assertion that there is a
> leak here?
>
>> MethodOptionMatcher is debatable though. It does copy the option each
>> time, but maybe the copy should be outside MethodOptionMatcher since
>> the reason for copying is that the name is on the stack in the caller
>> of add_option_string(). This one maybe shouldn't have the destructor
>> change.
> I meant to say that this is a matter of opinion and looks fine if you want
> to keep it.
>
> Again, I think this change is good.
>
> Coleen
>> fprofiler looks pretty clear cut that copying the string inside the
>> class is better.
>>
>> Coleen
>>> CC'd compiler mail list.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>>> ---
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 11/07/2014 3:39 AM, Zhengyu Gu wrote:
>>>>> Sorry for the long delay. The update is mainly based on Coleen's
>>>>> suggestion.
>>>>>
>>>>> DuplicateString() is changed to os::strdup_check_oom().
>>>>>
>>>>> Bug:https://bugs.openjdk.java.net/browse/JDK-6424123
>>>>> <https://bugs.openjdk.java.net/browse/JDK-6424123>
>>>>> Webrev:http://cr.openjdk.java.net/~zgu/6424123/webrev.02/
>>>>> <http://cr.openjdk.java.net/%7Ezgu/6424123/webrev.02/>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Zhengyu
>>>>>
>>>>> <http://cr.openjdk.java.net/%7Ezgu/6424123/webrev.02/>
>>>>>
>>>>>
>>>>> On 6/10/2014 5:30 PM, Coleen Phillimore wrote:
>>>>>> Hi, I think this is potentially a really good cleanup. May I
>>>>>> suggest making DuplicateString an os function like
>>>>>> os::duplicate_string() and have it always throw OOM, so that the
>>>>>> alloc_failmode parameter doesn't look inconsistent.
>>>>>>
>>>>>> If new code wants to call os::strdup() and check for null return,
>>>>>> it can do that. Otherwise call os::safe_duplicate_string() or
>>>>>> os::duplicate_string() as a wrapper to a os::strdup() that doesn't
>>>>>> return null.
>>>>>>
>>>>>> I'm not convinced the original bug is caused by missing a null
>>>>>> check to strdup (unless it passed into strdup an already null
>>>>>> string). But this cleans up using this unsafe function directly.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 6/9/14, 11:50 PM, David Holmes wrote:
>>>>>>> On 9/06/2014 11:43 PM, Zhengyu Gu wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> On 6/8/2014 5:59 PM, David Holmes wrote:
>>>>>>>>> On 8/06/2014 10:48 PM, Zhengyu Gu wrote:
>>>>>>>>>> Hi David,
>>>>>>>>>>
>>>>>>>>>> On 6/8/2014 5:03 AM, David Holmes wrote:
>>>>>>>>>>> Hi Zhengyu,
>>>>>>>>>>>
>>>>>>>>>>> Still a bit perplexed by the aim here. Why replace non-null
>>>>>>>>>>> checked strdup calls with non-null os::strdup?
>>>>>>>>>> ::strdup() is not tracked by NMT, but os::strdup() is.
>>>>>>>>> Okay, but it still needs to be checked.
>>>>>>>>>
>>>>>>>> I am not sure what needs to be checked?
>>>>>>> The result of calling os::strdup - else we haven't fixed the
>>>>>>> original bug. Unless you are saying that all uses of os::strdup
>>>>>>> do check the return value somewhere in the call chain?
>>>>>>>
>>>>>>>>>>> If the issue is that the result of strdup must be checked
>>>>>>>>>>> then it must be checked. Why add DuplicateString instead of
>>>>>>>>>>> changing what os::strdup does?
>>>>>>>>>> I replaced strdup()/os::strdup() with DuplicateString(), where
>>>>>>>>>> caller does not check null pointer. If caller checks null
>>>>>>>>>> pointer, I left it alone.
>>>>>>>>> That is not apparent from the webrev eg:
>>>>>>>>>
>>>>>>>>> src/os/windows/vm/perfMemory_windows.cpp
>>>>>>>>>
>>>>>>>> There are several places that are not obvious. Sometimes, I have
>>>>>>>> to go a few levels to find out how the pointers are used.
>>>>>>>>
>>>>>>>> For this particular case, sharedmem_fileName is only referenced
>>>>>>>> by PerfMemory::backing_store_filename(), but there is no caller
>>>>>>>> to this function.
>>>>>>>>
>>>>>>>> $ grep -r backing_store_filename *
>>>>>>>> os/aix/vm/perfMemory_aix.cpp:char*
>>>>>>>> PerfMemory::backing_store_filename() {
>>>>>>>> os/bsd/vm/perfMemory_bsd.cpp:char*
>>>>>>>> PerfMemory::backing_store_filename() {
>>>>>>>> os/linux/vm/perfMemory_linux.cpp:char*
>>>>>>>> PerfMemory::backing_store_filename() {
>>>>>>>> os/solaris/vm/perfMemory_solaris.cpp:char*
>>>>>>>> PerfMemory::backing_store_filename() {
>>>>>>>> os/windows/vm/perfMemory_windows.cpp:char*
>>>>>>>> PerfMemory::backing_store_filename() {
>>>>>>>> share/vm/runtime/perfMemory.hpp: static char*
>>>>>>>> backing_store_filename();
>>>>>>>>
>>>>>>>> I thought parfait should be good at finding these things.
>>>>>>>>
>>>>>>>>>> DuplicateString() mirrors what AllocateHeap() does, and
>>>>>>>>>> os::strdup()
>>>>>>>>>> mirrors os::malloc(). AllocateHeap()/DuplicateString() can
>>>>>>>>>> handle OOM (by default), but os::strdup()/os::malloc() do not.
>>>>>>>>> I don't see the need for this duality here. Why do we need to
>>>>>>>>> dup strings in different memory areas? Why can't os::strdup do
>>>>>>>>> the null check internally and abort if requested?
>>>>>>>>>
>>>>>>>>> Putting it another way if someone writes a new piece of code
>>>>>>>>> where they need to dup an incoming string, what determines
>>>>>>>>> whether they should use os::strdup or DuplicateString?
>>>>>>>>>
>>>>>>>> I agree it is confusing that we have some many memory allocation
>>>>>>>> functions to accomplish similar things, given AllocateHeap() and
>>>>>>>> ReallocateHeap() can also return NULL now. That's also the
>>>>>>>> reason to get me wrong in the first iteration.
>>>>>>>>
>>>>>>>> The purposed approach at least keep os::malloc(), os::realloc()
>>>>>>>> and
>>>>>>>> os::strdup() as replacement of c library's counterparts.
>>>>>>>> Having os::strdup() handles OOM, but not os::malloc() and
>>>>>>>> os::realloc(), seems to me even more confusing and inconsistent.
>>>>>>>>
>>>>>>>> To determine whether uses os::strdup() or DuplicationString(),
>>>>>>>> should be the same way to determine whether uses os::malloc() or
>>>>>>>> AllocateHeap(), in my opinion.
>>>>>>> AllocateHeap should only be being used as the implementation for
>>>>>>> CHeapObj and related types, it is not a general purpose
>>>>>>> allocation interface. So I don't see that the analogy holds. I'd
>>>>>>> rather see only os::strdup used with all the clients doing the
>>>>>>> null check (somewhere in the stack) and handling it
>>>>>>> appropriately.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -Zhengyu
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> -Zhengyu
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>> On 7/06/2014 3:05 AM, Zhengyu Gu wrote:
>>>>>>>>>>>> Updated webrev introduces a new DuplicateString() function,
>>>>>>>>>>>> that handles OOM, similar to AllocateHeap(), and replaces
>>>>>>>>>>>> the call sites that do not check NULL pointer with this
>>>>>>>>>>>> DuplicateString().
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~zgu/6424123/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> -Zhengyu
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/4/2014 4:32 PM, Zhengyu Gu wrote:
>>>>>>>>>>>>> JVM should avoid C library's strdup() and use os::strdup()
>>>>>>>>>>>>> instead.
>>>>>>>>>>>>> os::strdup() handles OOM, so can avoid JVM crash.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, made limited scope of code cleanup, which makes
>>>>>>>>>>>>> memory ownership more explicit, so they can be claimed by
>>>>>>>>>>>>> object's destructors.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6424123
>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/6424123/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tests:
>>>>>>>>>>>>> - JPRT
>>>>>>>>>>>>> - vm.quick.testlist on Linux x64.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Zhengyu
>
More information about the hotspot-runtime-dev
mailing list