RFR(xxxs) (jdk10): 8168542: os::realloc should return a valid pointer for input size=0

Thomas Stüfe thomas.stuefe at gmail.com
Wed Mar 1 08:07:49 UTC 2017


Hi,

thank you all for the decision. I updated the webrev in place, no change,
just added the reviewers:

http://cr.openjdk.java.net/~stuefe/webrevs/8168542-os_realloc_size_0/jdk10-webrev.01/

I'll need someone to sponsor this change.

Kind Regards, Thomas



On Wed, Mar 1, 2017 at 5:49 AM, Chris Plummer <chris.plummer at oracle.com>
wrote:

> Hi Thomas,
>
> "first patch" is malloc(0) is treated as malloc(1), right? If that's the
> case, you already know I'm in agreement. Glad to see others are also
> agreeing. Not pretty, but reasonable given the current state of things.
>
> cheers,
>
> Chris
>
>
> On 2/28/17 7:21 AM, Thomas Stüfe wrote:
>
> Thanks Robbin! I'll wait for the others to chime in before continuing work
> on this.
>
> Kind Regards, Thomas
>
> On Tue, Feb 28, 2017 at 12:52 PM, Robbin Ehn <robbin.ehn at oracle.com>
> wrote:
>
>> Thanks for testing the assert approach.
>>
>> Since you found malloc(0) is an "allowed pattern" I'm fine with the first
>> patch.
>> We should add some comments about this in os.hpp
>>
>> And thanks again for the investigation!
>>
>> /Robbin
>>
>>
>> On 02/27/2017 12:20 PM, Thomas Stüfe wrote:
>>
>>> I attempted to "fix" some of the occurrences by skipping the malloc() if
>>> n was zero, but
>>> in all cases the coding ended up less readable than before. (Sorry, I
>>> cannot post this fix attempt, I accidentally deleted my patch).
>>>
>>> Here are some examples of asserts:
>>>
>>> --
>>>
>>> 1) 33 V  [libjvm.so+0xb17625]  G1VerifyYoungCSetIndicesClosur
>>> e::G1VerifyYoungCSetIndicesClosure(unsigned long)+0xbb
>>>  34 V  [libjvm.so+0xb16588]  G1CollectionSet::verify_young_cset_indices()
>>> const+0x1a8
>>>
>>> Happens in a lot of tests. G1CollectionSet::verify_young_cset_indices()
>>> is called when G1CollectionSet::_collection_set_cur_length is zero. I
>>> assume this is fine,
>>> verification could just be skipped at this point, so the fix would be
>>> trivial.
>>>
>>> --
>>>
>>> gc/arguments/TestG1ConcRefinementThreads: VM is called with
>>> -XX:G1ConcRefinementThreads=0. Then, we assert here:
>>>
>>>  32 V  [libjvm.so+0x5adeee]  AllocateHeap(unsigned long, MemoryType,
>>> NativeCallStack const&, AllocFa-XX:G1ConcRefinementThr
>>> eads=0ilStrategy::AllocFailEnum)+0x3b
>>>  33 V  [libjvm.so+0x954689]  ConcurrentG1Refine::create(CardTableEntryClosure*,
>>> int*)+0x197
>>>  34 V  [libjvm.so+0xaf596a]  G1CollectedHeap::initialize()+0x1a4
>>>  35 V  [libjvm.so+0x127e940]  Universe::initialize_heap()+0x62
>>>
>>> Not sure at which layer one would handle this. If
>>> -XX:G1ConcRefinementThreads=0 makes no sense as an option, should this
>>> setting not just be forbidden?
>>>
>>> --
>>> At a number of tests linux numa initialization failed:
>>>
>>>  32 V  [libjvm.so+0x5adeee]  AllocateHeap(unsigned long, MemoryType,
>>> NativeCallStack const&, AllocFailStrategy::AllocFailEnum)+0x3b
>>>  33 V  [libjvm.so+0xbf7aff]  GenericGrowableArray::raw_allo
>>> cate(int)+0x12f
>>>  34 V  [libjvm.so+0x6b6b89]  GrowableArray<int>::GrowableArray(int,
>>> bool, MemoryType)+0x65
>>>  35 V  [libjvm.so+0x105b294]  os::Linux::libnuma_init()+0x17e
>>>
>>> Here, a GrowableArray is initialized with size = 0. This seems to be an
>>> allowed pattern, so GrowableArray would have to be fixed to delay the
>>> malloc until the first real
>>> allocation.
>>>
>>> --
>>>
>>> So, options:
>>>
>>> - we go this way - all the occurrences would have to be fixed, and
>>> programmers would have to be made aware of the changed behavior of
>>> os::malloc(size=0).
>>> - we could only assert for realloc(size=0), but I find this even more
>>> inconsistent than it is now.
>>> - we could rethink the original decision.
>>>
>>> More things to ponder:
>>>
>>> - what about free(NULL), do we want to assert here too? On one hand,
>>> this may indicate an error, possibly a double free(). On the other hand,
>>> passing NULL to free() is even
>>> more a known established pattern than passing size=0 to malloc, and
>>> programmers may just do this expecting os::free() to have the same
>>> semantics as ::free().
>>>
>>> - this patch relies on the fact that we have control about who calls
>>> os::malloc/os::free (similar to the decision to use malloc headers in both
>>> NMT and os.cpp). Should we
>>> ever open up our malloc/free calls to the outside world, we may loose
>>> this control and would have to make sure that os::malloc/os::free behave
>>> like ::malloc/::free. Why
>>> would we open os::malloc to the outside? For instance, to get NMT malloc
>>> coverage over the JDK native libraries. Granted, a lot more would have to
>>> change to enable this.
>>> (Note that we have a malloc statistic in place in our fork which spans
>>> hotspot + native jdk, and we do this by calling os::malloc from the jdk
>>> native code).
>>>
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>> On Wed, Feb 8, 2017 at 10:01 AM, Robbin Ehn <robbin.ehn at oracle.com
>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>
>>>     Hi,
>>>
>>>     On 02/07/2017 02:24 AM, Chris Plummer wrote:
>>>
>>>         I don't think we can push a change to assert when size == 0
>>> without first thoroughly testing it and fixing any places that currently do
>>> that. Testing may turn up
>>>         nothing,
>>>         or it may turn up a rat's nest. Just something to think about
>>> before starting down this path.
>>>
>>>
>>>     Since there is a while since jdk10 will be GA, now is the best time.
>>>
>>>
>>>         I'm not as offended by the size 1 approach as others, and it is
>>> what we already do for malloc(). It probably  makes sense to have malloc
>>> and realloc be consistent
>>>         in this
>>>         regard, regardless of what the native version do by default
>>> (which can vary). Just given the fact that we are even having this
>>> discussion and debating which is best
>>>         tends
>>>         to make me think that caller's of malloc() and realloc() either
>>> didn't even give this topic any thought, or possibly came to differing
>>> conclusions on whether or not it
>>>         mattered if they passed in a size == 0.
>>>
>>>
>>>     I agree that realloc and malloc should do the same.
>>>
>>>     IEEE Std 1003.1, 2004 Edition says:
>>>     "If size is 0 and ptr is not a null pointer, the object pointed to
>>> is freed."
>>>
>>>     So as I said in previously mail, there is nothing wrong with doing
>>> realloc(ptr, 0), but I also don't think we should do it hotspot.
>>>     Doing malloc(0) is undefined, so this really should be at least an
>>> assert.
>>>
>>>     I'm all favor assert for both realloc and malloc.
>>>
>>>     I can do pre-testing, KS, jtreg, tonga or whatever we think is
>>> appropriate.
>>>     (if this is a rat's nest we can change our minds...)
>>>
>>>     /Robbin
>>>
>>>
>>>
>>>
>>>         Chris
>>>
>>>         On 2/6/17 3:49 AM, Dmitry Samersoff wrote:
>>>
>>>             Thomas,
>>>
>>>             I share all David's concerns.
>>>
>>>             Both Linux and BSD return NULL if size for realloc is zero.
>>>
>>>             IMHO, if we call realloc with size == 0 it means a program
>>> error on
>>>             higher level and we should fix it rather than hide.
>>>
>>>             So it might be better to assert size > 0 inside realloc and
>>> return NULL.
>>>
>>>             -Dmitry
>>>
>>>
>>>             On 2017-02-05 13:56, Thomas Stüfe wrote:
>>>
>>>                 Hi David,
>>>
>>>                 some context: the whole issue came into being because
>>> for a long time we
>>>                 had exchanged os::malloc/os::realloc with our own
>>> versions for malloc
>>>                 tracking purposes. This was way before NMT.
>>>
>>>                 When writing the os::malloc/realloc replacements and
>>> associated tests, we
>>>                 took care to mimick the behavior of native
>>> malloc/realloc in a consistent
>>>                 matter. That way we could exchange malloc calls with my
>>> os::malloc on a
>>>                 case by case base without behavior change (or put it
>>> differently, nothing
>>>                 would break if I accidentally miss an instrumentation in
>>> code and leave it
>>>                 a native malloc/realloc)
>>>
>>>                 Now we went back to the OpenJDK version of
>>> os::malloc()/os::realloc() to
>>>                 reduce maintenance, and my cornercase tests fail. Hence
>>> this bug report.
>>>
>>>                 Please find further answers inline.
>>>
>>>                 On Sun, Feb 5, 2017 at 3:50 AM, David Holmes <
>>> david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
>>>                 wrote:
>>>
>>>                     Hi Thomas,
>>>
>>>                     On 4/02/2017 10:09 PM, Thomas Stüfe wrote:
>>>
>>>                         Hi guys,
>>>
>>>                         picking up this trivial issue which got pushed
>>> to jdk10, could I please
>>>                         have a re-review? Oh, also a sponsor.
>>>
>>>                         Issue:
>>>                         https://bugs.openjdk.java.net/browse/JDK-8168542
>>> <https://bugs.openjdk.java.net/browse/JDK-8168542>
>>>
>>>                         webrev:
>>>                         http://cr.openjdk.java.net/~st
>>> uefe/webrevs/8168542-os_reallo <http://cr.openjdk.java.net/~s
>>> tuefe/webrevs/8168542-os_reallo>
>>>                         c_size_0/jdk10-webrev.01/webrev/
>>>
>>>                         For reference, the old discussion back in
>>> october 16:
>>>                         http://mail.openjdk.java.net/p
>>> ipermail/hotspot-runtime-dev/2 <http://mail.openjdk.java.net/
>>> pipermail/hotspot-runtime-dev/2>
>>>
>>>                         016-October/021675.html
>>>
>>>                     I am still rather confused by both existing code and
>>> proposed change - the
>>>                     code just seems wrong if passed in a zero size.
>>> Passing in zero is a bug
>>>                     AFAICS and if buggy code passes in zero then how do
>>> we know what it will do
>>>                     with the returned value from os::realloc ?
>>>
>>>
>>>
>>>                 Currently, in the OpenJDK version, we have the following
>>> behavior:
>>>
>>>                 1 ASSERT
>>>                   a os::malloc
>>>                 size==0: return valid pointer to 1 byte array
>>>                   b os::realloc
>>>                 ptr != NULL && size==0: return NULL
>>>
>>>                 2 !ASSERT
>>>                   a      os::malloc
>>>                 size==0: return valid pointer to 1 byte array
>>>                   b      os::realloc
>>>                 ptr != NULL && size==0: whatever the native realloc
>>> does, which may be
>>>                 returning NULL or returning a valid allocation.
>>>
>>>                 This is inconsistent.
>>>
>>>                 A) 1a and 1b, although arguably similar cases, behave
>>> differently.
>>>                 B) 1b and 2b may behave differently, depending on the
>>> behaviour of the
>>>                 native CRT calls. Which is IMHO worse than case (A). You
>>> do not want
>>>                 different behaviour between ASSERT and !ASSERT.
>>>
>>>                 Additionally, for (B) it is not good that we delegate
>>> cornercase behavior
>>>                 to the native CRT, because that may lead to
>>> os::realloc() to have different
>>>                 behavior between platforms. On one platform,
>>> ::realloc(ptr,0) may return
>>>                 NULL, on another it may return a pointer. So you get
>>> platform dependent
>>>                 errors.
>>>
>>>                 My patch proposes to change the behavior like this:
>>>
>>>                 1 ASSERT
>>>                   a os::malloc
>>>                 size==0: return valid pointer to 1 byte array
>>>                   b os::realloc
>>>                 ptr != NULL && size==0: return valid pointer to 1 byte
>>> array
>>>
>>>                 2 !ASSERT
>>>                   a      os::malloc
>>>                 size==0: return valid pointer to 1 byte array
>>>                   b      os::realloc
>>>                 ptr != NULL && size==0: return valid pointer to 1 byte
>>> array
>>>
>>>                 This is more consistent.
>>>
>>>                 -----------
>>>
>>>                 Aside from the consistency issue:
>>>
>>>                 I think if someone passes 0 as resize size, this may
>>> mean he calculated the
>>>                 resize size wrong. Note that this does not necessarily
>>> mean a fatal error,
>>>                 if he has the correct notion of buffer size - 0 - and
>>> does not overwrite
>>>                 anything, nothing bad will happen.
>>>
>>>                 In this situation you have three choices
>>>
>>>                 1) return NULL. Caller will either incorrectly access
>>> the pointer and
>>>                 crash, or test the pointer and all cases I saw then
>>> assume OOM. Also bad.
>>>
>>>                 2) assert. This just seems plain wrong. Inconsistent
>>> with CRT behavior, and
>>>                 what do you do in release code?
>>>
>>>                 3) return a 1 byte allocation. That is what I propose.
>>> This may in the
>>>                 worst case lead to memory leaks; but only if caller
>>> expected the behaviour
>>>                 of os::realloc(ptr, 0) to be "free ptr and return NULL",
>>> which I have seen
>>>                 nowhere. If caller is oblivious to giving us size 0, he
>>> will be freeing the
>>>                 memory later.
>>>
>>>                 (Note that you could easily prevent memory leaks by just
>>> returning a
>>>                 special global static sentinel pointer for size==0, but
>>> you have to be sure
>>>                 that all os::realloc calls are matched by os::free
>>> calls.)
>>>
>>>                 I see your point about treating size==0 as an error. But
>>> I am not a big
>>>                 fan, for the mentioned reasons, and would prefer
>>> os::realloc() to behave
>>>                 like native realloc.
>>>
>>>                 But if we follow your suggestion and make size==0 an
>>> assertable offense, we
>>>                 should do so too for malloc(0).
>>>
>>>                 Kind Regards, Thomas
>>>
>>>
>>>
>>>                 David
>>>
>>>
>>>                     The webrev is unchanged from the proposal for jdk9,
>>> just rebased to
>>>
>>>                         jdk10/hs.
>>>
>>>                         @Chris, I decided to not follow your suggestion
>>> to move the comparison
>>>                         into the #ifndef assert. You are right, this is
>>> redundant with the check
>>>                         inside os::malloc(), but as you said, checking
>>> at the entry of
>>>                         os::realloc() makes it clearer.
>>>
>>>                         Thank you!
>>>
>>>                         Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>
>


More information about the hotspot-runtime-dev mailing list