RFR(xxxs) (jdk10): 8168542: os::realloc should return a valid pointer for input size=0
Thomas Stüfe
thomas.stuefe at gmail.com
Tue Feb 28 15:21:27 UTC 2017
Thanks Robbin! I'll wait for the others to chime in before continuing work
on this.
Kind Regards, Thomas
On Tue, Feb 28, 2017 at 12:52 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
> Thanks for testing the assert approach.
>
> Since you found malloc(0) is an "allowed pattern" I'm fine with the first
> patch.
> We should add some comments about this in os.hpp
>
> And thanks again for the investigation!
>
> /Robbin
>
>
> On 02/27/2017 12:20 PM, Thomas Stüfe wrote:
>
>> I attempted to "fix" some of the occurrences by skipping the malloc() if
>> n was zero, but
>> in all cases the coding ended up less readable than before. (Sorry, I
>> cannot post this fix attempt, I accidentally deleted my patch).
>>
>> Here are some examples of asserts:
>>
>> --
>>
>> 1) 33 V [libjvm.so+0xb17625] G1VerifyYoungCSetIndicesClosur
>> e::G1VerifyYoungCSetIndicesClosure(unsigned long)+0xbb
>> 34 V [libjvm.so+0xb16588] G1CollectionSet::verify_young_cset_indices()
>> const+0x1a8
>>
>> Happens in a lot of tests. G1CollectionSet::verify_young_cset_indices()
>> is called when G1CollectionSet::_collection_set_cur_length is zero. I
>> assume this is fine,
>> verification could just be skipped at this point, so the fix would be
>> trivial.
>>
>> --
>>
>> gc/arguments/TestG1ConcRefinementThreads: VM is called with
>> -XX:G1ConcRefinementThreads=0. Then, we assert here:
>>
>> 32 V [libjvm.so+0x5adeee] AllocateHeap(unsigned long, MemoryType,
>> NativeCallStack const&, AllocFa-XX:G1ConcRefinementThreads=0ilStrategy::
>> AllocFailEnum)+0x3b
>> 33 V [libjvm.so+0x954689] ConcurrentG1Refine::create(CardTableEntryClosure*,
>> int*)+0x197
>> 34 V [libjvm.so+0xaf596a] G1CollectedHeap::initialize()+0x1a4
>> 35 V [libjvm.so+0x127e940] Universe::initialize_heap()+0x62
>>
>> Not sure at which layer one would handle this. If
>> -XX:G1ConcRefinementThreads=0 makes no sense as an option, should this
>> setting not just be forbidden?
>>
>> --
>> At a number of tests linux numa initialization failed:
>>
>> 32 V [libjvm.so+0x5adeee] AllocateHeap(unsigned long, MemoryType,
>> NativeCallStack const&, AllocFailStrategy::AllocFailEnum)+0x3b
>> 33 V [libjvm.so+0xbf7aff] GenericGrowableArray::raw_allo
>> cate(int)+0x12f
>> 34 V [libjvm.so+0x6b6b89] GrowableArray<int>::GrowableArray(int,
>> bool, MemoryType)+0x65
>> 35 V [libjvm.so+0x105b294] os::Linux::libnuma_init()+0x17e
>>
>> Here, a GrowableArray is initialized with size = 0. This seems to be an
>> allowed pattern, so GrowableArray would have to be fixed to delay the
>> malloc until the first real
>> allocation.
>>
>> --
>>
>> So, options:
>>
>> - we go this way - all the occurrences would have to be fixed, and
>> programmers would have to be made aware of the changed behavior of
>> os::malloc(size=0).
>> - we could only assert for realloc(size=0), but I find this even more
>> inconsistent than it is now.
>> - we could rethink the original decision.
>>
>> More things to ponder:
>>
>> - what about free(NULL), do we want to assert here too? On one hand, this
>> may indicate an error, possibly a double free(). On the other hand, passing
>> NULL to free() is even
>> more a known established pattern than passing size=0 to malloc, and
>> programmers may just do this expecting os::free() to have the same
>> semantics as ::free().
>>
>> - this patch relies on the fact that we have control about who calls
>> os::malloc/os::free (similar to the decision to use malloc headers in both
>> NMT and os.cpp). Should we
>> ever open up our malloc/free calls to the outside world, we may loose
>> this control and would have to make sure that os::malloc/os::free behave
>> like ::malloc/::free. Why
>> would we open os::malloc to the outside? For instance, to get NMT malloc
>> coverage over the JDK native libraries. Granted, a lot more would have to
>> change to enable this.
>> (Note that we have a malloc statistic in place in our fork which spans
>> hotspot + native jdk, and we do this by calling os::malloc from the jdk
>> native code).
>>
>>
>> Kind Regards, Thomas
>>
>>
>>
>>
>> On Wed, Feb 8, 2017 at 10:01 AM, Robbin Ehn <robbin.ehn at oracle.com
>> <mailto:robbin.ehn at oracle.com>> wrote:
>>
>> Hi,
>>
>> On 02/07/2017 02:24 AM, Chris Plummer wrote:
>>
>> I don't think we can push a change to assert when size == 0
>> without first thoroughly testing it and fixing any places that currently do
>> that. Testing may turn up
>> nothing,
>> or it may turn up a rat's nest. Just something to think about
>> before starting down this path.
>>
>>
>> Since there is a while since jdk10 will be GA, now is the best time.
>>
>>
>> I'm not as offended by the size 1 approach as others, and it is
>> what we already do for malloc(). It probably makes sense to have malloc
>> and realloc be consistent
>> in this
>> regard, regardless of what the native version do by default
>> (which can vary). Just given the fact that we are even having this
>> discussion and debating which is best
>> tends
>> to make me think that caller's of malloc() and realloc() either
>> didn't even give this topic any thought, or possibly came to differing
>> conclusions on whether or not it
>> mattered if they passed in a size == 0.
>>
>>
>> I agree that realloc and malloc should do the same.
>>
>> IEEE Std 1003.1, 2004 Edition says:
>> "If size is 0 and ptr is not a null pointer, the object pointed to is
>> freed."
>>
>> So as I said in previously mail, there is nothing wrong with doing
>> realloc(ptr, 0), but I also don't think we should do it hotspot.
>> Doing malloc(0) is undefined, so this really should be at least an
>> assert.
>>
>> I'm all favor assert for both realloc and malloc.
>>
>> I can do pre-testing, KS, jtreg, tonga or whatever we think is
>> appropriate.
>> (if this is a rat's nest we can change our minds...)
>>
>> /Robbin
>>
>>
>>
>>
>> Chris
>>
>> On 2/6/17 3:49 AM, Dmitry Samersoff wrote:
>>
>> Thomas,
>>
>> I share all David's concerns.
>>
>> Both Linux and BSD return NULL if size for realloc is zero.
>>
>> IMHO, if we call realloc with size == 0 it means a program
>> error on
>> higher level and we should fix it rather than hide.
>>
>> So it might be better to assert size > 0 inside realloc and
>> return NULL.
>>
>> -Dmitry
>>
>>
>> On 2017-02-05 13:56, Thomas Stüfe wrote:
>>
>> Hi David,
>>
>> some context: the whole issue came into being because for
>> a long time we
>> had exchanged os::malloc/os::realloc with our own
>> versions for malloc
>> tracking purposes. This was way before NMT.
>>
>> When writing the os::malloc/realloc replacements and
>> associated tests, we
>> took care to mimick the behavior of native malloc/realloc
>> in a consistent
>> matter. That way we could exchange malloc calls with my
>> os::malloc on a
>> case by case base without behavior change (or put it
>> differently, nothing
>> would break if I accidentally miss an instrumentation in
>> code and leave it
>> a native malloc/realloc)
>>
>> Now we went back to the OpenJDK version of
>> os::malloc()/os::realloc() to
>> reduce maintenance, and my cornercase tests fail. Hence
>> this bug report.
>>
>> Please find further answers inline.
>>
>> On Sun, Feb 5, 2017 at 3:50 AM, David Holmes <
>> david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
>> wrote:
>>
>> Hi Thomas,
>>
>> On 4/02/2017 10:09 PM, Thomas Stüfe wrote:
>>
>> Hi guys,
>>
>> picking up this trivial issue which got pushed to
>> jdk10, could I please
>> have a re-review? Oh, also a sponsor.
>>
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8168542
>> <https://bugs.openjdk.java.net/browse/JDK-8168542>
>>
>> webrev:
>> http://cr.openjdk.java.net/~st
>> uefe/webrevs/8168542-os_reallo <http://cr.openjdk.java.net/~s
>> tuefe/webrevs/8168542-os_reallo>
>> c_size_0/jdk10-webrev.01/webrev/
>>
>> For reference, the old discussion back in october
>> 16:
>> http://mail.openjdk.java.net/p
>> ipermail/hotspot-runtime-dev/2 <http://mail.openjdk.java.net/
>> pipermail/hotspot-runtime-dev/2>
>>
>> 016-October/021675.html
>>
>> I am still rather confused by both existing code and
>> proposed change - the
>> code just seems wrong if passed in a zero size.
>> Passing in zero is a bug
>> AFAICS and if buggy code passes in zero then how do
>> we know what it will do
>> with the returned value from os::realloc ?
>>
>>
>>
>> Currently, in the OpenJDK version, we have the following
>> behavior:
>>
>> 1 ASSERT
>> a os::malloc
>> size==0: return valid pointer to 1 byte array
>> b os::realloc
>> ptr != NULL && size==0: return NULL
>>
>> 2 !ASSERT
>> a os::malloc
>> size==0: return valid pointer to 1 byte array
>> b os::realloc
>> ptr != NULL && size==0: whatever the native realloc does,
>> which may be
>> returning NULL or returning a valid allocation.
>>
>> This is inconsistent.
>>
>> A) 1a and 1b, although arguably similar cases, behave
>> differently.
>> B) 1b and 2b may behave differently, depending on the
>> behaviour of the
>> native CRT calls. Which is IMHO worse than case (A). You
>> do not want
>> different behaviour between ASSERT and !ASSERT.
>>
>> Additionally, for (B) it is not good that we delegate
>> cornercase behavior
>> to the native CRT, because that may lead to os::realloc()
>> to have different
>> behavior between platforms. On one platform,
>> ::realloc(ptr,0) may return
>> NULL, on another it may return a pointer. So you get
>> platform dependent
>> errors.
>>
>> My patch proposes to change the behavior like this:
>>
>> 1 ASSERT
>> a os::malloc
>> size==0: return valid pointer to 1 byte array
>> b os::realloc
>> ptr != NULL && size==0: return valid pointer to 1 byte
>> array
>>
>> 2 !ASSERT
>> a os::malloc
>> size==0: return valid pointer to 1 byte array
>> b os::realloc
>> ptr != NULL && size==0: return valid pointer to 1 byte
>> array
>>
>> This is more consistent.
>>
>> -----------
>>
>> Aside from the consistency issue:
>>
>> I think if someone passes 0 as resize size, this may mean
>> he calculated the
>> resize size wrong. Note that this does not necessarily
>> mean a fatal error,
>> if he has the correct notion of buffer size - 0 - and
>> does not overwrite
>> anything, nothing bad will happen.
>>
>> In this situation you have three choices
>>
>> 1) return NULL. Caller will either incorrectly access the
>> pointer and
>> crash, or test the pointer and all cases I saw then
>> assume OOM. Also bad.
>>
>> 2) assert. This just seems plain wrong. Inconsistent with
>> CRT behavior, and
>> what do you do in release code?
>>
>> 3) return a 1 byte allocation. That is what I propose.
>> This may in the
>> worst case lead to memory leaks; but only if caller
>> expected the behaviour
>> of os::realloc(ptr, 0) to be "free ptr and return NULL",
>> which I have seen
>> nowhere. If caller is oblivious to giving us size 0, he
>> will be freeing the
>> memory later.
>>
>> (Note that you could easily prevent memory leaks by just
>> returning a
>> special global static sentinel pointer for size==0, but
>> you have to be sure
>> that all os::realloc calls are matched by os::free calls.)
>>
>> I see your point about treating size==0 as an error. But
>> I am not a big
>> fan, for the mentioned reasons, and would prefer
>> os::realloc() to behave
>> like native realloc.
>>
>> But if we follow your suggestion and make size==0 an
>> assertable offense, we
>> should do so too for malloc(0).
>>
>> Kind Regards, Thomas
>>
>>
>>
>> David
>>
>>
>> The webrev is unchanged from the proposal for jdk9,
>> just rebased to
>>
>> jdk10/hs.
>>
>> @Chris, I decided to not follow your suggestion
>> to move the comparison
>> into the #ifndef assert. You are right, this is
>> redundant with the check
>> inside os::malloc(), but as you said, checking at
>> the entry of
>> os::realloc() makes it clearer.
>>
>> Thank you!
>>
>> Kind Regards, Thomas
>>
>>
>>
>>
>>
>>
>>
More information about the hotspot-runtime-dev
mailing list