RFR (M): 8077144: Concurrent mark initialization takes too long
Thomas Schatzl
thomas.schatzl at oracle.com
Mon Mar 7 23:33:05 UTC 2016
Hi Kim,
thanks for your comments.
On Mon, 2016-03-07 at 17:21 -0500, Kim Barrett wrote:
> > On Mar 4, 2016, at 4:52 AM, Thomas Schatzl <
> > thomas.schatzl at oracle.com> wrote:
> >
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8077144
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8077144/webrev/
> > Testing:
> > jprt, vm.gc testlist
>
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.hpp
> 381 void allocate_large_bitmap(BitMap* bitmap, BitMap::idx_t const
> size_in_bits);
>
> const for the size_in_bits here is superfluous.
Done.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 600 // First-time allocation of the bitmaps ensures that they are
> zeroed out. There
> 601 // is no need to clear them and the liveness count data
> manually.
>
> This comment seems misplaced, or even unnecessary.
The intent of this comment has been to avoid questions why the data
structures are not initialized here. Removing it.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2145 SizeTFlagSetting fs(ArrayAllocatorMallocLimit,
> new_array_allocator_malloc_limit);
>
> --- TBD: Are we *sure* we're still single-threaded here? Is there a
> risk of futzing with ArrayAllocatorMallocLimit affecting some
> other thread? Don't bother figuring this out until the need for
> the modification is discussed.
>
> Wouldn't it be better to give ArrayAllocator a new API that
> specifically requests mmap or adjusts the use-malloc/mmap boundary,
> rather than this highly roundabout mechanism with potentially
> non-local effect? Perhaps a constructor argument that changes the
> behavior of should_use_malloc.
>
Yes, I was thinking about this right today, particularly because I need
edthat functionality outside of G1ConcurrentMark today anyway.
My idea has been to add something like this to the alloc/realloc
methods, but I am fine with the constructor.
I am sure that at that time we are only single-threaded btw.
> A separate issue is some of the uses of FlagSetting and friends have
> nothing to do with "flags" in the globals.hpp sense, and seem
> semantically suspect. Maybe an RFE here?
Filed JDK-8151412, I hope this is what you meant.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2144 size_t const new_array_allocator_malloc_limit =
> MIN2((size_t)os::vm_allocation_granularity() * 100,
> ArrayAllocatorMallocLimit);
>
> I wonder why vm_allocation_granularity returns int rather than an
> unsigned type? Doing anything about that is out of scope for this
> change, but unfortunately leads toward the cast. Maybe an RFE here?
Filed JDK-8151413.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2147 ArrayAllocator<BitMap::bm_word_t, mtGC> allocator(false);
>
> The use of ArrayAllocator with false free_in_destructor introduces
> cleanup problems that are not being handled. All the new (in this
> change set) and existing uses of that feature appear to be discarding
> the information needed to free the allocated memory when done with
> it, because they aren't capturing the distinction between malloc'ed
> vs. mmap'ed.
>
> And that feature is being used and then handing the memory off to a
> BitMap, which could itself attempt to free the memory (if resize were
> to be called, which hopefully never happens in the cases at hand)
It does not.
Originally I had an extension of the BitMap::resize() method that
automatically handled this. However this introduced more complexity
there, and potential regressions in behavior, so I created this helper
method in G1ConcurrentMark.
I guess I will see to extending BitMap after all.
> and have no clue about how to do so.
>
> A possible way to deal with this would be to have an ArrayAllocator
> support swap (even better would be C++11 move, but...) and swap the
> external allocator with the one in the BitMap, so that when the
> BitMap is destroyed it will properly deal with the deallocation.
>
> This is a pre-existing issue, with the current change introducing a
> new occurrence of the problem.
This is a known issue.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2166 allocate_large_bitmap(&_region_bm, (BitMap::idx_t)(_g1h
> ->max_regions()));
>
> Why cast the max_regions value? I think currently there is no problem
> (looks like uint => size_t). We might someday turn on things like
> -Wconversion, in which case this cast would simply mask a (potential)
> problem if a problem exists then.
Fixed.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2167 allocate_large_bitmap(&_card_bm, align_size_up_(_g1h
> ->reserved_region().byte_size(), CardTableModRefBS::card_size) /
> CardTableModRefBS::card_size);
>
> Abuse of align_size_up_ macro, which is supposed to be for use in
> places where a constant expression is required, with align_size_up
> function being preferred. Unfortunately, then one runs afoul of the
> not very useful signature for align_size_up. We could spend a lot of
> time discussing shortcomings in that vicinity. Maybe an RFE here?
You mean, in the vicinity of align_size_up? I agree. What kind of RFE
do you suggest here? Do you mind filing an RFE here because apparently
you have lots of issues in mind...
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2169 _count_card_bitmaps = NEW_C_HEAP_ARRAY(BitMap,
> _max_worker_id, mtGC);
>
> Extra space before _max_worker_id.
Fixed.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2176 _count_marked_bytes = Padded2DArray<size_t,
> mtGC>::create_unfreeable(_max_worker_id, _g1h->max_regions(), false);
>
> The call to create_unfreeable appears to be incorrect. The third
> argument is a size_t* to the allocation_size. This call should be
> implicitly converting false to a null pointer, and leaving the new
> force_pretouch argument with its default true value.
>
> This should have been caught by gcc -Wconversion-null, which is
> enabled by default and I don't see any disabling of it in our build
> system. I'm puzzled as to how this builds?
>
> What am I missing here?
You were looking at an old version of the change where this has not
been fixed. Sorry. The code in the local version looks as follows:
size_t dummy;
_count_marked_bytes = Padded2DArray<size_t,
mtGC>::create_unfreeable(_max_worker_id, _g1h->max_regions(), &dummy,
false);
This is the only difference to the version in the webrev.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/gc/g1/g1ConcurrentMark.cpp
> 2193 os::pretouch_memory(count_marked, count_marked +
> (size_t)G1CollectedHeap::heap()->max_regions() * sizeof(size_t));
>
> Another unnecessary cast of max_regions.
Fixed.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/memory/allocation.hpp
> 748 // Indicates whether the last (re-)allocation used the C-heap
> malloc or virtual space.
> 749 bool use_malloc() const { return _use_malloc; }
>
> I think used_malloc would be a better name for a query; use_malloc
> seems like a request/command.
Fixed.
> ---------------------------------------------------------------------
> ---------
> src/share/vm/utilities/bitMap.cpp
> 66 os::pretouch_memory((char*)word_addr(0),
> (char*)word_addr(size()));
>
> Regarding the casts, here and in new G1PretouchInternalBitmapsTask,
> why does pretouch_memory want char* rather than void*? Maybe an RFE
> here? There are a few existing calls that could be cleaned up too.
Potentially because at the time the call had been introduced, char*
fit. Filed JDK-8151414.
> ---------------------------------------------------------------------
> ---------
> test/gc/g1/Test2GbHeap.java
> 28 * Skip test on 32 bit Windows: it typically does not support
> the many and large virtual memory reservations needed.
> 29 * @requires (vm.gc == "G1" | vm.gc == "null") &
> ((sun.arch.data.model != "32") | (os.family != "windows"))
>
> jtreg supports multiple @requires options, implicitly and'ing them
> together (*). Splitting the @requires here would make it more
> readable.
>
> And could the second clause be written as the following?
>
> !((sun.arg.data.model = "32") & (os.family = "windows"))
Fixed.
Thanks again for your review. I will prepare a new webrev later.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list