RFR(s): 8170520: Make Metaspace ChunkManager counters non-atomic
Mikael Gerdin
mikael.gerdin at oracle.com
Wed Nov 30 13:36:40 UTC 2016
Hi Thomas
On 2016-11-30 14:30, Thomas Stüfe wrote:
>
> On Wed, Nov 30, 2016 at 2:22 PM, Mikael Gerdin
> <mikael.gerdin at oracle.com <mailto:mikael.gerdin at oracle.com>> wrote:
>
> Hi Thomas,
>
> On 2016-11-30 14:01, Thomas Stüfe wrote:
>
> Hi Mikael,
>
> On Wed, Nov 30, 2016 at 1:41 PM, Mikael Gerdin
> <mikael.gerdin at oracle.com <mailto:mikael.gerdin at oracle.com>
> <mailto:mikael.gerdin at oracle.com
> <mailto:mikael.gerdin at oracle.com>>> wrote:
>
> Hi Thomas,
>
> On 2016-11-30 08:43, Thomas Stüfe wrote:
>
> Hi all,
>
> please take a look at this small fix. (This is one of the
> smaller fixes in
> preparation for a prototype for JDK-8166690, I try to
> move the
> smaller
> stuff out of the way first).
>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8170520
> <https://bugs.openjdk.java.net/browse/JDK-8170520>
> <https://bugs.openjdk.java.net/browse/JDK-8170520
> <https://bugs.openjdk.java.net/browse/JDK-8170520>>
>
> webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8170520-Make-
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8170520-Make->
>
> <http://cr.openjdk.java.net/~stuefe/webrevs/8170520-Make-
> <http://cr.openjdk.java.net/%7Estuefe/webrevs/8170520-Make->>
>
> Metaspace-ChunkManager-Counters-NonAtomic/webrev.00/webrev/
>
>
> I threw some testing at your change and it seems like you've
> forgotten about calling inc_free_chunks_total for the
> humongous
> chunks in ~SpaceManager. I'm going to run more tests on a
> patched
> build where I just hack in a call to inc_free_chunks_total
> for reach
> humongous chunk.
>
>
> Ouch, I think you are a right, I forgot about that case :(
> What tests
> did you run?
>
>
> I ran an internal test suite, unfortunately.
> It basically runs the JCK, loading each test case in a new class
> loader, thereby triggering insane amounts of class unloading and
> metaspaces to be created.
>
>
> Ah, that explains it. I tried jtreg runtime/Metaspace and
> -XX:+ExecuteInternalVMTests, both ran fine. But we also run nightly
> jck tests, so I'd like to check why I did not see an error. What was
> the exact error you had, did the assert
> in ChunkManager::locked_verify_free_chunks_total() trigger?
I hit the assert in dec_free_chunks_total
# Internal Error (/opt/jprt/T/P1/094027.mgerdin/s/hotspot/src/share/vm/memory/metaspace.cpp:117), pid=7532, tid=7560
# assert(_free_chunks_count > 0 && _free_chunks_total >= v) failed: About to go negative
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x1448fe7] VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned int)+0x137
V [libjvm.so+0x1449ce0] VMError::report_and_die(Thread*, char const*, int, char const*, char const*, char*)+0x30
V [libjvm.so+0x9266a0] report_vm_error(char const*, int, char const*, char const*, ...)+0x60
V [libjvm.so+0x10357ac] ChunkManager::free_chunks_get(unsigned int)+0xcc
V [libjvm.so+0x1035953] ChunkManager::chunk_freelist_allocate(unsigned int)+0x23
V [libjvm.so+0x10376cd] SpaceManager::get_new_chunk(unsigned int)+0x2d
V [libjvm.so+0x103910b] SpaceManager::grow_and_allocate(unsigned int)+0xeb
V [libjvm.so+0x1039439] SpaceManager::allocate_work(unsigned int)+0x1d9
V [libjvm.so+0x10395fc] SpaceManager::allocate(unsigned int)+0x11c
V [libjvm.so+0x1039910] Metaspace::allocate(ClassLoaderData*, unsigned int, bool, MetaspaceObj::Type, Thread*)+0x100
V [libjvm.so+0x1063a51] MethodData::allocate(ClassLoaderData*, methodHandle const&, Thread*)+0x41
V [libjvm.so+0x104edf2] Method::build_interpreter_method_data(methodHandle const&, Thread*)+0x82
V [libjvm.so+0x7728d4] ciMethod::ensure_method_data(methodHandle)+0x2b4
V [libjvm.so+0x772a33] ciMethod::ensure_method_data()+0xe3
V [libjvm.so+0x5d958c] Compilation::compile_java_method()+0x73c
V [libjvm.so+0x5da10d] Compilation::compile_method()+0x23d
V [libjvm.so+0x5dac84] Compilation::Compilation(AbstractCompiler*, ciEnv*, ciMethod*, int, BufferBlob*, DirectiveSet*)+0x3d4
V [libjvm.so+0x5dccfd] Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x13d
V [libjvm.so+0x8b1f93] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x5c3
V [libjvm.so+0x8b309f] CompileBroker::compiler_thread_loop()+0x43f
V [libjvm.so+0x139e168] compiler_thread_entry(JavaThread*, Thread*)+0x28
V [libjvm.so+0x13b0cf0] JavaThread::thread_main_inner()+0x220
V [libjvm.so+0x13b0f8a] JavaThread::run()+0x1ca
V [libjvm.so+0x1121694] thread_native_entry(Thread*)+0x124
C [libpthread.so.0+0x6b5c] start_thread+0xcc
C [libc.so.6+0xf7a5e] clone+0x5e
_free_chunks_total was about to go negative but _free_chunks_count was 63.
>
> About the cleanup for ChunkManager::return_chunks(), should I do this
> as part of this patch or as an extra patch?
I think it would be better to do that as a separate patch.
Thanks
/Mikael
>
>
> It would be great if you could have a look at unifying the
> ChunkManager::return_chunks API for standard and humongous
> chunks so
> that we wouldn't need the special casing for this in
> ~SpaceManager
> That way we could also make humongous_dictionary() private to
> ChunkManager which would improve encapsulation.
>
>
> Yes, this makes sense, I will do that. Thank you for testing!
>
>
> The test passed with my hack to fix the humongous chunks issue but
> I'll continue running further tests and report back if I see any
> further failures.
>
>
> Thanks!
> ..Thomas
>
>
>
> /Mikael
>
>
>
> ..Thomas
>
>
> Thanks
> /Mikael
>
>
>
> Small discussion here:
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-October/021709.html
> <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-October/021709.html>
>
> <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-October/021709.html
> <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-October/021709.html>>
>
> Background: ChunkManager keeps counters
> (_free_chunks_count and
> free_chunks_total). These counters are updated using
> Atomics.
> Because this
> is expensive, code goes thru some lengths to
> accumulate updates,
> which in
> turn makes the code more complicated and error prone as
> necessary. But
> updating the counters atomically is not needed,
> because updates
> happen
> under lock protection anyway.
>
> The proposed fix makes updates non-atomic and moved
> the call to
> inc_free_chunks_total() into
> ChunkManager::return_chunks() -
> close to where
> the chunk is actually returned to the freelist, so the
> time
> window where
> the counters are invalid is very small now.
>
> It also fixes an assert and makes inc_free_chunks_total()
> private because
> now it can be private.
>
> Note that I do not intend to push this into 9, so this
> is for
> the upcoming
> 10 repository. I would like to get some reviews
> nevertheless, so
> this small
> change can be pushed quickly once the 10 repo exists.
>
> Thanks!
>
> Best Regards, Thomas
>
>
>
More information about the hotspot-runtime-dev
mailing list