[13] RFR(L) 8220623: [JVMCI] Update JVMCI to support JVMCI based Compiler compiled into shared library
Stefan Karlsson
stefan.karlsson at oracle.com
Wed Apr 3 06:35:14 UTC 2019
On 2019-04-02 22:41, Vladimir Kozlov wrote:
> I ran Kitchensink with G1 and -Xmx8g. I observed that Remark pause times
> are not consistent even without Graal.
> To see effect I added time spent in JVMCI::do_unloading() to GC log (see
> below [3]). The result is < 1ms - it is less than 1% of a pause time.
Kitchensink isn't really a benchmark, but a stress test. I sent you a
private mail how to run these changes through our internal performance
test setup.
>
> It will have even less effect since I moved JVMCI::do_unloading() from
> serial path to parallel worker thread as Stefan suggested.
>
> Stefan, are you satisfied with these changes now?
Yes, the clean-ups look good. Thanks for cleaning this up.
Kim had some extra comments about a few more places where JVMCI_ONLY
could be used.
I also agree with him that JVMCI::oops_do should not be placed in
JNIHandles::oops_do. I think you should put it where you put the
AOTLoader::oops_do calls.
Thanks,
StefanK
>
> Here is latest delta update which includes previous [1] delta and
> - use CompilerThreadStackSize * 2 for libgraal instead of exact value,
> - removed HandleMark added for debugging (reverted changes in
> jvmtiImpl.cpp),
> - added recent jvmci-8 changes to fix registration of native methods in
> libgraal (jvmciCompilerToVM.cpp)
>
> http://cr.openjdk.java.net/~kvn/8220623/webrev_delta.05/
>
> Thanks,
> Vladimir
>
> [1] http://cr.openjdk.java.net/~kvn/8220623/webrev_delta.04/
> [2] Original webrev http://cr.openjdk.java.net/~kvn/8220623/webrev.03/
> [3] Pauses times from Kitchensink (0.0ms means there were no unloaded
> classes, 'NNN alive' shows how many metadata references were processed):
>
> [1.083s][1554229160638ms][info ][gc,start ] GC(2) Pause Remark
> [1.085s][1554229160639ms][info ][gc ] GC(2)
> JVMCI::do_unloading(): 0 alive 0.000ms
> [1.099s][1554229160654ms][info ][gc ] GC(2) Pause Remark
> 28M->28M(108M) 16.123ms
>
> [3.097s][1554229162651ms][info ][gc,start ] GC(12) Pause Remark
> [3.114s][1554229162668ms][info ][gc ] GC(12)
> JVMCI::do_unloading(): 3471 alive 0.164ms
> [3.148s][1554229162702ms][info ][gc ] GC(12) Pause Remark
> 215M->213M(720M) 51.103ms
>
> [455.111s][1554229614666ms][info ][gc,phases,start] GC(1095) Phase 1:
> Mark live objects
> [455.455s][1554229615010ms][info ][gc ] GC(1095)
> JVMCI::do_unloading(): 4048 alive 0.821ms
> [455.456s][1554229615010ms][info ][gc,phases ] GC(1095) Phase 1:
> Mark live objects 344.107ms
>
> [848.932s][1554230008486ms][info ][gc,phases,start] GC(1860) Phase 1:
> Mark live objects
> [849.248s][1554230008803ms][info ][gc ] GC(1860)
> JVMCI::do_unloading(): 3266 alive 0.470ms
> [849.249s][1554230008803ms][info ][gc,phases ] GC(1860) Phase 1:
> Mark live objects 316.527ms
>
> [1163.778s][1554230323332ms][info ][gc,start ] GC(2627) Pause Remark
> [1163.932s][1554230323486ms][info ][gc ] GC(2627)
> JVMCI::do_unloading(): 3474 alive 0.642ms
> [1163.941s][1554230323496ms][info ][gc ] GC(2627) Pause
> Remark 2502M->2486M(4248M) 163.296ms
>
> [1242.587s][1554230402141ms][info ][gc,phases,start] GC(2734) Phase 1:
> Mark live objects
> [1242.899s][1554230402453ms][info ][gc ] GC(2734)
> JVMCI::do_unloading(): 3449 alive 0.570ms
> [1242.899s][1554230402453ms][info ][gc,phases ] GC(2734) Phase 1:
> Mark live objects 311.719ms
>
> [1364.164s][1554230523718ms][info ][gc,phases,start] GC(3023) Phase 1:
> Mark live objects
> [1364.613s][1554230524167ms][info ][gc ] GC(3023)
> JVMCI::do_unloading(): 3449 alive 0.000ms
> [1364.613s][1554230524167ms][info ][gc,phases ] GC(3023) Phase 1:
> Mark live objects 448.495ms
>
> [1425.222s][1554230584776ms][info ][gc,phases,start] GC(3151) Phase 1:
> Mark live objects
> [1425.587s][1554230585142ms][info ][gc ] GC(3151)
> JVMCI::do_unloading(): 3491 alive 0.882ms
> [1425.587s][1554230585142ms][info ][gc,phases ] GC(3151) Phase 1:
> Mark live objects 365.403ms
>
> [1456.401s][1554230615955ms][info ][gc,phases,start] GC(3223) Phase 1:
> Mark live objects
> [1456.769s][1554230616324ms][info ][gc ] GC(3223)
> JVMCI::do_unloading(): 3478 alive 0.616ms
> [1456.769s][1554230616324ms][info ][gc,phases ] GC(3223) Phase 1:
> Mark live objects 368.643ms
>
> [1806.139s][1554230965694ms][info ][gc,start ] GC(4014) Pause
> Remark
> [1806.161s][1554230965716ms][info ][gc ] GC(4014)
> JVMCI::do_unloading(): 3478 alive 0.000ms
> [1806.163s][1554230965717ms][info ][gc ] GC(4014) Pause
> Remark 1305M->1177M(2772M) 23.190ms
>
>
>
> On 4/1/19 12:34 AM, Stefan Karlsson wrote:
>> On 2019-03-29 17:55, Vladimir Kozlov wrote:
>>> Stefan,
>>>
>>> Do you have a test (and flags) which can allow me to measure effect
>>> of this code on G1 remark pause?
>>
>>
>> -Xlog:gc prints the remark times:
>> [4,296s][info][gc ] GC(89) Pause Remark 4M->4M(28M) 36,412ms
>>
>> StefanK
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 3/29/19 12:36 AM, Stefan Karlsson wrote:
>>>> On 2019-03-29 03:07, Vladimir Kozlov wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> I collected some data on MetadataHandleBlock.
>>>>>
>>>>> First, do_unloading() code is executed only when
>>>>> class_unloading_occurred is 'true' - it is rare case. It should not
>>>>> affect normal G1 remark pause.
>>>>
>>>> It's only rare for applications that don't do dynamic class loading
>>>> and unloading. The applications that do, will be affected.
>>>>
>>>>>
>>>>> Second, I run a test with -Xcomp. I got about 10,000 compilations
>>>>> by Graal and next data at the end of execution:
>>>>>
>>>>> max_blocks = 232
>>>>> max_handles_per_block = 32 (since handles array has 32 elements)
>>>>> max_total_alive_values = 4631
>>>>
>>>> OK. Thanks for the info.
>>>>
>>>> StefanK
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 3/28/19 2:44 PM, Vladimir Kozlov wrote:
>>>>>> Thank you, Stefan
>>>>>>
>>>>>> On 3/28/19 12:54 PM, Stefan Karlsson wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> I started to check the GC code.
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> I see that you've added guarded includes in the middle of the
>>>>>>> include list:
>>>>>>> #include "gc/shared/strongRootsScope.hpp"
>>>>>>> #include "gc/shared/weakProcessor.hpp"
>>>>>>> + #if INCLUDE_JVMCI
>>>>>>> + #include "jvmci/jvmci.hpp"
>>>>>>> + #endif
>>>>>>> #include "oops/instanceRefKlass.hpp"
>>>>>>> #include "oops/oop.inline.hpp"
>>>>>>>
>>>>>>> The style we use is to put these conditional includes at the end
>>>>>>> of the include lists.
>>>>>>
>>>>>> okay
>>>>>>
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> Could you also change the following:
>>>>>>>
>>>>>>> + #if INCLUDE_JVMCI
>>>>>>> + // Clean JVMCI metadata handles.
>>>>>>> + JVMCI::do_unloading(is_alive_closure(), purged_class);
>>>>>>> + #endif
>>>>>>>
>>>>>>> to:
>>>>>>> + // Clean JVMCI metadata handles.
>>>>>>> + JVMCI_ONLY(JVMCI::do_unloading(is_alive_closure(),
>>>>>>> purged_class);)
>>>>>>>
>>>>>>> to get rid of some of the line noise in the GC files.
>>>>>>
>>>>>> okay
>>>>>>
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> In the future we will need version of JVMCI::do_unloading that
>>>>>>> supports concurrent cleaning for ZGC.
>>>>>>
>>>>>> Yes, we need to support concurrent cleaning in a future.
>>>>>>
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> What's the performance impact for G1 remark pause with this
>>>>>>> serial walk over the MetadataHandleBlock?
>>>>>>>
>>>>>>> 3275 void G1CollectedHeap::complete_cleaning(BoolObjectClosure*
>>>>>>> is_alive,
>>>>>>> 3276 bool
>>>>>>> class_unloading_occurred) {
>>>>>>> 3277 uint num_workers = workers()->active_workers();
>>>>>>> 3278 ParallelCleaningTask unlink_task(is_alive, num_workers,
>>>>>>> class_unloading_occurred, false);
>>>>>>> 3279 workers()->run_task(&unlink_task);
>>>>>>> 3280 #if INCLUDE_JVMCI
>>>>>>> 3281 // No parallel processing of JVMCI metadata handles for now.
>>>>>>> 3282 JVMCI::do_unloading(is_alive, class_unloading_occurred);
>>>>>>> 3283 #endif
>>>>>>> 3284 }
>>>>>>
>>>>>> There should not be impact if Graal is not used. Only cost of call
>>>>>> (which most likely is inlined in product VM) and check:
>>>>>>
>>>>>> http://hg.openjdk.java.net/metropolis/dev/file/530fc1427d02/src/hotspot/share/jvmci/jvmciRuntime.cpp#l1237
>>>>>>
>>>>>>
>>>>>> If Graal is used it should not have big impact since these
>>>>>> metadata has regular pattern (32 handles per array and array per
>>>>>> MetadataHandleBlock block which are linked in list) and not large.
>>>>>> If there will be noticeable impact - we will work on it as you
>>>>>> suggested by using ParallelCleaningTask.
>>>>>>
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> Did you consider adding it as a task for one of the worker
>>>>>>> threads to execute in ParallelCleaningTask?
>>>>>>>
>>>>>>> See how other tasks are claimed by one worker:
>>>>>>> void KlassCleaningTask::work() {
>>>>>>> ResourceMark rm;
>>>>>>>
>>>>>>> // One worker will clean the subklass/sibling klass tree.
>>>>>>> if (claim_clean_klass_tree_task()) {
>>>>>>> Klass::clean_subklass_tree();
>>>>>>> }
>>>>>>
>>>>>> These changes were ported from JDK8u based changes in
>>>>>> graal-jvmci-8 and there are no ParallelCleaningTask in JDK8.
>>>>>>
>>>>>> Your suggestion is interesting and I agree that we should
>>>>>> investigate it.
>>>>>>
>>>>>>>
>>>>>>> ========================================================================
>>>>>>>
>>>>>>> In MetadataHandleBlock::do_unloading:
>>>>>>>
>>>>>>> + if (klass->class_loader_data()->is_unloading()) {
>>>>>>> + // This needs to be marked so that it's no longer scanned
>>>>>>> + // but can't be put on the free list yet. The
>>>>>>> + // ReferenceCleaner will set this to NULL and
>>>>>>> + // put it on the free list.
>>>>>>>
>>>>>>> I couldn't find the ReferenceCleaner in the patch or in the
>>>>>>> source. Where can I find this code?
>>>>>>
>>>>>> I think it is typo (I will fix it) - it references new
>>>>>> HandleCleaner class:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~kvn/8220623/webrev.03/src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HandleCleaner.java.html
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>> On 2019-03-28 20:15, Vladimir Kozlov wrote:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220623
>>>>>>>> http://cr.openjdk.java.net/~kvn/8220623/webrev.03/
>>>>>>>>
>>>>>>>> Update JVMCI to support pre-compiled as shared library Graal.
>>>>>>>> Using aoted Graal can offers benefits including:
>>>>>>>> - fast startup
>>>>>>>> - compile time similar to native JIt compilers (C2)
>>>>>>>> - memory usage disjoint from the application Java heap
>>>>>>>> - no profile pollution of JDK code used by the application
>>>>>>>>
>>>>>>>> This is JDK13 port of JVMCI changes done in graal-jvmci-8 [1] up
>>>>>>>> to date.
>>>>>>>> Changes were collected in Metropolis repo [2] and tested there.
>>>>>>>>
>>>>>>>> Changes we reviewed by Oracle Labs (authors of JVMCI and Graal)
>>>>>>>> and our compiler group.
>>>>>>>> Changes in shared code are guarded by #if INCLUDE_JVMCI and
>>>>>>>> JVMCI flags.
>>>>>>>>
>>>>>>>> I ran tier1-tier8 (which includes HotSpot and JDK tests) and it
>>>>>>>> was clean. In this set Graal was tested only in tier3.
>>>>>>>>
>>>>>>>> And I ran all hs-tier3-graal .. hs-tier8-graal Graal tests
>>>>>>>> available in our system. Several issue were found which were
>>>>>>>> present before these changes.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://github.com/graalvm/graal-jvmci-8/commit/49ff2045fb603e35516a3a427d8023c00e1607af
>>>>>>>>
>>>>>>>> [2] http://hg.openjdk.java.net/metropolis/dev/
>>>>>>>
More information about the hotspot-dev
mailing list