Quick question about Metaspace verifications (8185034)

Thomas Stüfe thomas.stuefe at gmail.com
Thu Mar 8 08:42:24 UTC 2018


Hi Coleen,

On Thu, Mar 8, 2018 at 2:48 AM, <coleen.phillimore at oracle.com> wrote:

>
> Hi Thomas,
>
> The bug was confidential because it was for our then-closed arm port and
> for a closed test.   The test was timing out in nightly testing.  It still
> has paths in the description so I don't think I can open the bug.   And
> there isn't very much usable data except:
>
> This looks like an issue that I've come across while developing tests as
> well.
> The problem is that this kind of workload causes the vm to spend a lot of
> time verifying all the metaspace chunks:
> % Run time using "perf record" on Linux.
>  49.48% java libjvm.so [.] ChunkManager::locked_verify_free_chunks_total()
>  47.69% java libjvm.so [.] ChunkManager::locked_verify_free_chunks_count()
> I'll try to figure out the call path that causes
> ChunkManager::locked_verify to be called.
>
> I think fastdebug should be careful about slow metaspace verification,
> because we test with fastdebug.
>
>
Okay.

Do you also run this test for my metaspace chunk merging change? There, I
added a number of verifications; I was careful not to add them into hot
paths. But still, if my new verifications add too much runtime, I should
tone them down. Thankfully that is easy.

What I often do in these situations is a macro like this:

#define DO_EVERY_NTH_TIME(n, code) \
{ static int counter = 0; if ((counter % n) == 0) { code } }

and use it like this:

DO_EVERY_NTH_TIME (100, cm->locked_verify())

that way one can fine tune the number of verifications if they turn out to
be too expensive in the debug build without completely switching them off.
What do you think, would this be acceptable?


Note that I currently work on a patch, roughly based on our previous
discussion, which
- adds counters for all interesting things one might want to know to avoid
having to walk lists and CLDG to get statistic values
- removes all the individual get_xxx_slow() functions and replaces it with
a verify functions which verifies all counters at once. This then needs
only one walk instead of many.

These changes will make verifications in fastdebug faster, since it will
reduce the number of times we walk structures.



> On 3/7/18 1:28 AM, Thomas Stüfe wrote:
>
>>   Hi all,
>>
>> I am currently working on 8185034 to clean up the metaspace coding a bit,
>> partially to work off the complexity debt incurred by 8198423.
>>
>> some quick questions, maybe someone knows:
>>
>> - We have the metaspace_slow_verify which guards a number of expensive
>> verifications (I feel a bit arbitrarily). When looking at the history, I
>> see a change:
>>
>> changeset:   14474:4154f1817a75 14382:5e86124b835d
>> user:        mgerdin
>> date:        Fri Nov 09 00:38:31 2012 +0100
>> summary:     7200229: NPG: possible performance issue exposed by
>> closed/runtime/6559877/Test6559877.java
>> Summary: Reduce the amount of calls to ChunkManager verification code
>> Reviewed-by: jmasa, coleenp
>>
>> 7200229 is a closed bug report, I cannot see it. Any details on this? More
>> important, should I keep guarding expensive verifications not only with
>> ASSERT but also with this always-off switch? And if yes, what counts as
>> expensive? To me, this defies a bit the point of having a debug build.
>>
>> - Speaking of verifications, I see often verification methods (e.g.
>> ChunkManager::verify()) which are not guarded by ASSERT - is there a point
>> to have those functions in release builds? All they do is checking and
>> asserting, which become no-ops.
>>
> There is some GC verification that is done with
> -XX:+Verify{Before,During,After}GC that is exposed in the product and is
> frequently used for diagnosing problems.  I don't know if metaspace
> verification is part of this without looking.


Yes. Universe::verify() invokes MetaspaceAux::verify_free_chunks() which in
the end does a ChunkManager::locked_verify_free_chunks_total() . However,
that function only asserts, it does not guarantee, so it does do nothing in
PRODUCT.


>    If it is, we should keep it available in PRODUCT build.  If not, yes,
> please make it #ifndef PRODUCT.


>
This leads to a new question: I thought we want to remove optimized builds?
If yes, is there a point to #ifndef PRODUCT instead of #ifdef ASSERT?

The more I think of this, the more I think that one of two things makes
sense:
1 make all verifications, including their prototypes and call sites, ASSERT
only.
2 make verifications also available in PRODUCT builds. That means switching
assert to guarantee. But keep all call sites ASSERT only, all save for
those triggered by external switches like  -XX:+Verify{Before,During,Afte
r}GC

Right now, we essentially do (1), since all checks are done with asserts,
even if they are invoked in PRODUCT. In particular,
-XX:+Verify{Before,During,After}GC only does something with the metaspace
in fastdebug builds.

I have a preference for (1). This is what the code does now, so we may as
well make it clear by making all verify() functions DEBUG_ONLY. If we later
think we need verifications in PRODUCT too, e.g. for
-XX:+Verify{Before,During,After}GC, we can add it in a later issue.

What do you think?

Best Regards, Thomas

Coleen
>
>>
>> Thank you,
>>
>> Thomas
>>
>
>


More information about the hotspot-runtime-dev mailing list