RFR(M): 8231460: Performance issue (CodeHeap) with large free blocks

Schmidt, Lutz lutz.schmidt at sap.com
Thu Oct 17 15:36:40 UTC 2019


Hi Andrew, 

Thank you for looking at the change and, of course, for your thoughts and comments. 

Let me cover the easy part (heap.cpp:383 - 419) first:
======================================================
I fully agree with your suggestions. Your wording is much clearer and easier to understand. Your suggestions will be incorporated into the next webrev iteration; I will not create a separate webrev just for that. 

About the particular test:
==========================
Yes, the test is completely synthetic in that it frees thousands of CodeHeap blocks in a row. Such a situation will most likely never occur in a real-world workload. I dug into this only because the test timed out on some of our machines. 

I'm not sure if we need another synthetic test to model the real world. It would be possible as well to add the timing to some real-world test(s) and analyze what's happening there. Would JBB2005 or JBB2013 be close enough to the real world?

About the risk of segment map fragmentation:
============================================
Yes, there will be some fragmentation. I have never observed fragmentation become excessive, however, and there is a reason for that: each time a HeapBlock allocation is satisfied from the FreeBlock list, the corresponding segment map interval is re-initialized. Simply put: block allocation inherently provides self-healing for the segment map fragmentation caused by block deallocation. Knowing that, your "horror scenario" (all segment map entries converging to one) is not a factor.
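To make the mechanism easier to follow, here is a small self-contained model of a segment map. This is my own simplified sketch for illustration only, NOT the actual HotSpot code from heap.cpp/heap.hpp; the names (SegMap, mark_block, fuse, block_start) and the exact cap value are my assumptions. It shows the backward-distance encoding, the capped hop chains, the extra hop that fusing two blocks leaves behind, and the self-healing effect of re-marking the interval on the next allocation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of a CodeHeap segment map (illustration only).
// Each used segment's map entry holds the backward distance (in
// segment-size units) towards its block start, capped at kCap, so
// long blocks form a chain of capped "hops".
struct SegMap {
  static constexpr uint8_t kCap  = 0xFE;  // largest distance one byte encodes
  static constexpr uint8_t kFree = 0xFF;  // sentinel for unused segments
  std::vector<uint8_t> map;

  explicit SegMap(size_t nsegs) : map(nsegs, kFree) {}

  // (Re-)mark [start, start+len) as one block with minimal hop chains.
  // This runs on every allocation satisfied from the FreeBlock list,
  // which is what "self-heals" fragmentation left by deallocation.
  void mark_block(size_t start, size_t len) {
    for (size_t j = 0; j < len; j++) {
      map[start + j] = (uint8_t)(j < kCap ? j : kCap);
    }
  }

  // Fuse the block starting at 'second' onto the block starting at
  // 'first': only the boundary entry is patched, so lookups in the
  // second half now need extra hops -- this is the fragmentation.
  void fuse(size_t first, size_t second) {
    size_t d = second - first;
    map[second] = (uint8_t)(d < kCap ? d : kCap);
  }

  // Map any segment index inside a used block to its block start by
  // walking backwards, one hop per map entry.
  size_t block_start(size_t i, size_t* hops = nullptr) const {
    size_t n = 0;
    while (map[i] != 0) {
      i -= map[i];
      n++;
    }
    if (hops != nullptr) *hops = n;
    return i;
  }
};
```

In this model a fuse merely patches the boundary entry, so every lookup in the second half pays extra hops until the next mark_block() over the interval restores the minimal chain.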

If concerns remain, defragmenting the segment map is of course possible. I do not expect it to be extremely expensive. The expensive part would be tracking fragmentation and deciding when to defragment. I do have an idea for that. Let me try it out...
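To make the cost argument concrete, here is one conceivable approach, sketched purely hypothetically (it is not necessarily what I will implement, and all names are mine): re-marking a block's interval is a single linear pass, and a cheap trigger falls out of the lookup itself, because the observed hop count can be compared against the minimum possible for that offset:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: the map is a byte array where each used entry
// holds the backward distance to its block start, capped at kCap.
static constexpr uint8_t kCap = 0xFE;

// Defragmentation of one block is just a linear re-mark of its
// interval with minimal hop chains.
void remark_interval(std::vector<uint8_t>& map, size_t start, size_t len) {
  for (size_t j = 0; j < len; j++) {
    map[start + j] = (uint8_t)(j < kCap ? j : kCap);
  }
}

// Hops actually taken by a lookup starting at segment i.
size_t lookup_hops(const std::vector<uint8_t>& map, size_t i) {
  size_t hops = 0;
  while (map[i] != 0) {
    i -= map[i];
    hops++;
  }
  return hops;
}

// Minimum hops possible for a given offset from the block start:
// ceil(offset / kCap). A lookup exceeding this signals fragmentation
// and could trigger a re-mark of the affected interval.
size_t min_hops(size_t offset_in_block) {
  return (offset_in_block + kCap - 1) / kCap;
}
```

The point of the sketch: tracking is nearly free (a comparison per lookup), and the repair is a bounded linear pass over one block's interval, which supports the expectation that defragmentation itself would not be extremely expensive.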

Thanks,
Lutz


On 17.10.19, 13:19, "Andrew Dinn" <adinn at redhat.com> wrote:

    Hi Lutz,
    
    
    On 17/10/2019 08:28, Schmidt, Lutz wrote:
    
    > may I please request reviews for this fix, addressing actually two
    > current performance issues in CodeHeap management. One issue (the stupid
    > one) could be eliminated completely. The other one, in the context of
    > maintaining the code heap segment map, was tamed considerably.
    > 
    > From old to new, the following absolute times and speedup
    > factors were observed:
    > 
    >                   speedup      absolute time [Milliseconds]
    >                                  old version  new version
    > ppc  fastdebug      4,700         69,593.484       14.833  (30,000 calls)
    > ppc  release        4,600         70,069.517       15.215  (30,000 calls)
    > s390 fastdebug      3,500          7,778.500        2.220  (40,000 calls)
    > s390 release        6,700          6,935.371        1.026  (40,000 calls)
    > 
    > 
    > These results are far from clean room measurements. On the other
    > hand, the improvements are so extreme that they can't be just
    > measurement errors. And does it matter whether the factor is 4k or 3k?
    > 
    > The performance gain is achieved by taking advantage of properties
    > of the CodeHeap segment map, not by improved coding techniques. You can
    > find more detail in the bug description and as inline documentation in
    > the source code.
    > 
    > Bug:    https://bugs.openjdk.java.net/browse/JDK-8231460 
    > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8231460.00/ 
    > 
    > Please have a look.
    
    This is an interesting optimization and I think it may well be worth
    having. The test results look very impressive. However, I don't
    suppose the test bears a strong relation to what actually happens at
    runtime. I think it would be useful to create a test that exercises the
    code cache with one (or several?) more realistic profiles for addition
    and deletion of blocks. You could synthesize something that generates a
    variety of different-sized methods and also does periodic frees of some
    subset of the live methods. Alternatively, you could maybe modify the
    JVM to produce some traces which can be used to guide such a test.
    
    The one concern I have about the optimization is that, over time,
    splitting and fusing of blocks is going to lead to fragmentation in the
    segment maps. Am I correct to assume that, given enough split and fuse
    operations (for varying ranges of block sizes), eventually all the
    live entries in segment maps will converge to 1? (i.e. the hop count
    will tend towards the maximum possible).
    
    If that is so then this will add a gradually increasing penalty to
    traversals from a code address to the code header block. Better testing
    might help to quantify how 'gradual' this is and also how much of a
    penalty the extra traversal cost is in real deployments. If either/both
    of those raises concerns then is there some way you could track and,
    where appropriate, reverse this fragmentation by fully reinitialising a
    merged map to remove an excess of embedded hops?
    
    I have only one code comment (the rest is admirably clear):
    
    heap.cpp:383 - 419
    
    I think the extra detail in this comment is worth having. However, the
    comment could benefit from a few edits:
    
    @403-404:
     *  - The contents of the segment map elements has to be interpreted
     *   as unsigned integer."
    
    hmm?
    
     *  - elements of the segment map (byte) array are interpreted
     *    as unsigned integers"
    
    @ 405-407:
     *  - Each segment map element contain the distance (in segment size units)
     *    of the related segment to the start of the block.
     *    be subtracted from the current index to get closer to the start.
    
    I assume that last line got pasted in from below by mistake. I'd reword
    the preceding two lines as follows:
    
     *  - Element values normally identify an offset backwards (in segment
     *    size units) from the associated segment towards the start of
     *    the block.
    
    regards,
    
    
    Andrew Dinn
    -----------
    Senior Principal Software Engineer
    Red Hat UK Ltd
    Registered in England and Wales under Company Registration No. 03798903
    Directors: Michael Cunningham, Michael ("Mike") O'Neill
    
