RFR(S): 8223444: Improve CodeHeap Free Space Management

Tobias Hartmann tobias.hartmann at oracle.com
Mon May 20 07:29:39 UTC 2019


Hi Lutz,

Just wondering if we shouldn't fail in CodeHeap::verify() instead of just printing a warning?

Also, in heap.cpp:528, why do you need 'res'? You could just update found_block in line 578, right?

Best regards,
Tobias

On 17.05.19 21:27, Vladimir Kozlov wrote:
> Hi Lutz,
> 
> Testing passed.
> I think you need second review because changes are not trivial.
> 
> Thanks,
> Vladimir
> 
> On 5/16/19 4:17 PM, Vladimir Kozlov wrote:
>> On 5/16/19 3:16 PM, Schmidt, Lutz wrote:
>>> Hi Vladimir,
>>>
>>> I have implemented that "added safety feature". What it does:
>>>   - During reserve() and expand_by(), all newly committed heap memory is initialized with
>>> badCodeHeapNewVal.
>>>   - Whenever a FreeBlock is added to the free block list, it's memory (except for the header
>>> part) is initialized with badCodeHeapNewVal.
>>>   - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
>>>
>>> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev,
>>> based on current jdk/jdk, is here:
>>> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/
>>
>> Good. I will run testing on it.
>>
>>>
>>> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a
>>> new HeapBlock to be initialized with a bit pattern forming a negative value when accessed as int.
>>> It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest to
>>> just delete that assert.
>>
>> Yes, I agree. Such implicit dependence is bad and useless.
>> We use reset_to() in AOT where scopes should already have valid offsets:
>> http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152
>> I am not sure how we pass this assert in AOT case.
>>
>>>
>>> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same.
>>> But I did some testing/tracing. That additional free block (just one) is consumed by subsequent
>>> allocate() calls and eventually vanishes.
>>> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no
>>> longer need to be the last block before _next_segment. That is a prerequisite if we want to get
>>> rid of issues like the one described in JDK-8223770. We could just allocate a generously large
>>> block for stubs and at the end deallocate_tail them.
>>
>> Okay, it sounds good.
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Sorry for the long text.
>>> Lutz
>>>
>>> On 15.05.19, 19:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>>>
>>>      On 5/14/19 10:53 PM, Schmidt, Lutz wrote:
>>>      > Hi Vladimir,
>>>      >
>>>      > thank you for your comments. About filling CodeHeap with bad values after split_block:
>>>      >   - in deallocate_tail, the leading part must remain intact. It contains valid code.
>>>      >   - in search_freelist, one free block is split into two. There I could invalidate the
>>> contents of both parts
>>>      Thank you for explaining.
>>>      >   - If you want added safety, wouldn't it then be better to invalidate the block contents
>>> during add_to_freelist()? You could then be sure there is no executable code in a free block.
>>>      Yes, it is preferable.
>>>      An other note (after looking more on changes). You changed where freed tail goes. Originally
>>> it was added to next block
>>>      _next_segment (make it larger) and you created separate small block. Is not it create more
>>> fragmentation?
>>>      Thanks,
>>>      Vladimir
>>>      >
>>>      > Regards,
>>>      > Lutz
>>>      >
>>>      > On 14.05.19, 23:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>>>      >
>>>      >      On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
>>>      >      > Hi Vladimir,
>>>      >      >
>>>      >      > I had the same thought re atomicity. memset() is not consistent even on one
>>> platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we
>>> have byte atomicity on all supported platforms, even with memset().
>>>      >      >
>>>      >      > It's a different thing with sequence of initialization. Do we really depend on
>>> byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit
>>> byte loop. Not on x86, but on ppc with its weak memory ordering.
>>>      >
>>>      >      Okay, if it is byte copy I am fine with it.
>>>      >
>>>      >      >
>>>      >      > About segment map marking:
>>>      >      > There is a short description how the segment map works in heap.cpp, right before
>>> CodeHeap::find_start().
>>>      >      > In short: each segment map element contains an (unsigned) index which, when
>>> subtracted from that element index, addresses the segment map element where the heap block
>>> starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly
>>> formed heap block, the leading part remains valid.
>>>      >      >
>>>      >      > Segmap  before             after
>>>      >      > Index    split             split
>>>      >      >    I       0 <- block start   0 <- block start (now shorter)
>>>      >      >    I+1     1                  1    each index 0..9 still points
>>>      >      >    I+2     2                  2    back to the block start
>>>      >      >    I+3     3                  3
>>>      >      >    I+4     4                  4
>>>      >      >    I+5     5                  5
>>>      >      >    I+6     6                  6
>>>      >      >    I+7     7                  7
>>>      >      >    I+8     8                  8
>>>      >      >    I+9     9                  9
>>>      >      >    I+10    10                 0 <- new block start
>>>      >      >    I+11    11                 1
>>>      >      >    I+12    12                 2
>>>      >      >    I+13    13                 3
>>>      >      >    I+14    14                 4
>>>      >      >    I+15    0 <- block start   0 <- block start
>>>      >      >    I+16    1                  1
>>>      >      >    I+17    2                  2
>>>      >      >    I+18    3                  3
>>>      >      >    I+19    4                  4
>>>      >      >
>>>      >      > There is a (very short) description about what's happening at the very end of
>>> search_freelist(). split_block() is called there as well. Would you like to see a similar comment
>>> in deallocate_tail()?
>>>      >
>>>      >      Thank you, I forgot about that first block mapping is still valid.
>>>      >
>>>      >      What about storing bad value (in debug mode) only in second part and not both parts?
>>>      >
>>>      >      >
>>>      >      > Once I have your response, I will create a new webrev reflecting your input. I need
>>> to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The
>>> checks can't be done at that place. In addition, I will add one line of comment and rename a
>>> local variable. That's it.
>>>      >
>>>      >      Okay.
>>>      >
>>>      >      Thanks,
>>>      >      Vladimir
>>>      >
>>>      >      >
>>>      >      > Thanks,
>>>      >      > Lutz
>>>      >      >
>>>      >      >
>>>      >      > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov"
>>> <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:
>>>      >      >
>>>      >      >      Good.
>>>      >      >
>>>      >      >      Do we need to be concern about atomicity of marking? We know that memset() is
>>> not atomic (may be I am wrong here).
>>>      >      >      An other thing is I did not get logic in deallocate_tail(). split_block()
>>> marks only second half of split segments as
>>>      >      >      used and (after call) store bad values in it. What about first part? May be
>>> add comment.
>>>      >      >
>>>      >      >      Thanks,
>>>      >      >      Vladimir
>>>      >      >
>>>      >      >      On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
>>>      >      >      > Dear all,
>>>      >      >      >
>>>      >      >      > May I please request reviews for my change?
>>>      >      >      > Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
>>>      >      >      > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
>>>      >      >      >
>>>      >      >      > What this change is all about:
>>>      >      >      > ------------------------------
>>>      >      >      > While working on another topic, I came across the code in
>>> share/memory/heap.cpp. I applied some small changes which I would call improvements.
>>>      >      >      >
>>>      >      >      > Furthermore, and in particular with these changes, the platform-specific
>>> parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small
>>> free blocks. Heap block allocation does not create free blocks smaller than
>>> CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size.
>>> If it is too small, such free blocks will never be re-allocated. The only chance for them to
>>> vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the
>>> beginning of) the free list, slowing down the free block search.
>>>      >      >      >
>>>      >      >      > The following free block counts have been found after running JVM98 with
>>> different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the
>>> CodeHeap state at VM shutdown.
>>>      >      >      >
>>>      >      >      > JDK-8223444 not applied
>>>      >      >      > =======================
>>>      >      >      >
>>>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=
>>>      >      >      >            Size   |       1      2      3      4      6      8
>>>      >      >      > -----------------+-------------------------------------------
>>>      >      >      > aarch      128   |       0    153     75     30     38      2
>>>      >      >      > ppc        128   |       0    149     98     59     14      2
>>>      >      >      > ppcle      128   |       0    219    161    110     69     34
>>>      >      >      > s390       256   |       0    142     93     59     30     10
>>>      >      >      > x86        128   |       0    215    157    118     42     11
>>>      >      >      >
>>>      >      >      >
>>>      >      >      > JDK-8223444 applied
>>>      >      >      > ===================
>>>      >      >      >
>>>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>>>      >      >      >            Size   |       1      2      3      4      6      8  |   setting
>>>      >      >      > -----------------+---------------------------------------------+------------
>>>      >      >      > aarch      128   |     221    115     80     36      7      1  |     6
>>>      >      >      > ppc        128   |     245    152    101     54     14      4  |     6
>>>      >      >      > ppcle      128   |     243    144     89     72     20      5  |     6
>>>      >      >      > s390       256   |     168     60     67      8      6      2  |     4
>>>      >      >      > x86        128   |     223    139     83     50     11      2  |     6
>>>      >      >      >
>>>      >      >      > Thank you for your time and opinion!
>>>      >      >      > Lutz
>>>      >      >      >
>>>      >      >      >
>>>      >      >      >
>>>      >      >      >
>>>      >      >
>>>      >      >
>>>      >
>>>      >
>>>


More information about the hotspot-compiler-dev mailing list