RFR(S): 8223444: Improve CodeHeap Free Space Management

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri May 17 19:27:03 UTC 2019


Hi Lutz,

Testing passed.
I think you need second review because changes are not trivial.

Thanks,
Vladimir

On 5/16/19 4:17 PM, Vladimir Kozlov wrote:
> On 5/16/19 3:16 PM, Schmidt, Lutz wrote:
>> Hi Vladimir,
>>
>> I have implemented that "added safety feature". What it does:
>>   - During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal.
>>   - Whenever a FreeBlock is added to the free block list, it's memory (except for the header part) is initialized with 
>> badCodeHeapNewVal.
>>   - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
>>
>> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current 
>> jdk/jdk, is here:
>> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/
> 
> Good. I will run testing on it.
> 
>>
>> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a new HeapBlock to be 
>> initialized with a bit pattern forming a negative value when accessed as int. It cost me quite some time to find that 
>> out. And it's bad code, in my opinion. I would suggest to just delete that assert.
> 
> Yes, I agree. Such implicit dependence is bad and useless.
> We use reset_to() in AOT where scopes should already have valid offsets:
> http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152
> I am not sure how we pass this assert in AOT case.
> 
>>
>> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some 
>> testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually 
>> vanishes.
>> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no longer need to be 
>> the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in 
>> JDK-8223770. We could just allocate a generously large block for stubs and at the end deallocate_tail them.
> 
> Okay, it sounds good.
> 
> Thanks,
> Vladimir
> 
>>
>> Sorry for the long text.
>> Lutz
>>
>> On 15.05.19, 19:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>>
>>      On 5/14/19 10:53 PM, Schmidt, Lutz wrote:
>>      > Hi Vladimir,
>>      >
>>      > thank you for your comments. About filling CodeHeap with bad values after split_block:
>>      >   - in deallocate_tail, the leading part must remain intact. It contains valid code.
>>      >   - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts
>>      Thank you for explaining.
>>      >   - If you want added safety, wouldn't it then be better to invalidate the block contents during 
>> add_to_freelist()? You could then be sure there is no executable code in a free block.
>>      Yes, it is preferable.
>>      An other note (after looking more on changes). You changed where freed tail goes. Originally it was added to next 
>> block
>>      _next_segment (make it larger) and you created separate small block. Is not it create more fragmentation?
>>      Thanks,
>>      Vladimir
>>      >
>>      > Regards,
>>      > Lutz
>>      >
>>      > On 14.05.19, 23:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>>      >
>>      >      On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
>>      >      > Hi Vladimir,
>>      >      >
>>      >      > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's 
>> not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, 
>> even with memset().
>>      >      >
>>      >      > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized 
>> before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its 
>> weak memory ordering.
>>      >
>>      >      Okay, if it is byte copy I am fine with it.
>>      >
>>      >      >
>>      >      > About segment map marking:
>>      >      > There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start().
>>      >      > In short: each segment map element contains an (unsigned) index which, when subtracted from that element 
>> index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a 
>> heap block range to describe a newly formed heap block, the leading part remains valid.
>>      >      >
>>      >      > Segmap  before             after
>>      >      > Index    split             split
>>      >      >    I       0 <- block start   0 <- block start (now shorter)
>>      >      >    I+1     1                  1    each index 0..9 still points
>>      >      >    I+2     2                  2    back to the block start
>>      >      >    I+3     3                  3
>>      >      >    I+4     4                  4
>>      >      >    I+5     5                  5
>>      >      >    I+6     6                  6
>>      >      >    I+7     7                  7
>>      >      >    I+8     8                  8
>>      >      >    I+9     9                  9
>>      >      >    I+10    10                 0 <- new block start
>>      >      >    I+11    11                 1
>>      >      >    I+12    12                 2
>>      >      >    I+13    13                 3
>>      >      >    I+14    14                 4
>>      >      >    I+15    0 <- block start   0 <- block start
>>      >      >    I+16    1                  1
>>      >      >    I+17    2                  2
>>      >      >    I+18    3                  3
>>      >      >    I+19    4                  4
>>      >      >
>>      >      > There is a (very short) description about what's happening at the very end of search_freelist(). 
>> split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
>>      >
>>      >      Thank you, I forgot about that first block mapping is still valid.
>>      >
>>      >      What about storing bad value (in debug mode) only in second part and not both parts?
>>      >
>>      >      >
>>      >      > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway 
>> because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In 
>> addition, I will add one line of comment and rename a local variable. That's it.
>>      >
>>      >      Okay.
>>      >
>>      >      Thanks,
>>      >      Vladimir
>>      >
>>      >      >
>>      >      > Thanks,
>>      >      > Lutz
>>      >      >
>>      >      >
>>      >      > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" 
>> <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:
>>      >      >
>>      >      >      Good.
>>      >      >
>>      >      >      Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I 
>> am wrong here).
>>      >      >      An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of 
>> split segments as
>>      >      >      used and (after call) store bad values in it. What about first part? May be add comment.
>>      >      >
>>      >      >      Thanks,
>>      >      >      Vladimir
>>      >      >
>>      >      >      On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
>>      >      >      > Dear all,
>>      >      >      >
>>      >      >      > May I please request reviews for my change?
>>      >      >      > Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
>>      >      >      > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
>>      >      >      >
>>      >      >      > What this change is all about:
>>      >      >      > ------------------------------
>>      >      >      > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some 
>> small changes which I would call improvements.
>>      >      >      >
>>      >      >      > Furthermore, and in particular with these changes, the platform-specific parameter 
>> CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block 
>> allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the 
>> minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance 
>> for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) 
>> the free list, slowing down the free block search.
>>      >      >      >
>>      >      >      > The following free block counts have been found after running JVM98 with different 
>> CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
>>      >      >      >
>>      >      >      > JDK-8223444 not applied
>>      >      >      > =======================
>>      >      >      >
>>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=
>>      >      >      >            Size   |       1      2      3      4      6      8
>>      >      >      > -----------------+-------------------------------------------
>>      >      >      > aarch      128   |       0    153     75     30     38      2
>>      >      >      > ppc        128   |       0    149     98     59     14      2
>>      >      >      > ppcle      128   |       0    219    161    110     69     34
>>      >      >      > s390       256   |       0    142     93     59     30     10
>>      >      >      > x86        128   |       0    215    157    118     42     11
>>      >      >      >
>>      >      >      >
>>      >      >      > JDK-8223444 applied
>>      >      >      > ===================
>>      >      >      >
>>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>>      >      >      >            Size   |       1      2      3      4      6      8  |   setting
>>      >      >      > -----------------+---------------------------------------------+------------
>>      >      >      > aarch      128   |     221    115     80     36      7      1  |     6
>>      >      >      > ppc        128   |     245    152    101     54     14      4  |     6
>>      >      >      > ppcle      128   |     243    144     89     72     20      5  |     6
>>      >      >      > s390       256   |     168     60     67      8      6      2  |     4
>>      >      >      > x86        128   |     223    139     83     50     11      2  |     6
>>      >      >      >
>>      >      >      > Thank you for your time and opinion!
>>      >      >      > Lutz
>>      >      >      >
>>      >      >      >
>>      >      >      >
>>      >      >      >
>>      >      >
>>      >      >
>>      >
>>      >
>>


More information about the hotspot-compiler-dev mailing list