RFR(S): 8223444: Improve CodeHeap Free Space Management

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu May 16 23:17:38 UTC 2019


On 5/16/19 3:16 PM, Schmidt, Lutz wrote:
> Hi Vladimir,
> 
> I have implemented that "added safety feature". What it does:
>   - During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal.
>   - Whenever a FreeBlock is added to the free block list, it's memory (except for the header part) is initialized with badCodeHeapNewVal.
>   - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
> 
> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current jdk/jdk, is here:
> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/

Good. I will run testing on it.

> 
> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a new HeapBlock to be initialized with a bit pattern forming a negative value when accessed as int. It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest to just delete that assert.

Yes, I agree. Such implicit dependence is bad and useless.
We use reset_to() in AOT where scopes should already have valid offsets:
http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152
I am not sure how we pass this assert in AOT case.

> 
> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually vanishes.
> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no longer need to be the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in JDK-8223770. We could just allocate a generously large block for stubs and at the end deallocate_tail them.

Okay, it sounds good.

Thanks,
Vladimir

> 
> Sorry for the long text.
> Lutz
> 
> On 15.05.19, 19:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> 
>      On 5/14/19 10:53 PM, Schmidt, Lutz wrote:
>      > Hi Vladimir,
>      >
>      > thank you for your comments. About filling CodeHeap with bad values after split_block:
>      >   - in deallocate_tail, the leading part must remain intact. It contains valid code.
>      >   - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts
>      
>      Thank you for explaining.
>      
>      >   - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block.
>      
>      Yes, it is preferable.
>      
>      An other note (after looking more on changes). You changed where freed tail goes. Originally it was added to next block
>      _next_segment (make it larger) and you created separate small block. Is not it create more fragmentation?
>      
>      Thanks,
>      Vladimir
>      
>      >
>      > Regards,
>      > Lutz
>      >
>      > On 14.05.19, 23:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >
>      >      On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
>      >      > Hi Vladimir,
>      >      >
>      >      > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset().
>      >      >
>      >      > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering.
>      >
>      >      Okay, if it is byte copy I am fine with it.
>      >
>      >      >
>      >      > About segment map marking:
>      >      > There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start().
>      >      > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.
>      >      >
>      >      > Segmap  before             after
>      >      > Index    split             split
>      >      >    I       0 <- block start   0 <- block start (now shorter)
>      >      >    I+1     1                  1    each index 0..9 still points
>      >      >    I+2     2                  2    back to the block start
>      >      >    I+3     3                  3
>      >      >    I+4     4                  4
>      >      >    I+5     5                  5
>      >      >    I+6     6                  6
>      >      >    I+7     7                  7
>      >      >    I+8     8                  8
>      >      >    I+9     9                  9
>      >      >    I+10    10                 0 <- new block start
>      >      >    I+11    11                 1
>      >      >    I+12    12                 2
>      >      >    I+13    13                 3
>      >      >    I+14    14                 4
>      >      >    I+15    0 <- block start   0 <- block start
>      >      >    I+16    1                  1
>      >      >    I+17    2                  2
>      >      >    I+18    3                  3
>      >      >    I+19    4                  4
>      >      >
>      >      > There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
>      >
>      >      Thank you, I forgot about that first block mapping is still valid.
>      >
>      >      What about storing bad value (in debug mode) only in second part and not both parts?
>      >
>      >      >
>      >      > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.
>      >
>      >      Okay.
>      >
>      >      Thanks,
>      >      Vladimir
>      >
>      >      >
>      >      > Thanks,
>      >      > Lutz
>      >      >
>      >      >
>      >      > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:
>      >      >
>      >      >      Good.
>      >      >
>      >      >      Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here).
>      >      >      An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as
>      >      >      used and (after call) store bad values in it. What about first part? May be add comment.
>      >      >
>      >      >      Thanks,
>      >      >      Vladimir
>      >      >
>      >      >      On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
>      >      >      > Dear all,
>      >      >      >
>      >      >      > May I please request reviews for my change?
>      >      >      > Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
>      >      >      > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
>      >      >      >
>      >      >      > What this change is all about:
>      >      >      > ------------------------------
>      >      >      > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
>      >      >      >
>      >      >      > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
>      >      >      >
>      >      >      > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
>      >      >      >
>      >      >      > JDK-8223444 not applied
>      >      >      > =======================
>      >      >      >
>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=
>      >      >      >            Size   |       1      2      3      4      6      8
>      >      >      > -----------------+-------------------------------------------
>      >      >      > aarch      128   |       0    153     75     30     38      2
>      >      >      > ppc        128   |       0    149     98     59     14      2
>      >      >      > ppcle      128   |       0    219    161    110     69     34
>      >      >      > s390       256   |       0    142     93     59     30     10
>      >      >      > x86        128   |       0    215    157    118     42     11
>      >      >      >
>      >      >      >
>      >      >      > JDK-8223444 applied
>      >      >      > ===================
>      >      >      >
>      >      >      >          Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>      >      >      >            Size   |       1      2      3      4      6      8  |   setting
>      >      >      > -----------------+---------------------------------------------+------------
>      >      >      > aarch      128   |     221    115     80     36      7      1  |     6
>      >      >      > ppc        128   |     245    152    101     54     14      4  |     6
>      >      >      > ppcle      128   |     243    144     89     72     20      5  |     6
>      >      >      > s390       256   |     168     60     67      8      6      2  |     4
>      >      >      > x86        128   |     223    139     83     50     11      2  |     6
>      >      >      >
>      >      >      > Thank you for your time and opinion!
>      >      >      > Lutz
>      >      >      >
>      >      >      >
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >
>      >
>      
> 


More information about the hotspot-compiler-dev mailing list