RFR(S): 8223444: Improve CodeHeap Free Space Management
Schmidt, Lutz
lutz.schmidt at sap.com
Wed May 15 12:53:59 UTC 2019
Hi Erik,
had not heard that. Just to make sure I understand your statement correctly: on SPARC, memset does something fancy which causes random byte values to become visible to concurrent readers. Correct?
As you state, sufficiently capable compilers will recognize explicit initialization loops and convert them to memset calls. Replacing the loop by a memset call therefore does not introduce a new risk. If at all, it changes probabilities.
Furthermore, I do not see concurrent readers. Segment map initialization is only called when the CodeHeap is expanded or when it is cleared in its entirety.
If anybody has I remaining concern, I’m ready to revert from memset to explicit loop.
Thanks for your heads up!
Lutz
From: Erik Osterlund <erik.osterlund at oracle.com>
Date: Wednesday, 15. May 2019 at 09:22
To: Lutz Schmidt <lutz.schmidt at sap.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>, "hotspot-compiler-dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(S): 8223444: Improve CodeHeap Free Space Management
Hi,
Didn’t read the full conversation, only the part where it was claimed we have byte atomicity on all platforms with memset.
That’s not true. On SPARC, the default is to use block initializing stores (BIS), which exposes concurrent readers to out-of-thin-air values that were never in the original memory state or the new memory state.
Moreover, the compiler is allowed to, and will also in practice, transform a loop that performs memset, into a memset call as long as the memory being memset is non-volatile.
These problems were observed in card tables and led to the introduction of
memset_with_concurrent_readers() to force the memset not to use BIS.
Just a FYI.
/Erik
On 15 May 2019, at 07:53, Schmidt, Lutz <lutz.schmidt at sap.com<mailto:lutz.schmidt at sap.com>> wrote:
Hi Vladimir,
thank you for your comments. About filling CodeHeap with bad values after split_block:
- in deallocate_tail, the leading part must remain intact. It contains valid code.
- in search_freelist, one free block is split into two. There I could invalidate the contents of both parts.
- If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block.
Regards,
Lutz
On 14.05.19, 23:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>> wrote:
On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
Hi Vladimir,
I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset().
It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering.
Okay, if it is byte copy I am fine with it.
About segment map marking:
There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start().
In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.
Segmap before after
Index split split
I 0 <- block start 0 <- block start (now shorter)
I+1 1 1 each index 0..9 still points
I+2 2 2 back to the block start
I+3 3 3
I+4 4 4
I+5 5 5
I+6 6 6
I+7 7 7
I+8 8 8
I+9 9 9
I+10 10 0 <- new block start
I+11 11 1
I+12 12 2
I+13 13 3
I+14 14 4
I+15 0 <- block start 0 <- block start
I+16 1 1
I+17 2 2
I+18 3 3
I+19 4 4
There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
Thank you, I forgot about that first block mapping is still valid.
What about storing bad value (in debug mode) only in second part and not both parts?
Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.
Okay.
Thanks,
Vladimir
Thanks,
Lutz
On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-bounces at openjdk.java.net<mailto:hotspot-compiler-dev-bounces at openjdk.java.net> on behalf of vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>> wrote:
Good.
Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here).
An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as
used and (after call) store bad values in it. What about first part? May be add comment.
Thanks,
Vladimir
On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
Dear all,
May I please request reviews for my change?
Bug: https://bugs.openjdk.java.net/browse/JDK-8223444
Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
What this change is all about:
------------------------------
While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
JDK-8223444 not applied
=======================
Segment | free blocks with CodeCacheMinBlockLength=
Size | 1 2 3 4 6 8
-----------------+-------------------------------------------
aarch 128 | 0 153 75 30 38 2
ppc 128 | 0 149 98 59 14 2
ppcle 128 | 0 219 161 110 69 34
s390 256 | 0 142 93 59 30 10
x86 128 | 0 215 157 118 42 11
JDK-8223444 applied
===================
Segment | free blocks with CodeCacheMinBlockLength= | suggested
Size | 1 2 3 4 6 8 | setting
-----------------+---------------------------------------------+------------
aarch 128 | 221 115 80 36 7 1 | 6
ppc 128 | 245 152 101 54 14 4 | 6
ppcle 128 | 243 144 89 72 20 5 | 6
s390 256 | 168 60 67 8 6 2 | 4
x86 128 | 223 139 83 50 11 2 | 6
Thank you for your time and opinion!
Lutz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20190515/f69ff81f/attachment-0001.html>
More information about the hotspot-compiler-dev
mailing list