RFR(XL): 8203469: Faster safepoints

Fri Feb 8 22:52:10 UTC 2019

I am doing the same - so not done reviewing yet ...

Thought I would send a few minor comments - this is on the v06_02 (in case you are looking
at line numbers) - because I would very much like this to be the way going forward if it passes
reviews.

1. safepoint.cpp
I thought with the JFR change you could remove the try_stable_load_state with InactiveSafepointCounter? When would you be in this code and not with a valid safepoint_count?

2. _waiting_to_block and _current_jni_active_count are no longer volatile - assumed only read/modified by VMThread - could you add assertions in the accessors that this is the VMThread?

3. safepoint.cpp line 413: "Keep event from now." -> what does this comment mean?
   line 815: "safepoint callback" -> can you update the comment since we removed the "callback" mechanism - I do realize that we still have the block_if_requested.

4. Do you have stress tests with high thread counts? Does the updated mechanism for _thread_in_vm work as well there? They all complete and no latency issues?

5. Do you test with ThreadLocalHandshakes off and on?

Future Cleanup Wishlist:
1. TBIVM:
   - clean up code that is trying to do an explicit safepoint_poll,
   at that point the TBIVM could also skip the safepoint check in the constructor ?
2. when do we check safepoint_safe_with and native is not walkable (or !has_last_Java_frame ?)
   why is this not an assertion? When do we transition to native without make_walkable?
3. ThreadInVMfromUnknown - what if not in _thread_in_native? Then we don't
transition to _thread_in_vm and we are not in the state we think we are - when
does this happen and what state are we left in?

> On Feb 8, 2019, at 5:37 PM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
> 
> Robbin,
> 
> Because this is a completely different way of solving this problem, I don't
> think I can review this incrementally. That means another crawl through
> review and might even mean another round of whiteboard diagrams...
> 
> A proper review will obviously take me longer than I planned, but I wanted
> you know that I'm starting to look at it from the beginning... :-)
> 
> Dan
> 
> 
> On 2/7/19 11:05 AM, Robbin Ehn wrote:
>> But there is a 'better' way.
>> Before I added the more graceful "_vm_wait->wait();" semaphore in the while
>> (_waiting_to_block > 0) { loop, it was a just a busy spin using the same
>> back-off as the rolling forward loop. It turns out that we mostly never spin
>> here at all, when all java threads are stop the callbacks is often already done.
>> So the addition of the semaphore have no impact on our benchmarks and is mostly
>> unused. This is because most threads are in java which we need to spin-wait
>> since they can elide into native without doing a callback. My proposed re-base
>> removes the the callbacks completely and let the vm thread do all thread
>> accounting. All that the stopping threads needs to do is write state and
>> safepoint id, everything else is handle by the vm thread. We trade 2 atomics +
>> a local store per thread against doing 2 stores per thread by the vm thread.
>> This makes it possible for a thread in vm to transition into blocked WITHOUT
>> safepoint poll. Just set thread_blocked and promise to do safepoint poll when
>> leaving that state.
>> 
>> v06_2
>> Full:
>> http://cr.openjdk.java.net/~rehn/8203469/v06_2/full/
>> Inc against v05:
>> http://cr.openjdk.java.net/~rehn/8203469/v06_2/inc/
>> Inc against v06_1:
>> http://cr.openjdk.java.net/~rehn/8203469/v06_2/rebase_inc/
>> 
>> v06_2 simplifies and removes ~200 LOC with same performance.
>> If there is a case with a thread in vm taking long time, it will already
>> screw-up latency and thus should be fixed regardless of v06_1 vs v06_2. So I
>> see no reason why we should not push v06_2.
>> 
>> Passes stress test and t1-5.
>> 
>> Thanks, Robbin
>