RFC: JEP JDK-8221828: New Invoke Bindings

erik.osterlund at oracle.com
Fri Oct 11 11:50:15 UTC 2019


Hi Vitaly,

On 10/11/19 1:24 PM, Vitaly Davidovich wrote:
> Hi Erik,
>
> This sounds like a great idea! You touch on this in the JEP, but 
> avoiding global safepoints for ICStub patching would be great.
> http://openjdk.5641.n7.nabble.com/InlineCacheBuffer-GuaranteedSafepointInterval-td229138.html 
> is a few years old but I think largely still true today.

Glad you brought this up. Yes. In fact, this is how this whole thing 
started for me. When I implemented concurrent class unloading for ZGC 
back in JDK 12, I found it quite annoying that our concurrent inline 
cache cleaning may trigger safepoints. I measured that after cleaning 
~140 IC stubs (due to a rather pessimistic buffer sizing), we would run 
out of buffer space and safepoint. Since I work in the ZGC project, I 
get irrationally upset when I see things being done in safepoints that 
don't need to be, and am willing to walk very far to get rid of them. So I 
started looking into whether these IC stubs could be freed concurrently 
instead. I implemented a pipelined three-phase global handshaking 
scheme for safely reclaiming IC stubs and CompiledICHolders concurrently 
in the service thread, on architectures that support instruction cache 
coherency, which remedied the safepointing problem on x86_64. But not 
all architectures have instruction cache coherency, so I was annoyed by 
the limited portability of my solution, by having to maintain multiple 
different life cycles for IC stubs, and by making inline caches even more 
complicated than they already are. That's when I finally had enough 
of inline cache problems, shelved that idea, and decided to instead get 
rid of inline caches, as they no longer seem to solve a current problem, 
yet cause headaches on a daily basis.
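
To make the pipeline shape a bit more concrete, here is a rough 
sketch of the idea. All of the names (ICStubReclaimer, 
global_handshake_all_threads, etc.) are made up for illustration; 
this is not the actual HotSpot code:

    #include <vector>

    struct ICStub { /* an inline cache transition stub */ };

    // Stand-in for a global handshake: in the VM this runs a small
    // operation in every JavaThread without a global safepoint.
    static void global_handshake_all_threads() { /* illustrative no-op */ }

    class ICStubReclaimer {
      std::vector<ICStub*> _unlinked;   // unlinked from inline caches
      std::vector<ICStub*> _in_flight;  // survived one handshake already
    public:
      // Called by concurrent IC cleaning when a stub is detached.
      void unlink(ICStub* stub) { _unlinked.push_back(stub); }

      // Driven periodically from the service thread; each call advances
      // every queued stub one phase of the pipeline.
      void tick() {
        // Stubs two handshakes past unlinking: no thread can still be
        // executing in them or hold a pointer into them; free them.
        for (ICStub* stub : _in_flight) delete stub;
        _in_flight.clear();

        // Stubs one handshake past unlinking: no thread can load a
        // stale stub pointer anymore, but one may still be executing
        // inside a stub, so hold them for one more handshake.
        _in_flight.swap(_unlinked);

        global_handshake_all_threads();
      }
    };

The point is just that one global handshake per tick is enough to 
advance every queued stub one phase, so no safepoint is needed.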

> How confident are you that hardware’s branch target buffer (BTB) will 
> neutralize the loss of direct jumps? In large applications, with lots 
> of code and deep call graphs, I’d be concerned that BTB is exhausted 
> due to sheer number of entries needed.

I have run a bunch of workloads (some very code cache heavy), even 
without inlining to stress the dispatching mechanism more than is 
reasonable, without observing any noticeable differences between my new 
mechanism and the old one. So I am fairly optimistic at this point. 
What I have been fighting more with is start-up performance. I generate 
special unique numbers for each Method*, which stresses startup 
surprisingly much. But I am happy with where that is right now, after 
throwing a bunch of tricks at that code. Having said that, I am still 
in a prototyping phase, have more evaluation to do, and will of course 
have a close look at how that pans out.
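
To illustrate what I mean by unique numbers per Method* and 
table-driven dispatch, here is a rough sketch. The names and the 
table layout are invented for illustration and are not the actual 
design:

    #include <atomic>
    #include <cstdint>

    struct Method { /* stand-in for HotSpot's Method */ };

    // Every Method* gets a unique selector when it is created; doing
    // this for every loaded method is what puts pressure on startup.
    static std::atomic<uint32_t> _next_selector{0};

    static uint32_t assign_selector(Method*) {
      return _next_selector.fetch_add(1, std::memory_order_relaxed);
    }

    typedef void (*entry_t)();

    struct DispatchTable {
      entry_t* _entries;  // indexed by selector
    };

    // The call site no longer patches itself (no inline cache); it
    // indexes the receiver's table with the method's selector and
    // makes a plain indirect call.
    static void call_virtual(DispatchTable* receiver_table,
                             uint32_t selector) {
      receiver_table->_entries[selector]();  // indirect, no patching
    }

The indirect call replaces the patched direct call of an inline 
cache, which is what I mean by leaving the speculation about the 
call target to the hardware's branch predictor instead of software.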

> Of course this is just speculation.

...I see what you did there.

/Erik

>
> Thanks
>
> On Fri, Oct 11, 2019 at 5:03 AM <erik.osterlund at oracle.com> wrote:
>
>     Hi,
>
>     I prepared a new JEP [1], about rewriting the method invocation
>     mechanisms in HotSpot. In particular, the aim is to remove inline
>     caches, in an effort to make our code more robust and maintainable
>     (remove 3 nmethod states, ~10000 LOC concurrent code patching stuff,
>     make redefinition and unloading much more straight forward).
>     Instead, it
>     is to be replaced with a table driven approach, leaving speculation
>     machinery to the hardware instead of software. More details are in
>     the
>     JEP description.
>
>     Feedback is more than welcome.
>
>     Thanks,
>     /Erik
>
>     [1] https://bugs.openjdk.java.net/browse/JDK-8221828
>
> -- 
> Sent from my phone


