Hooking up the array mismatch stub as an intrinsic in the template interpreter

Fri Apr 15 15:00:35 UTC 2016

> On 15 Apr 2016, at 16:01, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:
> 
> 
> 
> On 4/15/16 9:07 AM, Paul Sandoz wrote:
>>> On 15 Apr 2016, at 14:12, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:
>>> 
>>> 
>>> I don't know why we'd add even more assembly code to the interpreter. Why doesn't the JIT optimize this function instead? Does adding a stub in the interpreter prevent the JIT from inlining this function, since it's not invocation counted?
>>> 
>> I have updated the webrev with C1 support [1] and determined, by eyeballing the generated code, that the stub call gets inlined for C1 and C2 and appears unaffected by the wiring up of that same stub in the template interpreter.
>> 
>> A stub was added and wired up to C2 with the intention to wire that up to C1, and possibly to the interpreter. One reason for the latter was the performance results presented in the last email (potentially ~200x over the current approach, and ~35x improvement over the original Java code). Does that matter? Would you be concerned about that?
> 
> What workload is this running?
> 

byte[] array comparison of 1024 bytes. It was just a quick smoke test that the intrinsic was working as expected.
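
For illustration only, a smoke test of this kind could look roughly like the sketch below. The class and method names are hypothetical and not taken from the webrev; the scalar loop stands in for the Java-level mismatch logic that the stub would replace, and the check simply confirms it agrees with Arrays.equals on 1024-byte arrays.

```java
import java.util.Arrays;

// Hypothetical sketch: names are illustrative, not from the webrev.
class MismatchSmokeTest {

    // Scalar reference: index of the first differing byte, the shorter
    // length if one array is a proper prefix of the other, or -1 if the
    // arrays have identical contents.
    static int mismatch(byte[] a, byte[] b) {
        int length = Math.min(a.length, b.length);
        for (int i = 0; i < length; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : length;
    }

    // Equality expressed in terms of mismatch.
    static boolean equalsViaMismatch(byte[] a, byte[] b) {
        return mismatch(a, b) < 0;
    }

    public static void main(String[] args) {
        byte[] a = new byte[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = (byte) i;
        }
        byte[] b = a.clone();

        // Identical contents: both approaches must report equality.
        if (equalsViaMismatch(a, b) != Arrays.equals(a, b)) {
            throw new AssertionError("disagreement on equal arrays");
        }

        // Flip the last byte: both approaches must report inequality.
        b[b.length - 1] ^= 1;
        if (equalsViaMismatch(a, b) != Arrays.equals(a, b)) {
            throw new AssertionError("disagreement on unequal arrays");
        }
        if (mismatch(a, b) != b.length - 1) {
            throw new AssertionError("wrong mismatch index");
        }
    }
}
```

Running something like this with -Xint and again with the default tiered compilation would exercise the interpreter and the compiled paths respectively.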

> What results do you get with ref workload?

See the results whose names start with “base_”.

For more details about the benchmarks with some existing analysis see here:

  http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021225.html

>> Array equality is quite a fundamental operation so I was concerned about such a regression in the interpreter.
> 
> The interpreter is mostly run during startup time so we'd like to see some workload results with perhaps the startup_3 benchmarks set.
> 

Ok.

> Again, we are trying not to have special-case assembly code in the interpreter. Adding these sorts of special optimizations to the compilers makes a lot more sense.
> 

Yes.

I talked a bit off-line with Aleksey. I was off base about the compact string work: adding the intrinsic to the interpreter will not help here, because the dependency issue arises at VM genesis time. This can be worked around by aliasing intrinsics to avoid the class dependency.

So, I will withdraw the additions to the template interpreter and focus the webrev on C1.

Thanks,
Paul.