RFR: 8320649: C2: Optimize scoped values [v7]

Emanuel Peter epeter at openjdk.org
Sat Feb 10 21:23:07 UTC 2024


On Thu, 8 Feb 2024 15:41:00 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> Ok, this is all for today. I only looked at the first half. Maybe you can react to my comments this far, and then I can continue?
>> 
>> Maybe a more general question:
>> How bad would it be if we simply did not inline the "get" call, but directly transformed the graph? Is that feasible? Or would it be a performance hit because we could not inspect the profiling inside "get"? It is just a lot of code here, with lots of potential for bugs.
>> 
>> I also wonder about test coverage. How many tests do we have?
>> Fuzzer tests currently do not use ScopedValues. So we would never get any really strange patterns.
>
>> Ok, this is all for today. I only looked at the first half. Maybe you can react to my comments this far, and then I can continue?
> 
> Yes, sounds good. Thanks for the new review.
> 
>> Maybe a more general question: How bad would it be if we simply did not inline the "get" call, but directly transformed the graph? Is that feasible? Or would it be a performance hit because we could not inspect the profiling inside "get"? It is just a lot of code here, with lots of potential for bugs.
> 
> It's probably feasible. I think we want to avoid the slow path (when there's a missed in the cache) as much as possible because having a call or some complicated logic to handle the cache miss will affect all sort of optimizations. So we would have to speculate and then take a trap if that fails. Given we have profile data in the get(), it seems unfortunate to not use it and have it guide the decisions. The pattern matching is complicated for sure but it's also fairly mechanical. The java code being matches is fairly simple. What can affect pattern matching is whether one of the 3 branches can be never taken. There doesn't seem to be that many ways it could break.
> 
> The most fragile part I would say is that the ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode pair have to stay together. If some transformation changes the IR so we can't go from one to the other, then expanding the ScopedValue.get() fails.
> 
>> I also wonder about test coverage. How many tests do we have? Fuzzer tests currently do not use ScopedValues. So we would never get any really strange patterns.
> 
> That's also a weakness for this. Obviously, it's a new feature so there's not that much code using it. There are some functional tests and benchmarks.

@rwestrel I'm getting some test timeouts with `-server -Xcomp` for `compiler/c2/irTests/TestScopedValue.java`, on our testing infrastructure. It ran for `12min` Do you get the same?

**TEST FAILURE (Commit 9)**
`java/lang/ScopedValue/StressStackOverflow.java#no-TieredCompilation` fails on `linux-aarch64` (so far no other platform):


# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffff9332a57c, pid=1260224, tid=1260240
#
# JRE version: Java(TM) SE Runtime Environment (23.0) (build 23-internal-2024-02-08-0653493.emanuel.peter.jdk-review)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23-internal-2024-02-08-0653493.emanuel.peter.jdk-review, mixed mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# V  [libjvm.so+0xa8657c]  PhaseIdealLoop::get_early_ctrl(Node*)+0x2a8

...

Current CompileTask:
C2:196   61   !         StressStackOverflow$1::run (108 bytes)

Stack: [0x0000ffff71606000,0x0000ffff71804000],  sp=0x0000ffff717feba0,  free space=2018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa8657c]  PhaseIdealLoop::get_early_ctrl(Node*)+0x2a8  (loopnode.hpp:1139)
V  [libjvm.so+0xa86884]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0x84  (loopnode.cpp:251)
V  [libjvm.so+0xa868f4]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xf4  (loopnode.cpp:277)
V  [libjvm.so+0xa868f4]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xf4  (loopnode.cpp:277)
V  [libjvm.so+0xa868f4]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xf4  (loopnode.cpp:277)
V  [libjvm.so+0xa8e89c]  PhaseIdealLoop::test_and_load_from_cache(Node*, Node*, Node*, Node*, float, float, Node*, Node*&, Node*&, Node*&)+0x2fc  (loopnode.cpp:5158)
V  [libjvm.so+0xa8f168]  PhaseIdealLoop::expand_sv_get_hits_in_cache_and_load_from_cache(ScopedValueGetHitsInCacheNode*)+0x558  (loopnode.cpp:5068)
V  [libjvm.so+0xa8f528]  PhaseIdealLoop::expand_scoped_value_get_nodes()+0x48  (loopnode.cpp:4934)
V  [libjvm.so+0xa9ae30]  PhaseIdealLoop::build_and_optimize()+0xd70  (loopnode.cpp:4875)
V  [libjvm.so+0x59a284]  Compile::optimize_loops(PhaseIterGVN&, LoopOptsMode)+0x284  (loopnode.hpp:1113)
V  [libjvm.so+0x5a44e8]  Compile::Optimize()+0x568  (compile.cpp:2519)
V  [libjvm.so+0x5a5834]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0xbf4  (compile.cpp:864)
V  [libjvm.so+0x4d516c]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x138  (c2compiler.cpp:142)
V  [libjvm.so+0x5ab468]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x998  (compileBroker.cpp:2305)
V  [libjvm.so+0x5adf14]  CompileBroker::compiler_thread_loop()+0x374  (compileBroker.cpp:1964)
V  [libjvm.so+0x821e74]  JavaThread::thread_main_inner() [clone .part.0]+0xa0  (javaThread.cpp:721)
V  [libjvm.so+0xd176a8]  Thread::call_run()+0xa8  (thread.cpp:221)
V  [libjvm.so+0xba2338]  thread_native_entry(Thread*)+0xd8  (os_linux.cpp:817)
C  [libpthread.so.0+0x7928]  start_thread+0x188

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16966#issuecomment-1937170614


More information about the hotspot-compiler-dev mailing list