RFR: 8320649: C2: Optimize scoped values

Tobias Hartmann thartmann at openjdk.org
Wed Dec 6 07:55:32 UTC 2023


On Tue, 5 Dec 2023 08:33:08 GMT, Roland Westrelin <roland at openjdk.org> wrote:

> This change implements C2 optimizations for calls to
> ScopedValue.get(). Indeed, in:
> 
> 
> v1 = scopedValue.get();
> ...
> v2 = scopedValue.get();
> 
> 
> `v2` can be replaced by `v1` and the second call to `get()` can be
> optimized out. That's true whatever is between the two calls unless a
> new mapping for `scopedValue` is created in between (when that happens
> no optimization is performed for the method being compiled). Hoisting
> a `get()` call out of a loop for a loop-invariant `scopedValue` should
> also be legal in most cases.
> 
> `ScopedValue.get()` is implemented in Java code as a two-step process. A
> cache is attached to the current thread object. If the `ScopedValue`
> object is in the cache then the result from `get()` is read from
> there. Otherwise a slow call is performed that also inserts the
> mapping in the cache. The cache itself is lazily allocated. One
> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
> cache probe, both indexes are checked. As a consequence, the process
> of probing the cache is a multi-step process (check if the cache is
> present, check first index, check second index if first index
> failed). If the cache is populated early on, then when the method that
> calls `ScopedValue.get()` is compiled, the profile reports the slow path
> as never taken and only the read from the cache is compiled.
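The probe sequence described above can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions, not the JDK implementation: the class name, the cache size, the two index functions and the slot-selection policy are all hypothetical, and the per-thread cache is stood in for by a `ThreadLocal`.

```java
import java.util.function.Supplier;

/** Simplified model of the lookup described above; not the JDK implementation. */
final class ScopedValueCacheSketch {
    // Hypothetical number of (key, value) pairs in the cache; a power of two.
    private static final int SLOTS = 16;

    // Per-thread cache laid out as [key0, value0, key1, value1, ...].
    // It stays null until the first slow-path call on the thread (lazy allocation).
    private static final ThreadLocal<Object[]> CACHE = new ThreadLocal<>();

    // Number of slow-path executions, kept only for the demonstration (not thread-safe).
    static int slowCalls = 0;

    // Hypothetical index functions: one key maps to two candidate slots.
    private static int primaryIndex(Object key) {
        return System.identityHashCode(key) & (SLOTS - 1);
    }

    private static int secondaryIndex(Object key) {
        return (System.identityHashCode(key) >>> 4) & (SLOTS - 1);
    }

    static Object get(Object key, Supplier<Object> slowPath) {
        Object[] cache = CACHE.get();
        if (cache != null) {                 // is the cache present?
            int i = 2 * primaryIndex(key);
            if (cache[i] == key) {           // check first index
                return cache[i + 1];
            }
            i = 2 * secondaryIndex(key);
            if (cache[i] == key) {           // check second index
                return cache[i + 1];
            }
        }
        return slowGet(key, slowPath);       // miss: slow call, which fills the cache
    }

    private static Object slowGet(Object key, Supplier<Object> slowPath) {
        slowCalls++;
        Object value = slowPath.get();
        Object[] cache = CACHE.get();
        if (cache == null) {                 // allocate the cache on first use
            cache = new Object[2 * SLOTS];
            CACHE.set(cache);
        }
        int i = 2 * primaryIndex(key);
        if (cache[i] != null && cache[i] != key) {
            i = 2 * secondaryIndex(key);     // first slot taken by another key
        }
        cache[i] = key;
        cache[i + 1] = value;
        return value;
    }

    public static void main(String[] args) {
        Object key = new Object();
        Object v1 = get(key, () -> "bound value");  // cache absent: slow path
        Object v2 = get(key, () -> "bound value");  // served from the cache
        System.out.println("same object: " + (v1 == v2));
        System.out.println("slow calls: " + slowCalls);
    }
}
```

In this model the second `get()` for the same key never reaches `slowGet`, which mirrors the situation where the profile reports the slow path as never taken.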
> 
> To perform the optimizations, I added 3 new node types to C2:
> 
> - the pair
>   ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>   the cache probe
>   
> - a CFG node ScopedValueGetResultNode to help locate the result of the
>   `get()` call in the IR graph.
> 
> In pseudocode, once the nodes are inserted, the code of a `get()` is:
> 
> 
> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
> if (hits_in_the_cache) {
>   res = ScopedValueGetLoadFromCache(hits_in_the_cache);
> } else {
>   res = ..; // slow call, possibly inlined. Subgraph can be arbitrarily complex
> }
> res = ScopedValueGetResult(res)
> 
> 
> In the snippet:
> 
> 
> v1 = scopedValue.get();
> ...
> v2 = scopedValue.get();
> 
> 
> Replacing `v2` by `v1` is then done by starting from the
> `ScopedValueGetResult` node for the second `get()` and looking for a
> dominating `ScopedValueGetResult` for the same `ScopedValue`
> object. When one is found, it is used as a replacement. Eliminating
> the second `get()` call is achieved by making
> `ScopedValueGetHitsInCache` always successful if there's a dominating
> `ScopedValueGetResult` and replacing its companion
> `ScopedValueGetLoadFromCache` by the dominating
> `ScopedValueGetResult`.
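The search for a dominating `ScopedValueGetResult` can be pictured with a toy model. The sketch below is not C2 code: `CfgNode`, its fields and `findDominatingResult` are hypothetical names, each node simply stores a link to its immediate dominator, and the check for a new mapping created between the two calls is left out.

```java
/** Toy model of the dominator walk described above; not C2 code. */
final class DominatorSearchSketch {

    /** A control-flow node with a link to its immediate dominator (null for the root). */
    static final class CfgNode {
        final String name;
        final CfgNode idom;
        // Non-null only for nodes that model a ScopedValueGetResult;
        // it identifies the ScopedValue object the get() was called on.
        final Object scopedValue;

        CfgNode(String name, CfgNode idom, Object scopedValue) {
            this.name = name;
            this.idom = idom;
            this.scopedValue = scopedValue;
        }
    }

    /**
     * Walks up the immediate-dominator chain from the given result node and
     * returns the closest dominating result node for the same ScopedValue
     * object, or null if there is none.
     */
    static CfgNode findDominatingResult(CfgNode result) {
        for (CfgNode n = result.idom; n != null; n = n.idom) {
            if (n.scopedValue != null && n.scopedValue == result.scopedValue) {
                return n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Object sv = new Object();
        CfgNode root = new CfgNode("root", null, null);
        CfgNode first = new CfgNode("result of first get()", root, sv);
        CfgNode branch = new CfgNode("if", first, null);
        CfgNode second = new CfgNode("result of second get()", branch, sv);

        CfgNode replacement = findDominatingResult(second);
        System.out.println(replacement == null ? "no replacement" : replacement.name);
    }
}
```

When the walk finds a node, the second `get()` can reuse its value; when it returns null, the cache probe for that call has to stay.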
> 
> Hoisting a `g...

No review yet, I just performed some quick testing.

The optimized build fails:


[2023-12-05T16:26:12,957Z] open/src/hotspot/share/opto/loopnode.cpp:4745: error: undefined reference to 'ScopedValueGetHitsInCacheNode::verify() const'
[2023-12-05T16:26:12,960Z] open/src/hotspot/share/opto/loopnode.cpp:4761: error: undefined reference to 'ScopedValueGetLoadFromCacheNode::verify() const'
[2023-12-05T16:26:12,964Z] open/src/hotspot/share/opto/loopnode.cpp:4908: error: undefined reference to 'ScopedValueGetHitsInCacheNode::verify() const'
[2023-12-05T16:26:12,967Z] open/src/hotspot/share/opto/loopnode.cpp:4911: error: undefined reference to 'ScopedValueGetLoadFromCacheNode::verify() const'
[2023-12-05T16:26:12,976Z] open/src/hotspot/share/opto/loopopts.cpp:3935: error: undefined reference to 'ScopedValueGetHitsInCacheNode::verify() const'
[2023-12-05T16:26:15,455Z] collect2: error: ld returned 1 exit status


`compiler/c2/irTests/TestScopedValue.java` fails with `-Xcomp` on Linux x64:


# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8de687d1b9, pid=270115, tid=270131
#
# JRE version: Java(TM) SE Runtime Environment (22.0) (fastdebug build 22-internal-2023-12-05-1616186.tobias.hartmann.jdk2)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-internal-2023-12-05-1616186.tobias.hartmann.jdk2, compiled mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x12931b9]  PhaseIdealLoop::get_early_ctrl(Node*)+0x4c9

Current CompileTask:
C2:30390 8110    b  4       compiler.c2.irTests.TestScopedValue::testFastPath13 (28 bytes)

Stack: [0x00007f8dc4353000,0x00007f8dc4453000],  sp=0x00007f8dc444d6c0,  free space=1001k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x12931b9]  PhaseIdealLoop::get_early_ctrl(Node*)+0x4c9  (loopnode.hpp:1139)
V  [libjvm.so+0x1293d95]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0x75  (loopnode.cpp:251)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x1293df6]  PhaseIdealLoop::set_subtree_ctrl(Node*, bool) [clone .part.0]+0xd6  (node.hpp:399)
V  [libjvm.so+0x12960e0]  PhaseIdealLoop::test_and_load_from_cache(Node*, Node*, Node*, Node*, float, float, Node*, Node*&, Node*&, Node*&)+0x820  (loopnode.cpp:4900)
V  [libjvm.so+0x1296b6a]  PhaseIdealLoop::expand_get_from_sv_cache(ScopedValueGetHitsInCacheNode*)+0x82a  (loopnode.cpp:4822)
V  [libjvm.so+0x1297473]  PhaseIdealLoop::expand_scoped_value_get_nodes()+0x243  (loopnode.cpp:4737)
V  [libjvm.so+0x12a45ed]  PhaseIdealLoop::build_and_optimize()+0xf0d  (loopnode.cpp:4672)
V  [libjvm.so+0x9f4ea2]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x432  (loopnode.hpp:1113)
V  [libjvm.so+0x9ed945]  Compile::optimize_loops(PhaseIterGVN&, LoopOptsMode)+0x75  (compile.cpp:2248)
V  [libjvm.so+0x9f0253]  Compile::Optimize()+0xfd3  (compile.cpp:2500)
V  [libjvm.so+0x9f37e1]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1c21  (compile.cpp:860)
V  [libjvm.so+0x83eca7]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1e7  (c2compiler.cpp:134)
V  [libjvm.so+0x9ff17c]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x92c  (compileBroker.cpp:2299)
V  [libjvm.so+0x9ffe08]  CompileBroker::compiler_thread_loop()+0x468  (compileBroker.cpp:1958)
V  [libjvm.so+0xeb93bc]  JavaThread::thread_main_inner()+0xcc  (javaThread.cpp:720)
V  [libjvm.so+0x17992c6]  Thread::call_run()+0xb6  (thread.cpp:220)
V  [libjvm.so+0x14a30f7]  thread_native_entry(Thread*)+0x127  (os_linux.cpp:787)


`compiler/c2/irTests/TestScopedValue.java` fails with `-ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation` on Linux x64:


Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method "public static void compiler.c2.irTests.TestScopedValue.testFastPath7()" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={}, counts={}, failOn={"_#C#CALL_OF_METHOD#_", "slowGet"}, applyIfPlatform={}, applyIfPlatformOr={}, applyIfOr={}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - failOn: Graph contains forbidden nodes:
         * Constraint 1: "(\\d+(\\s){2}(Call.*Java.*)+(\\s){2}===.*slowGet )"
           - Matched forbidden node:
             * 501  CallStaticJava  === 370 6 7 8 1 (648 1 1 1 1 1 ) [[ 502 503 504 ]] # Static  java.lang.ScopedValue::slowGet 


`compiler/c2/irTests/TestScopedValue.java` fails with `-XX:TypeProfileLevel=222` on AArch64:


# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/System/Volumes/Data/mesos/work_dir/slaves/0db9c48f-6638-40d0-9a4b-bd9cc7533eb8-S29331/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/e2db05c4-923c-4a63-923b-5f9870681cc5/runs/c74e986d-15f1-46dc-822b-a41d12c079e0/workspace/open/src/hotspot/share/opto/callGenerator.cpp:929), pid=44590, tid=26115
#  Error: assert(in->Opcode() == Op_LoadP || in->Opcode() == Op_LoadN) failed

Current CompileTask:
C2:766  689    b  4       compiler.c2.irTests.TestScopedValue::testFastPath1 (30 bytes)

Stack: [0x00000001719ec000,0x0000000171bef000],  sp=0x0000000171beb0c0,  free space=2044k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x1130268]  VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x564  (callGenerator.cpp:929)
V  [libjvm.dylib+0x1130a88]  VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*)+0x0
V  [libjvm.dylib+0x5618b0]  print_error_for_unit_test(char const*, char const*, char*)+0x0
V  [libjvm.dylib+0x396ea0]  LateInlineScopedValueCallGenerator::process_result(GraphKit&)+0x2534
V  [libjvm.dylib+0x38f8dc]  CallGenerator::do_late_inline_helper()+0x660
V  [libjvm.dylib+0x4cd2bc]  Compile::inline_scoped_value_calls(PhaseIterGVN&)+0x570
V  [libjvm.dylib+0x4c6944]  Compile::Optimize()+0x210
V  [libjvm.dylib+0x4c54bc]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1228
V  [libjvm.dylib+0x38a590]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1e0
V  [libjvm.dylib+0x4e2f48]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x854
V  [libjvm.dylib+0x4e238c]  CompileBroker::compiler_thread_loop()+0x348
V  [libjvm.dylib+0x8bb170]  JavaThread::thread_main_inner()+0x1dc
V  [libjvm.dylib+0x1076548]  Thread::call_run()+0xf4
V  [libjvm.dylib+0xe39138]  thread_native_entry(Thread*)+0x138
C  [libsystem_pthread.dylib+0x726c]  _pthread_start+0x94


`compiler/c2/irTests/TestScopedValue.java` fails with `-XX:+UnlockDiagnosticVMOptions -XX:TieredStopAtLevel=3 -XX:+StressLoopInvariantCodeMotion -XX:+StressRangeCheckElimination -XX:+StressLinearScan` on AArch64:


compiler.lib.ir_framework.shared.TestRunException: There was an error while invoking @Run method private void compiler.c2.irTests.TestScopedValue.testFastPath1Runner() throws java.lang.Exception
	at compiler.lib.ir_framework.test.CustomRunTest.invokeTest(CustomRunTest.java:162)
	at compiler.lib.ir_framework.test.CustomRunTest.run(CustomRunTest.java:87)
	at compiler.lib.ir_framework.test.TestVM.runTests(TestVM.java:822)
	at compiler.lib.ir_framework.test.TestVM.start(TestVM.java:249)
	at compiler.lib.ir_framework.test.TestVM.main(TestVM.java:164)
Caused by: java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:118)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at compiler.lib.ir_framework.test.CustomRunTest.invokeTest(CustomRunTest.java:159)
	... 4 more
Caused by: java.lang.RuntimeException: should be compiled
	at compiler.c2.irTests.TestScopedValue.testFastPath1Runner(TestScopedValue.java:87)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	... 6 more


`compiler/c2/TestUnsignedByteCompare.java` and `compiler/codegen/TestSignedMultiplyLong.java` fail intermittently with `-Duse.JTREG_TEST_THREAD_FACTORY=Virtual -XX:-VerifyContinuations` on Windows x64:


# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/System/Volumes/Data/mesos/work_dir/slaves/0db9c48f-6638-40d0-9a4b-bd9cc7533eb8-S29331/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/e2db05c4-923c-4a63-923b-5f9870681cc5/runs/c74e986d-15f1-46dc-822b-a41d12c079e0/workspace/open/src/hotspot/share/opto/compile.cpp:813), pid=29127, tid=26371
#  assert(IncrementalInline || (_late_inlines.length() == 0 && !has_mh_late_inlines())) failed: incremental inlining is off

Current CompileTask:
C2:5797 3627    b        java.lang.System$2::scopedValueCache (4 bytes)

Stack: [0x0000000171694000,0x0000000171897000],  sp=0x0000000171894bc0,  free space=2050k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x1130268]  VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x564  (compile.cpp:813)
V  [libjvm.dylib+0x1130a88]  VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*)+0x0
V  [libjvm.dylib+0x5618b0]  print_error_for_unit_test(char const*, char const*, char*)+0x0
V  [libjvm.dylib+0x4c5794]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1500
V  [libjvm.dylib+0x38a590]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1e0
V  [libjvm.dylib+0x4e2f48]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x854
V  [libjvm.dylib+0x4e238c]  CompileBroker::compiler_thread_loop()+0x348
V  [libjvm.dylib+0x8bb170]  JavaThread::thread_main_inner()+0x1dc
V  [libjvm.dylib+0x1076548]  Thread::call_run()+0xf4
V  [libjvm.dylib+0xe39138]  thread_native_entry(Thread*)+0x138
C  [libsystem_pthread.dylib+0x726c]  _pthread_start+0x94


Just let me know if you need any more information.

-------------

Changes requested by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16966#pullrequestreview-1766883038

