RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v13]

Cesar Soares Lucas cslucas at openjdk.org
Fri May 26 17:46:03 UTC 2023


On Tue, 23 May 2023 17:19:23 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>>> I verified that the new test cases do trigger SR+NSR scenario.
>>> 
>>> How do you test that deoptimization works as expected?
>>> 
>> 
>> I have a copy of the tests in AllocationMergesTests.java in a separate file (not included in this PR) and I run the tests with a tool that compares the output of the test with RAM enabled and disabled. So, the way I test that deoptimization worked is basically just making sure the tests that "deoptimize" have the same output with RAM enabled and disabled.
>> 
>>> Diagnostic output is still hard to read. On one hand, it's too verbose when it comes to PcDesc/ScopeDesc sections ("pc-bytecode offsets" and "scopes") in nmethod output (enabled either w/ `-XX:+PrintAssembly` or `-XX:CompileCommand=print,...`). On the other hand, it lacks some important details, like `selector` and `merge_ptr` location information which is essential to make sense of debug information at a safepoint in the code.
>>> 
>> 
>> I'll take care of that. I was testing only with PrintDebugInfo.
>> 
>>> FTR `_skip_rematerialization` flag is unused now.
>>> 
>> 
>> yeah, I forgot to remove that. Thanks.
>> 
>>> Speaking of `_only_merge_candidate` flag, I find it easier about the code when the property being tracked is whether the `ObjectValue` is referenced from corresponding JVM state or not. (Maybe call it `is_root()`?) So, `ScopeDesc::objects_to_rematerialize()` would skip everything not referenced from JVM state, but then unconditionally accept anything returned by `ObjectMergeValue::select()` which doesn't need to adjust the flag before returning selected object. Also, it's safer to track the flag status for every `ObjectValues`, even for `ObjectMergeValue`.
>>> 
>> 
>> Sounds like a good idea. I'll do that. Thanks.
>> 
>>> Are you sure there's no way to end up with nested `ObjectMergeValue`s in presence of iterative EA?
>> 
>> I don't think so. This current patch only handle Phis that don't have NULL as input. As part of the reduction process we set at least one of the reducible Phi inputs to NULL. Therefore, subsequent iterations of EA won't reduce the same Phi.
>
>> So, the way I test that deoptimization worked is basically just making sure the tests that "deoptimize" have the same output with RAM enabled and disabled.
> 
> Please, enhance `AllocationMergesTests` to cover deoptimization (e.g., using WhiteBox API or additional run w/ -XX:+DeoptimizeALot) and ensure that tests are sensitive enough to fail when wrong state is rematerialized.

Hi @iwanowww - I pushed some changes to address your latest feedback.

> Please, enhance AllocationMergesTests to cover deoptimization (e.g., using WhiteBox API or additional run w/ -XX:+DeoptimizeALot) and ensure that tests are sensitive enough to fail when wrong state is rematerialized.

I added the "+DeoptimizeALot" flag on the tests execution. I also refactored the tests so that each test is executed in the Interpreter and in C2 with the same parameters so that we can confirm that result of the test with RAM enabled is correct.

> Please, add asserts to catch such situation and a check which bails out compilation (triggering recompilation w/ ReduceAllocationMerges turned off) if it happens with product binaries.

I added a new static method `ConnectionGraph::verify_ram_nodes` that does some verification in the inputs and users of RAM nodes. I decided to call the method after each iteration of EA->IGVN->MacroNodeElimination so that we also check that IGVN or `eliminate_macro_nodes` transformations didn't mess with RAM nodes.

> I didn't propose exactly that, but I like your idea. I'm not against having it cached on ScopeValue side (and serialized in debug info), but implementing it as a query on ScopeDesc does look like a better alternative. [...]

I ended up implementing this a little bit different from what I mentioned earlier. I had some problems with the approach that I described before... In current approach I set the `is_root` flag of ObjectValue's right before the object pool is serialized in output.cpp.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1564718582


More information about the hotspot-dev mailing list