RFR: 8286302: Port JEP 425 to PPC64 [v7]

Leonid Mesnik lmesnik at openjdk.org
Tue Nov 22 02:51:28 UTC 2022


On Mon, 21 Nov 2022 13:59:54 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> Hi,
>> 
>> this is the port of [JEP 425: Virtual Threads (Preview)](https://openjdk.org/jeps/425)) to PPC64.
>> More precisely it is the port of vm continuations in hotspot to PPC64. It allows to run with `-XX:+VMContinuations` which is a prerequisit for 'real' virtual threads (oxymoron?).
>> 
>> Most of the shared code changes are related to a new platform dependent constant `frame::metadata_words_at_top`. It is either added or subtracted to a frame address or size.
>> 
>> The following is supposed to explain (without real life details) why it is needed in addition to the existing `frame::metadata_words`. The diagram shows a frame at `SP` and its stack arguments. The caller frame is located at `callers_SP`.
>> 
>> 
>>   X86 / AARCH64                             PPC64:
>> 
>>   :                 :                       :                 :
>>   :                 :                       :                 :
>>   |                 |                       |                 |
>>   |-----------------|                       |-----------------|
>>   |                 |                       |                 |
>>   | stack arguments |                       | stack arguments |
>>   |                 |<- callers_SP          |                 |
>>   ===================                       |-----------------|
>>   |                 |                       |                 |
>>   | metadata at bottom |                       | metadata at top    |
>>   |                 |                       |                 |<- callers_SP
>>   |-----------------|                       ===================
>>   |                 |                       |                 |
>>   |                 |                       |                 |
>>   |                 |                       |                 |
>>   |                 |                       |                 |
>>   |                 |<- SP                  |                 |
>>   ===================                       |-----------------|
>>                                             |                 |
>>                                             | metadata at top    |
>>                                             |                 |<- SP
>>                                             ===================
>> 
>> 
>> On X86 and AARCH64 metadata (that's return address, saved SP/FP, etc.) is stored at the frame bottom (`metadata at bottom`). On PPC64 it is stored at the frame top (`metadata at top`) where it affects size and address calculations. Shared code deals with this by making use of the platform dependent constant `frame::metadata_words_at_top`.
>> 
>> * size required to 'freeze' a single frame with its stack arguments in a `StackChunk`:
>>   `sizeof(frame) + sizeof(stack arguments) + frame::metadata_words_at_top`
>> 
>> * address of stack arguments:
>>   `callers_SP + frame::metadata_words_at_top`
>> 
>> * The value of `frame::metadata_words_at_top` is 0 words on X86 and AARCH64. On PPC64 it is 4 words.
>> 
>> Please refer to comments I've added for more details (e.g. the comment on StackChunkFrameStream<frame_kind>::frame_size()). Recently I've given a talk about vm continuations and the PPC64 port to my colleagues. It's rather an overview than a deep dive. [The slides](http://cr.openjdk.java.net/~rrich/webrevs/2022/8286302/202210_loom_ppc64_port.pdf) might serve as well as an intro to the matter.
>> 
>> The pr includes the new test jdk/jdk/internal/vm/Continuation/BasicExp.java which I wrote while doing the port. The test cases vary from simple to not-so-simple. One of the main features is that a CompilationPolicy passed as argument controls which frames are compiled/interpreted when freezing by defining a sliding window of compiled / interpreted frames which produces interesting transitions with and without stack arguments. There is overlap with Basic.java and Fuzz.java. Let me know wether to remove or keep BasicExp.java. Runtime with fastdebug: 2m on Apple M1 Pro and 3m on Intel Xeon E5-2660 v3 @ 2.60GHz. Note that -XX:+VerifyContinuations is explicitly set as a found it very useful, it increases the runtime quite a bit though.
>> 
>> Testing: the change passed our CI testing: most JCK and JTREG tests, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite and SAP specific tests with fastdebug and release builds on the standard platforms plus PPC64. These tests do include hotspot_loom and jdk_loom JTREG tests which I've also run with TEST_VM_OPTS="-XX:+VerifyContinuations" on X86_64, PPC64, and AARCH64.
>> 
>> Thanks, Richard.
>
> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Cleanup BasicExp test

Not a complete review but some notes about test style.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 129:

> 127:  * @summary Collection of basic continuation tests. CompilationPolicy controls which frames in a sequence should be compiled when calling Continuation.yield().
> 128:  * @requires vm.continuations
> 129:  * @requires vm.flavor == "server" & (vm.opt.TieredCompilation == null | vm.opt.TieredCompilation == false)

Isn't vm.opt.TieredCompilation != true better?

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 136:

> 134:  * @run driver jdk.test.lib.helpers.ClassFileInstaller jdk.test.whitebox.WhiteBox
> 135:  *
> 136:  * @run main/othervm/timeout=300   --enable-preview -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -Xbootclasspath/a:.

It is better to use tag @enablePreview so --enable-preview is not needed.
Also, please remove -XX:+UnlockDiagnosticVMOptions where it is not required.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 193:

> 191:             compLevel = CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION;
> 192:             // // Run tests with C1 compilations
> 193:             // compLevel = CompilerWhiteBoxTest.COMP_LEVEL_FULL_PROFILE;

Any reasons why the test is executed with C2 only? And is -XX:-TieredCompilation required if optimization level and compilation are controlled with WB?

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 224:

> 222:         compPolicy.print(); System.out.println();
> 223: 
> 224:         new ContinuationRunYieldRunTest().runTestCase(3, compPolicy);

I think it would be better to split the test into several smaller testcases so timeout=300 is not needed. The fine granularity allows better parallelization of execution. The 5 minutes timeout means a very long-time test that couldn't be split.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 369:

> 367:         }
> 368: 
> 369:         public void log_dontjit() {

Please use camelCase for all functions.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 484:

> 482:                     log_dontjit("Exc: " + e);
> 483:                 }
> 484:                 if (callSystemGC) System.gc();

It would be better to call WB.fullGC() not System,gc() to ensure that GC is called.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 504:

> 502:         }
> 503: 
> 504:         static final long i1=1; static final long i2=2; static final long i3=3;

Please fix identation.

test/jdk/jdk/internal/vm/Continuation/BasicExp.java line 729:

> 727:             log_dontjit("Continuation running on thread " + Thread.currentThread());
> 728:             long res = ord101_recurse_dontinline(0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11);
> 729:             if (res != i1+i2+i3+i4+i5+i6+i7+i8+i9+i10+i11) {

Please fix the indentation. There are  several places were 'i1+i2' should be fixed to 'i1 + i2 '.

-------------

Changes requested by lmesnik (Reviewer).

PR: https://git.openjdk.org/jdk/pull/10961


More information about the hotspot-compiler-dev mailing list