From duke at openjdk.java.net  Tue Feb  1 03:43:34 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 03:43:34 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent.
Message-ID: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>

8251505: Use of types in compiler shared code should be consistent.

-------------

Commit messages:
 - Remove JFR changes
 - Make the type of methodReclaimedCount consistent
 - Minor formatting
 - Use signed integers
 - 8251505: Use of types in compiler shared code should be consistent.

Changes: https://git.openjdk.java.net/jdk/pull/7294/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8251505
  Stats: 33 lines in 9 files changed: 2 ins; 0 del; 31 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From stuefe at openjdk.java.net  Tue Feb  1 05:57:10 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 1 Feb 2022 05:57:10 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files
In-Reply-To: <fa97b38d-44a7-dec7-503b-f0f29bcafcd7@oracle.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
 <fa97b38d-44a7-dec7-503b-f0f29bcafcd7@oracle.com>
Message-ID: <KaUCyMZ1dd4fp8zmUN9gJS_0_jOZWT21w6t_xQqB--E=.0bcd5768-660f-4ab5-a9a1-73b929d54ba3@github.com>

On Mon, 31 Jan 2022 22:12:30 GMT, David Holmes <david.holmes at oracle.com> wrote:

> > That's a valid concern. I've also asked myself this question when I had initially started using some assertions. We should not crash again during error reporting. I've therefore tried to be as conservative as possible and added bailouts instead, also in loops when reading data. But of course, this is just a best effort and by no means a guarantee to be safe (especially in terms of crashes). What could be alternatives to make this better?
> 
> If the parsing code turns out to be very problematic in a signal handling context, then we could disable it in that context. So we really want to try and do a lot of testing by throwing random signals at the VM and see what breaks.
> 

Source information in hs-err file stacks can be tremendously useful. Let's try the retry-callstack-dumping without features idea in case of a secondary crash, outlined above, first.

> > > Secondly, on the same issue the use of unified logging within this code seems even more likely to be problematic - I'm not aware of us currently using UL during error reporting. It may work in basic usecases but if it triggers logfile rotation or other more complex actions what then?
> > 
> > 
> > I haven't thought about this before. To be honest, I think UL printing of the `dwarf` tag is only useful during development when adding something new to the parser or when debugging. I don't see much value of these messages otherwise - even less for a Java user. As a first step, I could change the logs from `log_X()` to `log_develop_X()` but that just shifts the problem to non-product builds. Another option (or additional thing) could be to guard the log messages with a new develop flag that's disabled by default. By setting it for development, we accept that it might be unsafe which should be fine.
> 
> I think changing the logging to develop only is a reasonable step. I don't see logging of crash handling / error reporting as generally useful for the end user.

I think the right way to go long term would be to give us a minimalistic safe logging API for these cases (signal handling, pre-initialization) or make UL safe to use always.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From thartmann at openjdk.java.net  Tue Feb  1 08:29:38 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Tue, 1 Feb 2022 08:29:38 GMT
Subject: [jdk18] RFR: 8278871: [JVMCI] assert((uint)reason < 2*
 _trap_hist_limit) failed: oob
Message-ID: <Y0-PDD8xKa5lCZ-SMWan5r3MyZeD4shbLdyAdGHK7Uo=.1a1f1561-54ec-456e-8107-c10083ddfbd4@github.com>

Backport of [JDK-8278871](https://bugs.openjdk.java.net/browse/JDK-8278871). Applies cleanly.

-------------

Commit messages:
 - 8278871: [JVMCI] assert((uint)reason < 2* _trap_hist_limit) failed: oob

Changes: https://git.openjdk.java.net/jdk18/pull/114/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=114&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8278871
  Stats: 21 lines in 5 files changed: 9 ins; 4 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk18/pull/114.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/114/head:pull/114

PR: https://git.openjdk.java.net/jdk18/pull/114

From stuefe at openjdk.java.net  Tue Feb  1 08:58:27 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 1 Feb 2022 08:58:27 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend
Message-ID: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>

NMT backend can be further simplified and cleaned out.

- some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
- `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
- when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
- Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
- `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
- In some places we handle inputptr=NULL gracefully where we should assert instead
- Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
- MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
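
To illustrate the marker idea from the list above, here is a minimal sketch of packing a 16-bit bucket index and a 16-bit chain position into one 32-bit value. The helper names are hypothetical and chosen for this example only; they are not taken from the patch, and the actual bit layout may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not the actual HotSpot code): pack a 16-bit bucket
// index and a 16-bit chain position into one opaque 32-bit marker.
inline uint32_t make_marker(uint16_t bucket_idx, uint16_t pos_idx) {
  // Bucket index goes into the upper half, chain position into the lower half.
  return (static_cast<uint32_t>(bucket_idx) << 16) | pos_idx;
}

// Only the table itself would need these two decoders.
inline uint16_t bucket_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker >> 16);
}

inline uint16_t pos_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker & 0xFFFFu);
}
```

Callers outside the table would pass the `uint32_t` around unchanged and never decode it, which is what keeps it opaque.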

Testing:

- GHAs
- manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
- SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.

Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.

-------------

Commit messages:
 - pp should handle NULL correctly
 - remove mostly unused MallocTracker accessors for header members
 - Remove use of NMT level and simplify malloc+realloc+free
 - dumb down malloc header
 - mst bucket+pos=marker
 - remove malloc_base

Changes: https://git.openjdk.java.net/jdk/pull/7283/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7283&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281015
  Stats: 266 lines in 10 files changed: 49 ins; 147 del; 70 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7283.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7283/head:pull/7283

PR: https://git.openjdk.java.net/jdk/pull/7283

From stuefe at openjdk.java.net  Tue Feb  1 09:05:33 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 1 Feb 2022 09:05:33 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not work
Message-ID: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>

JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.

On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:


(gdb) call pp(0x7ffff010b030)

"Executing pp"

Thread 2 "java" received signal SIGSEGV, Segmentation fault.
0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this@entry=0x7ffff5bb2620, addr=addr@entry=0x0, size=size@entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
180 assert(addr != NULL, "Invalid address");


On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:


if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {


This patch:
- fixes uses of SafeFetch
- changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
- adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
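
The SafeFetch problem described above can be shown with a small stand-in. This is only a sketch: `mock_safe_fetch` is a hypothetical replacement for `SafeFetchN` that treats only a null pointer as unreadable, and the two-probe check is one common way to tell "unreadable" apart from "readable but holds the error value". It is not necessarily what this patch does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SafeFetchN, for illustration only: returns *p if
// p is "readable", otherwise err_value. Here only nullptr counts as
// unreadable; the real function recovers from the fault instead.
inline intptr_t mock_safe_fetch(const intptr_t* p, intptr_t err_value) {
  return (p == nullptr) ? err_value : *p;
}

// Flawed check: a readable location that holds 0 is reported as unreadable.
inline bool is_readable_flawed(const intptr_t* p) {
  return mock_safe_fetch(p, 0) != 0;
}

// Probe with two different error values. A readable location cannot equal
// both, so only an unreadable one fails both comparisons.
inline bool is_readable(const intptr_t* p) {
  const intptr_t err1 = 0x1717;
  const intptr_t err2 = 0x7171;
  return mock_safe_fetch(p, err1) != err1 ||
         mock_safe_fetch(p, err2) != err2;
}
```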

Tests:
- I tested manually that pp works with different levels of NMT (Linux x64)
- GHAs in process

-------------

Commit messages:
 - Fix NMT integration into pp() debug command

Changes: https://git.openjdk.java.net/jdk/pull/7297/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7297&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281023
  Stats: 28 lines in 3 files changed: 9 ins; 3 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7297.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7297/head:pull/7297

PR: https://git.openjdk.java.net/jdk/pull/7297

From tschatzl at openjdk.java.net  Tue Feb  1 09:18:12 2022
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Feb 2022 09:18:12 GMT
Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes
In-Reply-To: <SDUK8k1OZZU7qha8dERVEj34cNap1YPEqEv92b1hCxw=.908aecb5-b941-48c6-a690-49690f2359ad@github.com>
References: <SDUK8k1OZZU7qha8dERVEj34cNap1YPEqEv92b1hCxw=.908aecb5-b941-48c6-a690-49690f2359ad@github.com>
Message-ID: <I1ul07k2nIPIc5g-izPNQD9CGM8LXlBGtWDx84fjWS0=.08738fb2-4898-42e6-969c-4b90b77f7247@github.com>

On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to the HotSpot Style Guide change process.
> 
> The current process involves gathering consensus among the HotSpot Group
> Members.  That's fine for changes of substance.  But it seems overly weighty
> for editorial changes that don't affect the substance of the guide, but only
> its clarity or accuracy.
> 
> The proposed change would permit the normal PR process to be used for such
> changes, but require the requisite reviewers to additionally be HotSpot Group
> Members.
> 
> Note that there have already been a couple of changes that effectively
> followed the proposed new process.
> https://bugs.openjdk.java.net/browse/JDK-8274169
> https://bugs.openjdk.java.net/browse/JDK-8280182
> 
> This is a modification of the Style Guide, so rough consensus among the
> HotSpot Group members is required to make this change. Only Group members
> should vote for approval (via the github PR), though reasoned objections or
> comments from anyone will be considered. A decision on this proposal will not
> be made before Monday 14-Feb-2022 at 12h00 UTC.
> 
> Since we're piggybacking on github PRs here, please use the PR review process
> to approve (click on Review Changes > Approve), rather than sending a "vote:
> yes" email reply that would be normal for a CFV.

Marked as reviewed by tschatzl (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7281

From duke at openjdk.java.net  Tue Feb  1 11:09:20 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Tue, 1 Feb 2022 11:09:20 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
Message-ID: <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>

On Mon, 31 Jan 2022 22:35:35 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix popframe failures
>
> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 429:
> 
>> 427: #else
>> 428:     warning("UseROPProtection specified, but not supported in the VM.");
>> 429: #endif
> 
> If we issue these warnings should `_rop_protection` still be set true?

As per this conversation: https://github.com/openjdk/jdk/pull/6334#discussion_r791722292

The idea was, the user is explicitly asking for pac-ret so we should honour that. Whereas standard would only enable what is supported for that system.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Tue Feb  1 12:15:11 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Tue, 1 Feb 2022 12:15:11 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent.
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <vpP-hHvxVGTb9XVp3oRH-cAPpT7caar1di-gSUzZrCY=.da7a5c2c-b4a1-4c51-9452-4d88b1aaa35d@github.com>

On Tue, 1 Feb 2022 03:35:13 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

> 8251505: Use of types in compiler shared code should be consistent.

Changes requested by eastig at github.com (no known OpenJDK username).

src/hotspot/share/utilities/elfFile.cpp line 94:

> 92: }
> 93: 
> 94: bool FileReader::set_position(int64_t offset) {

You introduce a bug here.
`fseek` declaration:

int fseek ( FILE * stream, long int offset, int origin );


`fseek` will use only 32 bits of `offset` if `long` is 32 bits wide (`sizeof(long) == 4`).
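
One way to guard against this, sketched here with hypothetical helper names rather than the actual `FileReader` code, is to check that the 64-bit offset fits into `long` before handing it to `fseek`, and to fail instead of seeking to a truncated position:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstdio>

// True if offset can be passed to fseek() as a long without losing bits.
inline bool fits_in_long(int64_t offset) {
  return offset >= LONG_MIN && offset <= LONG_MAX;
}

// Hypothetical wrapper: refuse to seek rather than silently truncate.
inline bool set_position_checked(FILE* file, int64_t offset) {
  if (file == nullptr || !fits_in_long(offset)) {
    return false;
  }
  return fseek(file, static_cast<long>(offset), SEEK_SET) == 0;
}
```

On platforms where `long` is 64 bits wide the range check always passes; it only rejects offsets where `long` is narrower than `int64_t`.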

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From dholmes at openjdk.java.net  Tue Feb  1 12:45:12 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 1 Feb 2022 12:45:12 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
Message-ID: <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>

On Tue, 1 Feb 2022 11:05:46 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 429:
>> 
>>> 427: #else
>>> 428:     warning("UseROPProtection specified, but not supported in the VM.");
>>> 429: #endif
>> 
>> If we issue these warnings should `_rop_protection` still be set true?
>
> As per this conversation: https://github.com/openjdk/jdk/pull/6334#discussion_r791722292
> 
> The idea was, the user is explicitly asking for pac-ret so we should honour that. Whereas standard would only enable what is supported for that system.

But we can't honour that because it is not supported. Further, the suggestion in the referenced discussion seemed to be based on the assumption that doing so would be harmless because it is NOP based, but you have indicated that may not be the case and so it may actually lead to a crash!

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From mdoerr at openjdk.java.net  Tue Feb  1 13:30:40 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 1 Feb 2022 13:30:40 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
Message-ID: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>

PPC64 implementation of JDK-8277180.

-------------

Commit messages:
 - 8281043: Intrinsify recursive ObjectMonitor locking for PPC64

Changes: https://git.openjdk.java.net/jdk/pull/7305/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7305&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281043
  Stats: 21 lines in 1 file changed: 8 ins; 3 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7305.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7305/head:pull/7305

PR: https://git.openjdk.java.net/jdk/pull/7305

From duke at openjdk.java.net  Tue Feb  1 13:47:12 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Tue, 1 Feb 2022 13:47:12 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
Message-ID: <igLdlFO5Kqt5hIOJuHH43frWNFF3HT6o6zfxEspSiZs=.4f03831b-26be-42ba-b2a7-591034105cb2@github.com>

On Tue, 1 Feb 2022 12:42:26 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> As per this conversation: https://github.com/openjdk/jdk/pull/6334#discussion_r791722292
>> 
>> The idea was, the user is explicitly asking for pac-ret so we should honour that. Whereas standard would only enable what is supported for that system.
>
> But we can't honour that because it is not supported. Further, the suggestion in the referenced discussion seemed to be based on the assumption that doing so would be harmless because it is NOP based, but you have indicated that may not be the case and so it may actually lead to a crash!

Before I change anything - @theRealAph you had an opinion here too...

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From hseigel at openjdk.java.net  Tue Feb  1 14:13:51 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:13:51 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <hp07hAXZ6rztsfy8cpCMQg45z6bE8nwvM3m6Xivt6a8=.1b32583c-0620-4dd0-9ce1-31840b5e201a@github.com>

> Please review this new attempt to resolve JDK-8214976.  This fix adds Pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because it is cleaner than adding PRAGMAs and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  changes to address some review comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7248/files
  - new: https://git.openjdk.java.net/jdk/pull/7248/files/8cc29ebe..ca2097e4

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=00-01

  Stats: 49 lines in 8 files changed: 3 ins; 25 del; 21 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7248.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb  1 14:13:57 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:13:57 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
Message-ID: <k7XEln_td2bO6CTDhqvQtmb1uO_FuaiqrYJMZGVe98U=.efa3dabc-f6ff-480a-9db4-f339e8c0ab16@github.com>

On Fri, 28 Jan 2022 04:55:48 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   changes to address some review comments
>
> src/hotspot/share/runtime/os.hpp line 533:
> 
>> 531:   // platforms that support such things.  This calls shutdown() and then aborts.
>> 532:   static void abort(bool dump_core, void *siginfo, const void *context);
>> 533:   static void abort(bool dump_core);
> 
> I don't understand why the change to the default arg was needed. There should be no conflict between `os::abort()` and `::abort()`.

I reverted the abort() changes.  Thanks for correcting this.

> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 97:
> 
>> 95: FORBID_C_FUNCTION(FILE*    fopen(const char*, const char*),           "use os::fopen");
>> 96: FORBID_C_FUNCTION(int      fsync(int),                                "use os::fsync");
>> 97: FORBID_C_FUNCTION(int      ftruncate(int, off_t),                     "use os::ftruncate");
> 
> Shouldn't this be ftruncate for BSD and ftruncate64 for other Posix (not sure what Windows has)?

Platform agnostic code would call ftruncate(), not ftruncate64().  So I think this is correct as is.

> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 99:
> 
>> 97: FORBID_C_FUNCTION(int      ftruncate(int, off_t),                     "use os::ftruncate");
>> 98: FORBID_C_FUNCTION(void     funlockfile(FILE *),                       "use os::funlockfile");
>> 99: FORBID_C_FUNCTION(off_t    lseek(int, off_t, int),                    "use os::lseek");
> 
> Similarly there should be a lseek64 definition too.

Like ftruncate(), platform agnostic code would call lseek(), not lseek64().  So I think this is correct as is.

> src/hotspot/share/utilities/ostream.cpp line 615:
> 
>> 613: 
>> 614: PRAGMA_DIAG_PUSH
>> 615: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(write);
> 
> Why do we not call os::write here?

fixed

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb  1 14:13:52 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:13:52 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <y1O-MGq93IdG7gQbFFY5soZQTrzdOdsrXR4vodD-Dg0=.11934586-fd78-4781-9fbc-fd218103c767@github.com>

On Thu, 27 Jan 2022 19:18:10 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this new attempt to resolve JDK-8214976.  This fix adds Pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because it is cleaner than adding PRAGMAs and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold

The second commit contains minor changes in response to review comments.  The changes include removing unneeded "os::" from os_*.cpp files, reverting all changes to os_aix.cpp, reverting changes to abort(), removing "#include <dirent.h>" and related functions from compilerWarnings_gcc.hpp, and changing "::write()" to "os::write()" in ostream.cpp.

This update does not address bigger issues such as structure and placement concerns and whether or not to do this change at all.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb  1 14:13:53 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:13:53 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <MPA-R00PtUfcCA8VMSferiL1zp16MSgcJ8UaH18ohn8=.93b4ea2b-452f-4745-b936-9dbf9deac45a@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
 <MPA-R00PtUfcCA8VMSferiL1zp16MSgcJ8UaH18ohn8=.93b4ea2b-452f-4745-b936-9dbf9deac45a@github.com>
Message-ID: <lA0oCSAMrTQ2jSBQqDLIJ0k7YXxmL0iVvSW_NETNpk0=.779fde95-1853-4efd-b914-79baa7c653e8@github.com>

On Fri, 28 Jan 2022 19:32:20 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> src/hotspot/os/aix/os_aix.cpp line 2499:
>> 
>>> 2497:   struct dirent *ptr;
>>> 2498: 
>>> 2499:   dir = os::opendir(path);
>> 
>> Just to clarify, as we are in the scope of the os class both `opendir` and `os::opendir` are the same thing here - and similarly for other code in the os class - right?
>
> Yes, that's correct.  So an unqualified opendir here should not trigger a forbidden warning.

I removed "os::" from the os class files.
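
For readers following the scoping point above, a small stand-alone sketch shows why the qualifier is redundant inside the class. The names `open_thing` and `my_os` are hypothetical stand-ins for a C library function and the `os` class; this is not HotSpot code.

```cpp
#include <cassert>

// Stand-in for a C library function at global scope (hypothetical name).
inline int open_thing() { return 1; }

// Stand-in for the os class (hypothetical name).
class my_os {
 public:
  static int open_thing() { return 2; }

  // Unqualified lookup inside the class finds the member first.
  static int unqualified() { return open_thing(); }

  // The explicit class qualifier resolves to the same member.
  static int qualified() { return my_os::open_thing(); }

  // Reaching the global function from here needs the leading "::".
  static int global() { return ::open_thing(); }
};
```

Since the unqualified call inside the class never reaches the global function, it would not be expected to trigger a forbidden-function warning, which matches the explanation quoted above.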

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb  1 14:13:59 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:13:59 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <A8HCSM3ayIKy_c-4D1_uoWkEO92-H9SsPib3JoL-snU=.f88d4bd5-90f4-4cfe-9a54-3bce73a05521@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
 <MPA-R00PtUfcCA8VMSferiL1zp16MSgcJ8UaH18ohn8=.93b4ea2b-452f-4745-b936-9dbf9deac45a@github.com>
 <A8HCSM3ayIKy_c-4D1_uoWkEO92-H9SsPib3JoL-snU=.f88d4bd5-90f4-4cfe-9a54-3bce73a05521@github.com>
Message-ID: <dI0Yji28R681uqH_qnKmcPY28p0bSVqvVjqvPPAXRF0=.0bbdce24-4da4-4833-a1b4-f395e891d096@github.com>

On Sat, 29 Jan 2022 07:09:18 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> We only compile AIX with xclang these days.  I don't know how our "xlc" compiler platform mechanism interacts with our "gcc" (which is really both gcc and clang) compiler platform, or if it interacts, or if it should.  But none of that matters for the dirent.h problem.  The problem there is that it's a system header, irrespective of what compiler is being used, and it has this problem.  So whether we need this NULL cruft here depends on whether AIX with xclang uses this file or not.  One option would be to just not deal with the dirent stuff yet, saving that for a followup focused on that problem.
>
> Sorry, I'm confused. We build AIX with xlc. I don't believe we even include this file on AIX. How does this help AIX?

I removed the changes for the dirent functions and removed the above code.  I also reverted all changes to os_aix.cpp.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb  1 14:14:00 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 1 Feb 2022 14:14:00 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <ccy8hD10bNUW4vDLsI8a_EsAbgBa24ckIMIjh8RGhbs=.42ad99f3-bf97-4e68-8ca4-ee4e58e45d75@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <ccy8hD10bNUW4vDLsI8a_EsAbgBa24ckIMIjh8RGhbs=.42ad99f3-bf97-4e68-8ca4-ee4e58e45d75@github.com>
Message-ID: <vFXg6jSemPAx7fmWH0SBHmnZuS78Tf8xfk6un0vdHNI=.8452c16d-2454-43dc-a66b-6aa0fa8e6769@github.com>

On Fri, 28 Jan 2022 19:33:21 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   changes to address some review comments
>
> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 114:
> 
>> 112: 
>> 113: #define FORBID_C_FUNCTION(signature, alternative)
>> 114: #define PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(name)
> 
> These aren't needed.  The default empty definitions in compilerWarnings.hpp cover this case.

fixed

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From chagedorn at openjdk.java.net  Tue Feb  1 14:56:10 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Tue, 1 Feb 2022 14:56:10 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files
In-Reply-To: <fa97b38d-44a7-dec7-503b-f0f29bcafcd7@oracle.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
 <fa97b38d-44a7-dec7-503b-f0f29bcafcd7@oracle.com>
Message-ID: <E2hfGlG_BXtlyvoS-0ijMibFTv9aMd2L0_eKg3UD_4A=.6e26b041-9d95-4dca-a677-cf25dc32b047@github.com>

On Mon, 31 Jan 2022 22:12:30 GMT, David Holmes <david.holmes at oracle.com> wrote:

> Hi Christian,
> 
> Sorry for the delay in coming back to this, I wanted to see what other feedback arose.

No problem, thanks for your feedback David!

> > > That's a valid concern. I've also asked myself this question when I had initially started using some assertions. We should not crash again during error reporting. I've therefore tried to be as conservative as possible and added bailouts instead, also in loops when reading data. But of course, this is just a best effort and by no means a guarantee to be safe (especially in terms of crashes). What could be alternatives to make this better?
> > 
> > 
> > If the parsing code turns out to be very problematic in a signal handling context, then we could disable it in that context. So we really want to try and do a lot of testing by throwing random signals at the VM and see what breaks.
> 
> Source information in hs-err file stacks can be tremendously useful. Let's try the retry-callstack-dumping without features idea in case of a secondary crash, outlined above, first.

Should we still handle that in a separate RFE later or should this go along with this patch/prerequisite? What do you think?

> > > > Secondly, on the same issue the use of unified logging within this code seems even more likely to be problematic - I'm not aware of us currently using UL during error reporting. It may work in basic usecases but if it triggers logfile rotation or other more complex actions what then?
> > > 
> > > 
> > > I haven't thought about this before. To be honest, I think UL printing of the `dwarf` tag is only useful during development when adding something new to the parser or when debugging. I don't see much value of these messages otherwise - even less for a Java user. As a first step, I could change the logs from `log_X()` to `log_develop_X()` but that just shifts the problem to non-product builds. Another option (or additional thing) could be to guard the log messages with a new develop flag that's disabled by default. By setting it for development, we accept that it might be unsafe which should be fine.
> > 
> > 
> > I think changing the logging to develop only is a reasonable step. I don't see logging of crash handling / error reporting as generally useful for the end user.
> 
> I think the right way to go longterm would be to give us a minimalistic safe logging API for these cases (signal handling, pre-initialization) or make UL safe to use always.

It would be ideal if UL usage could be made safe for these cases in the future. For now, I will start by changing the logging to develop-only, which limits UL usage to debug builds and no longer affects end users.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From zgu at openjdk.java.net  Tue Feb  1 14:58:13 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Tue, 1 Feb 2022 14:58:13 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work
In-Reply-To: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
Message-ID: <tPpDiCW8Lne_aeFrYinpQUMXj7rtlfdnDJQALim1WlA=.69ccba88-a8a8-44eb-af97-0370fedeb32d@github.com>

On Tue, 1 Feb 2022 08:36:42 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
> 
> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
> 
> 
> (gdb) call pp(0x7ffff010b030)
> 
> "Executing pp"
> 
> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
> 180 assert(addr != NULL, "Invalid address");
> 
> 
> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
> 
> 
> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
> 
> 
> This patch:
> - fixes uses of SafeFetch
> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
> 
> Tests:
> - I tested manually that pp works with different levels of NMT (Linux x64)
> - GHAs in process
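
To illustrate the SafeFetch point in the description above, here is a minimal sketch of one common way to probe readability without rejecting memory that happens to contain 0. It is not taken from the patch. `safe_fetch_n_sketch` is a stand-in that simulates a fault for a null pointer only (the real `SafeFetchN` relies on the VM's signal handling), and the helper names and sentinel values are assumptions made for illustration.

```cpp
#include <cstdint>

// Stand-in for HotSpot's SafeFetchN, for illustration only: returns *adr, or
// err_value if the load would fault. A "fault" is simulated for nullptr here;
// the real function relies on the VM's signal handling.
inline intptr_t safe_fetch_n_sketch(const intptr_t* adr, intptr_t err_value) {
  return (adr == nullptr) ? err_value : *adr;
}

// The fragile check quoted in the description: it also rejects readable
// memory whose content happens to be 0.
inline bool looks_readable_fragile(const intptr_t* p) {
  return safe_fetch_n_sketch(p, 0) != 0;
}

// Two-sentinel check: a readable location cannot equal both sentinels at
// once, so it is reported unreadable only if both probes return their
// own error value.
inline bool looks_readable(const intptr_t* p) {
  const intptr_t sentinel_a = 0x1234;  // arbitrary, hypothetical values
  const intptr_t sentinel_b = 0x5678;
  return safe_fetch_n_sketch(p, sentinel_a) != sentinel_a ||
         safe_fetch_n_sketch(p, sentinel_b) != sentinel_b;
}
```

With this shape, a location holding 0 (or either sentinel) is still reported as readable, while a faulting address is not.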

Changes requested by zgu (Reviewer).

src/hotspot/share/services/virtualMemoryTracker.cpp line 699:

> 697:   walk_virtual_memory(&walker);
> 698:   return walker.region();
> 699: }

Snapshotting the region avoids the race Ioi pointed out in code review: another thread might release the region after the walk.

src/hotspot/share/utilities/debug.cpp line 505:

> 503:       //  is handled quietly by the VM, but it will trip up the debugger. gdb will catch the signal and disable
> 504:       //  the pp() command for further use.
> 505:       // In order to avoid that, before invoking pp(), switch off SIGSEGV handling with "handle SIGSEGV nostop".

Ah, my .gdbinit has `handle SIGSEGV nostop noprint pass`

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From mdoerr at openjdk.java.net  Tue Feb  1 15:16:50 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 1 Feb 2022 15:16:50 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v2]
In-Reply-To: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
Message-ID: <vWtcRXoZWW6LVMdFrcIJ26yfq_t0l7AYcYAC4OauHBk=.8e040689-3a2c-48ac-acad-9c44797fdd26@github.com>

> PPC64 implementation of JDK-8277180.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  Shorter and better readable recursions increment sequence.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7305/files
  - new: https://git.openjdk.java.net/jdk/pull/7305/files/2428acce..1eec2373

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7305&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7305&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7305.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7305/head:pull/7305

PR: https://git.openjdk.java.net/jdk/pull/7305

From prappo at openjdk.java.net  Tue Feb  1 16:31:38 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Tue, 1 Feb 2022 16:31:38 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
Message-ID: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>

While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.

Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

-------------

Commit messages:
 - Initial commit

Changes: https://git.openjdk.java.net/jdk/pull/7311/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7311&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281057
  Stats: 18 lines in 5 files changed: 0 ins; 0 del; 18 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7311.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7311/head:pull/7311

PR: https://git.openjdk.java.net/jdk/pull/7311

From mdoerr at openjdk.java.net  Tue Feb  1 17:30:24 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 1 Feb 2022 17:30:24 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames
Message-ID: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>

s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

-------------

Commit messages:
 - 8281061: [s390] JFR runs into assertions while validating interpreter frames

Changes: https://git.openjdk.java.net/jdk/pull/7312/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281061
  Stats: 12 lines in 2 files changed: 2 ins; 3 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7312.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7312/head:pull/7312

PR: https://git.openjdk.java.net/jdk/pull/7312

From darcy at openjdk.java.net  Tue Feb  1 17:34:10 2022
From: darcy at openjdk.java.net (Joe Darcy)
Date: Tue, 1 Feb 2022 17:34:10 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <64qmLKz3G6wYRdnA0Y65Fho7oUZqint8EKD-a1GXkJY=.4536f347-7bcd-4d74-b6e3-01441cf41f2c@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

Marked as reviewed by darcy (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From iris at openjdk.java.net  Tue Feb  1 17:45:12 2022
From: iris at openjdk.java.net (Iris Clark)
Date: Tue, 1 Feb 2022 17:45:12 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <jQn3zjSvZz1CMHBdaI8JiAqhNgbsM03NXzcYhMdVfUk=.358c2a31-4030-4935-83c8-5b337a0796e6@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

Marked as reviewed by iris (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From thartmann at openjdk.java.net  Tue Feb  1 17:45:23 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Tue, 1 Feb 2022 17:45:23 GMT
Subject: [jdk18] Integrated: 8278871: [JVMCI] assert((uint)reason < 2*
 _trap_hist_limit) failed: oob
In-Reply-To: <Y0-PDD8xKa5lCZ-SMWan5r3MyZeD4shbLdyAdGHK7Uo=.1a1f1561-54ec-456e-8107-c10083ddfbd4@github.com>
References: <Y0-PDD8xKa5lCZ-SMWan5r3MyZeD4shbLdyAdGHK7Uo=.1a1f1561-54ec-456e-8107-c10083ddfbd4@github.com>
Message-ID: <LEPXb1smnWGfke7tIsYInHTJnGa3XxUHFeFSwFOg12E=.be7db9bf-0668-4810-a9ce-afcdc9ef9736@github.com>

On Tue, 1 Feb 2022 08:20:29 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:

> Backport of [JDK-8278871](https://bugs.openjdk.java.net/browse/JDK-8278871). Applies cleanly. Fix request is pending.

This pull request has now been integrated.

Changeset: 2531c332
Author:    Tobias Hartmann <thartmann at openjdk.org>
URL:       https://git.openjdk.java.net/jdk18/commit/2531c332f89c5faedf71ce1737373581c9abf905
Stats:     21 lines in 5 files changed: 9 ins; 4 del; 8 mod

8278871: [JVMCI] assert((uint)reason < 2* _trap_hist_limit) failed: oob

Backport-of: 6f0e8da6d3bef340299e48977d5e17d05eabe682

-------------

PR: https://git.openjdk.java.net/jdk18/pull/114

From duke at openjdk.java.net  Tue Feb  1 18:29:36 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 18:29:36 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v2]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <QnYUr1lR7rEFSWuKKbHC-6x-lqk7TufFXwbHAUddPDA=.6909511f-26f5-45d7-bb1e-1a9d06ce0c42@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Fix a regression

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/b08834de..5c0e4349

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=00-01

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From duke at openjdk.java.net  Tue Feb  1 18:32:44 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 18:32:44 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v3]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <heyWRzc917-1PLb8Est9-zp_qO-IXk1QpY87ZqwWTm8=.db1ac360-8c55-4534-9571-1819166c501e@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Fix a regression

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/5c0e4349..d775761d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From aph at openjdk.java.net  Tue Feb  1 18:38:21 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Tue, 1 Feb 2022 18:38:21 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
Message-ID: <HbfthVhtbYa9SGgArnj2xCzFv0BfPb6dZ87Wz6pcUoM=.12208f6b-db6e-4b83-8e8d-e5b4c406fff7@github.com>

On Mon, 31 Jan 2022 22:25:38 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix popframe failures
>
> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 417:
> 
>> 415:     // Enable PAC if this code has been built with branch-protection and the CPU/OS supports it.
>> 416: #ifdef __ARM_FEATURE_PAC_DEFAULT
>> 417:     if (_features & CPU_PACA) {
> 
> Style nit: no implicit booleans - expand as "if ( A & B != 0)"

Oh yuck, really? This is my punishment for not paying attention to the style guide discussions.
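
A side note on the suggested expansion, as a small sketch: in C++ `!=` binds tighter than `&`, so the explicit form needs parentheses around the mask, `(A & B) != 0`. Written as `A & B != 0`, it parses as `A & (B != 0)` and tests only the lowest bit of `A`. The names below (`kFeaturePaca`, `has_feature`, `has_feature_as_parsed`) and the bit position are hypothetical, not the HotSpot definitions.

```cpp
#include <cstdint>

// Hypothetical feature bit for illustration; not the real CPU_PACA value.
constexpr uint64_t kFeaturePaca = uint64_t(1) << 30;

// Explicit comparison, with the parentheses that make it test the mask.
constexpr bool has_feature(uint64_t features, uint64_t bit) {
  return (features & bit) != 0;
}

// What `features & bit != 0` means to the compiler: `!=` is evaluated
// first, so only the lowest bit of `features` is tested.
constexpr bool has_feature_as_parsed(uint64_t features, uint64_t bit) {
  return (features & static_cast<uint64_t>(bit != 0)) != 0;
}

static_assert(has_feature(kFeaturePaca, kFeaturePaca),
              "the parenthesized form sees the feature bit");
static_assert(!has_feature_as_parsed(kFeaturePaca, kFeaturePaca),
              "the unparenthesized form misses it");
```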

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Tue Feb  1 18:38:21 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Tue, 1 Feb 2022 18:38:21 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
Message-ID: <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>

On Tue, 1 Feb 2022 12:42:26 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> As per this conversation: https://github.com/openjdk/jdk/pull/6334#discussion_r791722292
>> 
>> The idea was, the user is explicitly asking for pac-ret so we should honour that. Whereas standard would only enable what is supported for that system.
>
> But we can't honour that because it is not supported. Further, the suggestion in the referenced discussion seemed to be based on the assumption that doing so would be harmless because it is NOP based, but you have indicated that may not be the case and so it may actually lead to a crash!

Given that the implementation has now changed so much that it's no longer NOP based, I'll go with @dholmes-ora .
One other thing, though: it might be better to say here "but this VM was built without ROP-protection support." That's more informative, IMO.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Tue Feb  1 18:38:49 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 18:38:49 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v4]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <BR9_hw6LBLj6-vlj--KRH4Y3qRdZ7Kcvcg36pokrS4E=.0658362d-8450-416a-8249-385826fd92b7@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Remove unintentional formatting

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/d775761d..05597245

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From shade at openjdk.java.net  Tue Feb  1 20:58:14 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 1 Feb 2022 20:58:14 GMT
Subject: RFR: 8280867: Cpuid1Ecx feature parsing is incorrect for AMD CPUs
In-Reply-To: <baW3uIWTuaw1ixecUYHVX3NSQwN4U4x_X8Kx0XV9GH4=.eb5dde84-2ad3-40c5-8d38-271ee4cb20a4@github.com>
References: <baW3uIWTuaw1ixecUYHVX3NSQwN4U4x_X8Kx0XV9GH4=.eb5dde84-2ad3-40c5-8d38-271ee4cb20a4@github.com>
Message-ID: <gZ2ZMFzFrNt-PMyBanFIQNSLH7GSHvZfdhWFP6EJXTE=.cc747c0e-18ea-46b4-9483-85c7674fd854@github.com>

On Mon, 31 Jan 2022 11:26:29 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> See discussion in the bug. AFAICS, the fix is to "just" shift the flags by one to match both Intel and AMD specs. I believe this is not a serious bug, because adjacent bits in AMD case are set on modern chips, and Intel detection code only uses `lzcnt` and `prefetchw` out of these flags, both with Intel-specific hacks that are dropped now.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug on TR 3970X (Zen 2)
>  - [x]  Linux x86_64 fastdebug on i5-11500 (Rocket Lake)
>  - [x] Eyeballing `-Xlog:os+cpu` on TR 3970X (Zen 2) -- no change in detected flags
>  - [x]  Eyeballing `-Xlog:os+cpu` on i5-11500 (Rocket Lake) -- no change in detected flags

Right, thanks for reviews!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7287

From shade at openjdk.java.net  Tue Feb  1 20:58:14 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 1 Feb 2022 20:58:14 GMT
Subject: Integrated: 8280867: Cpuid1Ecx feature parsing is incorrect for AMD
 CPUs
In-Reply-To: <baW3uIWTuaw1ixecUYHVX3NSQwN4U4x_X8Kx0XV9GH4=.eb5dde84-2ad3-40c5-8d38-271ee4cb20a4@github.com>
References: <baW3uIWTuaw1ixecUYHVX3NSQwN4U4x_X8Kx0XV9GH4=.eb5dde84-2ad3-40c5-8d38-271ee4cb20a4@github.com>
Message-ID: <cVlmrR6zgtj2zQshGBAA1JX_nAh-gU05jBuDub17I_c=.b0569a2c-9093-4eed-81b7-134fff1eab59@github.com>

On Mon, 31 Jan 2022 11:26:29 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> See discussion in the bug. AFAICS, the fix is to "just" shift the flags by one to match both Intel and AMD specs. I believe this is not a serious bug, because adjacent bits in AMD case are set on modern chips, and Intel detection code only uses `lzcnt` and `prefetchw` out of these flags, both with Intel-specific hacks that are dropped now.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug on TR 3970X (Zen 2)
>  - [x]  Linux x86_64 fastdebug on i5-11500 (Rocket Lake)
>  - [x] Eyeballing `-Xlog:os+cpu` on TR 3970X (Zen 2) -- no change in detected flags
>  - [x]  Eyeballing `-Xlog:os+cpu` on i5-11500 (Rocket Lake) -- no change in detected flags

This pull request has now been integrated.

Changeset: a18beb47
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/a18beb4797a1ca6fc6b31e997be48b2bd91c6ac0
Stats:     9 lines in 1 file changed: 0 ins; 1 del; 8 mod

8280867: Cpuid1Ecx feature parsing is incorrect for AMD CPUs

Reviewed-by: kvn, dlong

-------------

PR: https://git.openjdk.java.net/jdk/pull/7287

From phh at openjdk.java.net  Tue Feb  1 21:09:11 2022
From: phh at openjdk.java.net (Paul Hohensee)
Date: Tue, 1 Feb 2022 21:09:11 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v4]
In-Reply-To: <BR9_hw6LBLj6-vlj--KRH4Y3qRdZ7Kcvcg36pokrS4E=.0658362d-8450-416a-8249-385826fd92b7@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <BR9_hw6LBLj6-vlj--KRH4Y3qRdZ7Kcvcg36pokrS4E=.0658362d-8450-416a-8249-385826fd92b7@github.com>
Message-ID: <lai4axc7x_RerCLOpKEV_31YO3BW-UqRyfH1qKSCdfk=.1e953838-764a-4c5b-80a5-af7d8772445e@github.com>

On Tue, 1 Feb 2022 18:38:49 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove unintentional formatting

Why not use 'jlong' everywhere instead of int64_t? You've got a mix of them (well, compileBroker.* uses jlong), better to be consistent. Ditto INT64_FORMAT and JLONG_FORMAT. You'd also avoid declaring "declare_integer_type(int64_t)" in vmStructs.cpp.

In gc_globals.hpp, use of intx is likely intentional, so I'd leave it alone. intx resolves to intptr_t (see globalDefinitions.hpp), which is 32 bits on 32-bit systems and 64 bits on 64-bit ones, which is what you want.
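
As a rough sketch of the width point, assuming `intx` is simply a typedef of `intptr_t` as described above: `intx_sketch` and `bit_width_of` are stand-in names for illustration, not HotSpot identifiers.

```cpp
#include <climits>
#include <cstdint>

// Stand-in for HotSpot's intx, assumed here to be a plain typedef of intptr_t.
typedef intptr_t intx_sketch;

// Width of a type in bits.
template <typename T>
constexpr int bit_width_of() {
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// A pointer-sized integer follows the platform: 32 bits on 32-bit systems,
// 64 bits on 64-bit ones. int64_t is 64 bits everywhere.
static_assert(sizeof(intx_sketch) == sizeof(void*), "intx tracks pointer width");
static_assert(bit_width_of<int64_t>() == 64, "int64_t is always 64 bits");
```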

-------------

Changes requested by phh (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7294

From duke at openjdk.java.net  Tue Feb  1 22:56:54 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 22:56:54 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v5]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <Q7SnfYb3wxRUdTc8I3CtHXvVd06c68Wv5pn6WS-I_dk=.a6420bd5-b437-4888-8873-844ee956ca26@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  temp

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/05597245..22fbe08a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=03-04

  Stats: 26 lines in 7 files changed: 0 ins; 1 del; 25 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From duke at openjdk.java.net  Tue Feb  1 23:08:48 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Tue, 1 Feb 2022 23:08:48 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v6]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <gDGFz_L8HcQO4lgdodxOoAJ3dlFwnIS4fWudRAstxxo=.15d40df9-50e7-4f6d-8877-2622a9b876c9@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Use jlong instead of int64_t

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/22fbe08a..bca7b783

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=04-05

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From dholmes at openjdk.java.net  Tue Feb  1 23:15:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 1 Feb 2022 23:15:11 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v6]
In-Reply-To: <gDGFz_L8HcQO4lgdodxOoAJ3dlFwnIS4fWudRAstxxo=.15d40df9-50e7-4f6d-8877-2622a9b876c9@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <gDGFz_L8HcQO4lgdodxOoAJ3dlFwnIS4fWudRAstxxo=.15d40df9-50e7-4f6d-8877-2622a9b876c9@github.com>
Message-ID: <e5jpDsjdvJXQc97YH2GRXBokrV0Df4gHkow1CEc9ATs=.a84e7374-b125-462f-97ac-ccba0a697c2f@github.com>

On Tue, 1 Feb 2022 23:08:48 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Use jlong instead of int64_t

Do not use jlong everywhere. We should only be using jlong where we have values that will interact with Java code and so have to be jlong to be compatible. Otherwise for a 64-bit type we should be using int64_t, or uint64_t as appropriate. For a type that should be 32-bit or 64-bit depending on the environment we should use intx, intptr_t or size_t depending on the nature of the variable.

Also do not force-push changes, just commit any changes as normal and push them. When the change is integrated the skara tooling will flatten things into one clean commit. If you force-push you mess up the PR review process.

Thanks.
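
A minimal sketch of that rule of thumb, under stated assumptions: VM-internal state stays `int64_t`, and the conversion to the Java-facing type happens only at the boundary. `jlong_sketch`, `CompileStatsSketch` and `to_java_long` are hypothetical names for illustration; the real `jlong` comes from the JNI headers.

```cpp
#include <cstdint>

// Stand-in for JNI's jlong (a 64-bit signed integer); the real typedef
// comes from the JNI headers and may be a different underlying type.
typedef long long jlong_sketch;
static_assert(sizeof(jlong_sketch) == sizeof(int64_t), "both are 64-bit");

// Hypothetical VM-internal counter: never seen by Java code, so int64_t.
struct CompileStatsSketch {
  int64_t total_bytes = 0;
  void add(int64_t n) { total_bytes += n; }
};

// Hypothetical boundary helper: convert only where the value is handed
// to Java code.
inline jlong_sketch to_java_long(int64_t v) {
  return static_cast<jlong_sketch>(v);
}
```

Keeping the conversion in one place makes it easier to see which values actually interact with Java code.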

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From dholmes at openjdk.java.net  Wed Feb  2 00:13:12 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 2 Feb 2022 00:13:12 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <k7XEln_td2bO6CTDhqvQtmb1uO_FuaiqrYJMZGVe98U=.efa3dabc-f6ff-480a-9db4-f339e8c0ab16@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
 <k7XEln_td2bO6CTDhqvQtmb1uO_FuaiqrYJMZGVe98U=.efa3dabc-f6ff-480a-9db4-f339e8c0ab16@github.com>
Message-ID: <sEet-0d_JuatqYFpVew0jq15jWWeNia2cLsrGa5vhII=.b01f22ad-c7b8-49ac-8756-31fc419ffee8@github.com>

On Tue, 1 Feb 2022 14:08:40 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 97:
>> 
>>> 95: FORBID_C_FUNCTION(FILE*    fopen(const char*, const char*),           "use os::fopen");
>>> 96: FORBID_C_FUNCTION(int      fsync(int),                                "use os::fsync");
>>> 97: FORBID_C_FUNCTION(int      ftruncate(int, off_t),                     "use os::ftruncate");
>> 
>> Shouldn't this be ftruncate for BSD and ftruncate64 for other Posix (not sure what Windows has)?
>
> Platform agnostic code would call ftruncate(), not ftruncate64().  So I think this is correct as is.

You need to enable the warning for the function that we would actually use but are not supposed to, and on Linux that would be `ftruncate64`.

>> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 99:
>> 
>>> 97: FORBID_C_FUNCTION(int      ftruncate(int, off_t),                     "use os::ftruncate");
>>> 98: FORBID_C_FUNCTION(void     funlockfile(FILE *),                       "use os::funlockfile");
>>> 99: FORBID_C_FUNCTION(off_t    lseek(int, off_t, int),                    "use os::lseek");
>> 
>> Similarly there should be a lseek64 definition too.
>
> Like ftruncate(), platform agnostic code would call lseek(), not lseek64().  So I think this is correct as is.

I disagree - you are not enabling the warnings for all the functions that would be used.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From duke at openjdk.java.net  Wed Feb  2 01:52:46 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Wed, 2 Feb 2022 01:52:46 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Revert "Use jlong instead of int64_t"

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/bca7b783..01f3b1f2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=05-06

  Stats: 23 lines in 4 files changed: 1 ins; 0 del; 22 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From dlong at openjdk.java.net  Wed Feb  2 03:39:07 2022
From: dlong at openjdk.java.net (Dean Long)
Date: Wed, 2 Feb 2022 03:39:07 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
Message-ID: <abIurbqw_i9sWe2GdJVE324gLf8gtloU3xvU61kTs74=.8a2ff7b6-3f68-4bf7-b9fe-e7f9420d4b8b@github.com>

On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert "Use jlong instead of int64_t"

How about making the traversal mark type unsigned?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From stuefe at openjdk.java.net  Wed Feb  2 08:22:39 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 2 Feb 2022 08:22:39 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v2]
In-Reply-To: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
Message-ID: <w50g1dEB_VjurAYZCBofEaFcwGr6ctEdSeW4OIY1pDI=.0b3b459e-81fc-4cdf-b36e-e7bc19f597f2@github.com>

> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
> 
> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
> 
> 
> (gdb) call pp(0x7ffff010b030)
> 
> "Executing pp"
> 
> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
> 180 assert(addr != NULL, "Invalid address");
> 
> 
> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
> 
> 
> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
> 
> 
> This patch:
> - fixes uses of SafeFetch
> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
> 
> Tests:
> - I tested manually that pp works with different levels of NMT (Linux x64)
> - GHAs in process

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  Zhengyus remarks

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7297/files
  - new: https://git.openjdk.java.net/jdk/pull/7297/files/6ad3018a..10dc66ed

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7297&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7297&range=00-01

  Stats: 103 lines in 5 files changed: 56 ins; 33 del; 14 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7297.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7297/head:pull/7297

PR: https://git.openjdk.java.net/jdk/pull/7297

From stuefe at openjdk.java.net  Wed Feb  2 08:27:14 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 2 Feb 2022 08:27:14 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v2]
In-Reply-To: <tPpDiCW8Lne_aeFrYinpQUMXj7rtlfdnDJQALim1WlA=.69ccba88-a8a8-44eb-af97-0370fedeb32d@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <tPpDiCW8Lne_aeFrYinpQUMXj7rtlfdnDJQALim1WlA=.69ccba88-a8a8-44eb-af97-0370fedeb32d@github.com>
Message-ID: <02M2JuFVSa-pZX75zgzYLEpBO7A_kihJZeSws5lrMCc=.e0f89c55-6f74-42b3-9307-d00449cd0730@github.com>

On Tue, 1 Feb 2022 14:55:10 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Zhengyus remarks
>
> Changes requested by zgu (Reviewer).

@zhengyu123 thanks for looking at this.

I rewrote the patch and made printing inline within lock protection. For symmetry, I also moved the printing code on the malloc side to the malloc tracker. Added comments.

Note that I think @iklam was overcautious in the original discussion. We already risk signals in the debugging session by using SafeFetch to read the malloc header. Using a reference to a potentially dead ReservedMemoryRegion runs an additional - very low - risk of signals. So, if we really want to be cautious, we should not print out NMT information at all in pp. But I think NMT is very useful and worth the risk. All we risk is a slightly miffed debugger.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From stuefe at openjdk.java.net  Wed Feb  2 08:30:08 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 2 Feb 2022 08:30:08 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v2]
In-Reply-To: <dI0Yji28R681uqH_qnKmcPY28p0bSVqvVjqvPPAXRF0=.0bbdce24-4da4-4833-a1b4-f395e891d096@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <N1PYSaAN2qV-fa9T7pl-kiY7FmuLAIVAHj6-i6XIA2U=.ee81e8bf-e9f9-4f34-b5cb-60175cfb1a15@github.com>
 <MPA-R00PtUfcCA8VMSferiL1zp16MSgcJ8UaH18ohn8=.93b4ea2b-452f-4745-b936-9dbf9deac45a@github.com>
 <A8HCSM3ayIKy_c-4D1_uoWkEO92-H9SsPib3JoL-snU=.f88d4bd5-90f4-4cfe-9a54-3bce73a05521@github.com>
 <dI0Yji28R681uqH_qnKmcPY28p0bSVqvVjqvPPAXRF0=.0bbdce24-4da4-4833-a1b4-f395e891d096@github.com>
Message-ID: <f8R8L77JgrVuolVV7GbFsaMoE7cC-QpQwz7ONWPcvD8=.0a61af76-bab5-4804-b9bb-399406012b28@github.com>

On Tue, 1 Feb 2022 14:07:49 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Sorry, I'm confused. We build AIX with xlc. I don't believe we even include this file on AIX. How does this help AIX?
>
> I removed the changes for the dirent functions and removed the above code.  I also reverted all changes to os_aix.cpp.

Thank you @hseigel !

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From duke at openjdk.java.net  Wed Feb  2 09:28:48 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 2 Feb 2022 09:28:48 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v15]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <3RcA40D5i_vhwGr49mD78aBNsaLjV_d13kEsTuL9S5I=.1d9d6d4a-a055-47b1-ae6f-41c26d1fae76@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Fix up nits

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/0b476542..b7925614

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=14
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=13-14

  Stats: 15 lines in 6 files changed: 2 ins; 3 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Wed Feb  2 09:32:14 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 2 Feb 2022 09:32:14 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
 <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>
Message-ID: <L0_01FsEw9Xxj4aEz-9CTmfrE5D9VVElO8vhzdu7pQA=.7504284f-5c17-4d8d-849c-3ba24bfc6b92@github.com>

On Tue, 1 Feb 2022 18:33:28 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> But we can't honour that because it is not supported. Further, the suggestion in the referenced discussion seemed to be based on the assumption that doing so would be harmless because it is NOP based, but you have indicated that may not be the case and so it may actually lead to a crash!
>
> Given that the implementation has now changed so much that it's no longer NOP based, I'll go with @dholmes-ora .
> One other thing, though: it might be better to say here "but this VM was built without ROP-protection support." That's more informative, IMO.

Ok, I'll fix up as suggested.

The beginning part of that message needs fixing too - UseROPProtection is no longer the name of the flag. I'll switch to:
"ROP-protection specified, but this VM was built without ROP-protection support."

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Wed Feb  2 09:37:15 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 2 Feb 2022 09:37:15 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <L0_01FsEw9Xxj4aEz-9CTmfrE5D9VVElO8vhzdu7pQA=.7504284f-5c17-4d8d-849c-3ba24bfc6b92@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
 <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>
 <L0_01FsEw9Xxj4aEz-9CTmfrE5D9VVElO8vhzdu7pQA=.7504284f-5c17-4d8d-849c-3ba24bfc6b92@github.com>
Message-ID: <oiqgcS6gDWPTBxG8KmAaDjswmAQ0Jl5rEuVtoQJKlRo=.fdf20545-05bb-45fb-bad8-d2ca244880e8@github.com>

On Wed, 2 Feb 2022 09:29:21 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> Given that the implementation has now changed so much that it's no longer NOP based, I'll go with @dholmes-ora .
>> One other thing, though: it might be better to say here "but this VM was built without ROP-protection support." That's more informative, IMO.
>
> Ok, I'll fix up as suggested.
> 
> The beginning part of that message needs fixing too - UseROPProtection is no longer the name of the flag. I'll switch to:
> "ROP-protection specified, but this VM was built without ROP-protection support."

And this change will keep ROP protection enabled if we fall into the "this VM was built without ROP-protection support." case. In that case we'll be protecting generated code, but the VM itself won't be protected. This will run without crashing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From prappo at openjdk.java.net  Wed Feb  2 10:01:06 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Wed, 2 Feb 2022 10:01:06 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <1V1_jhMx07iBPq8rzhWP2pwNkb8lBNpeqP3jzA06lf0=.79432eb1-3bf4-4d28-acb7-f5c07bf1f0c4@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

I would appreciate it if serviceability and hotspot could review this PR too.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From aph at openjdk.java.net  Wed Feb  2 10:22:13 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 2 Feb 2022 10:22:13 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <oiqgcS6gDWPTBxG8KmAaDjswmAQ0Jl5rEuVtoQJKlRo=.fdf20545-05bb-45fb-bad8-d2ca244880e8@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
 <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>
 <L0_01FsEw9Xxj4aEz-9CTmfrE5D9VVElO8vhzdu7pQA=.7504284f-5c17-4d8d-849c-3ba24bfc6b92@github.com>
 <oiqgcS6gDWPTBxG8KmAaDjswmAQ0Jl5rEuVtoQJKlRo=.fdf20545-05bb-45fb-bad8-d2ca244880e8@github.com>
Message-ID: <Y73LvR7JcG4SUXZllgPQLmDeElg1XgEyzZCNLLWglw0=.43ea398e-ace9-45e1-97e3-d020adad51fb@github.com>

On Wed, 2 Feb 2022 09:34:20 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> And this change will keep ROP protection enabled if we fall into the "this VM was built without ROP-protection support." case. In that case we'll be protecting generated code, but the VM itself won't be protected. This will run without crashing.

That's perfect.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From dholmes at openjdk.java.net  Wed Feb  2 12:08:06 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 2 Feb 2022 12:08:06 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <WImLWgTq8Bu-7ukhDMgsUiOAomHBV1AwNDjmNgoEENM=.f54a96c0-6f3f-4e90-a9f4-ad90ea27f629@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

src/jdk.compiler/share/classes/com/sun/tools/javac/code/Symbol.java line 670:

> 668:      *  modifier is ignored for this test.
> 669:      *
> 670:      *  See JLS 8.4.8.1 (without transitivity) and 8.4.8.4

Any idea what the "(without transitivity)" is referring to here and elsewhere?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From dholmes at openjdk.java.net  Wed Feb  2 12:14:04 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 2 Feb 2022 12:14:04 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <WMhTzudyyKkS8ntlOY_BMQs4vT-0BpxV5DbDw5hO1vM=.2a690c7b-6778-4716-92ad-2189121c5291@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

Hi Pavel,

All the section number changes look good and accurate. I have one query above, and also spotted one existing comment that is not correct.

Thanks,
David

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java line 1793:

> 1791:         }
> 1792: 
> 1793:         // Error if static method overrides instance method (JLS 8.4.8.2).

"overrides" should be "hides"

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7311

From prappo at openjdk.java.net  Wed Feb  2 12:34:05 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Wed, 2 Feb 2022 12:34:05 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <WImLWgTq8Bu-7ukhDMgsUiOAomHBV1AwNDjmNgoEENM=.f54a96c0-6f3f-4e90-a9f4-ad90ea27f629@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
 <WImLWgTq8Bu-7ukhDMgsUiOAomHBV1AwNDjmNgoEENM=.f54a96c0-6f3f-4e90-a9f4-ad90ea27f629@github.com>
Message-ID: <UN-kOvJYC1Yl2zzhgYtR8pCtGYWxxAfAxbYvCc63BmM=.886726eb-b75a-4ef3-8e1f-da7f025f219a@github.com>

On Wed, 2 Feb 2022 12:04:29 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
>> 
>> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.
>
> src/jdk.compiler/share/classes/com/sun/tools/javac/code/Symbol.java line 670:
> 
>> 668:      *  modifier is ignored for this test.
>> 669:      *
>> 670:      *  See JLS 8.4.8.1 (without transitivity) and 8.4.8.4
> 
> Any idea what the "(without transitivity)" is referring to here and elsewhere?

My guess is that "transitivity" here refers to the subclass relationship being the transitive closure of the direct subclass relationship. Could it also be that the "quirk" paragraph starting at com/sun/tools/javac/code/Symbol.java:2057 is relevant here? @mcimadamore?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From prappo at openjdk.java.net  Wed Feb  2 12:46:05 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Wed, 2 Feb 2022 12:46:05 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <WMhTzudyyKkS8ntlOY_BMQs4vT-0BpxV5DbDw5hO1vM=.2a690c7b-6778-4716-92ad-2189121c5291@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
 <WMhTzudyyKkS8ntlOY_BMQs4vT-0BpxV5DbDw5hO1vM=.2a690c7b-6778-4716-92ad-2189121c5291@github.com>
Message-ID: <v6UK9yUd-VyBTrKZRmlSUOllRI2fwhjV5VAzBu7yQcM=.fea1c545-df0a-478a-850d-d5e16d088b2b@github.com>

On Wed, 2 Feb 2022 12:06:39 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
>> 
>> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.
>
> src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java line 1793:
> 
>> 1791:         }
>> 1792: 
>> 1793:         // Error if static method overrides instance method (JLS 8.4.8.2).
> 
> "overrides" should be "hides"

Although you seem to be correct, the error messages and the code around operate using the term "override":

        // Error if static method overrides instance method (JLS 8.4.8.2).
        if ((m.flags() & STATIC) != 0 &&
                   (other.flags() & STATIC) == 0) {
            log.error(TreeInfo.diagnosticPositionFor(m, tree),
                      Errors.OverrideStatic(cannotOverride(m, other)));
            m.flags_field |= BAD_OVERRIDE;
            return;
        }

        // Error if instance method overrides static or final
        // method (JLS 8.4.8.1).
        if ((other.flags() & FINAL) != 0 ||
                 (m.flags() & STATIC) == 0 &&
                 (other.flags() & STATIC) != 0) {
            log.error(TreeInfo.diagnosticPositionFor(m, tree),
                      Errors.OverrideMeth(cannotOverride(m, other),
                                          asFlagSet(other.flags() & (FINAL | STATIC))));
            m.flags_field |= BAD_OVERRIDE;
            return;
        }


        /**
         * compiler.err.override.static=\
         *    {0}\n\
         *    overriding method is static
         */
        public static Error OverrideStatic(Fragment arg0) {
            return new Error("compiler", "override.static", arg0);
        }

Compiler folk, what do you think?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From zgu at openjdk.java.net  Wed Feb  2 13:34:09 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Wed, 2 Feb 2022 13:34:09 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v2]
In-Reply-To: <w50g1dEB_VjurAYZCBofEaFcwGr6ctEdSeW4OIY1pDI=.0b3b459e-81fc-4cdf-b36e-e7bc19f597f2@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <w50g1dEB_VjurAYZCBofEaFcwGr6ctEdSeW4OIY1pDI=.0b3b459e-81fc-4cdf-b36e-e7bc19f597f2@github.com>
Message-ID: <vdcr53OK1o78ZM_jxSqLy-vGv2MBd1nPSg5uQ3PjWHQ=.99fd5cf3-3417-4524-ae82-f0616f906f5d@github.com>

On Wed, 2 Feb 2022 08:22:39 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
>> 
>> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
>> 
>> 
>> (gdb) call pp(0x7ffff010b030)
>> 
>> "Executing pp"
>> 
>> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
>> 180 assert(addr != NULL, "Invalid address");
>> 
>> 
>> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
>> 
>> 
>> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
>> 
>> 
>> This patch:
>> - fixes uses of SafeFetch
>> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
>> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
>> 
>> Tests:
>> - I tested manually that pp works with different levels of NMT (Linux x64)
>> - GHAs in process
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Zhengyus remarks

Changes requested by zgu (Reviewer).

src/hotspot/share/services/mallocTracker.cpp line 301:

> 299: bool MallocTracker::print_pointer_information(const void* p, outputStream* st) {
> 300:   assert(MemTracker::enabled(), "NMT must be enabled");
> 301:   if (CanUseSafeFetchN() && os::is_readable_pointer(p)) {

`os::is_readable_pointer()` uses `CanUseSafeFetch32()`, you may want to check `CanUseSafeFetch32()` instead of `CanUseSafeFetchN()`.
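
The SafeFetch pitfall described in the quoted PR description (a stored zero being mistaken for an unreadable address) can be illustrated with a small Java analogy. This is only a sketch under stated assumptions: `MEMORY`, `safeFetch`, `readableFlawed` and `readable` are hypothetical names invented for illustration, the map stands in for readable memory, and none of this is HotSpot code or the code in the patch.

```java
import java.util.HashMap;
import java.util.Map;

class SafeFetchDemo {
    // Hypothetical stand-in for readable memory: address -> stored word.
    // Address 0x1000 is readable and holds 0; 0x3000 is not mapped at all.
    static final Map<Long, Long> MEMORY = new HashMap<>();
    static {
        MEMORY.put(0x1000L, 0L);
        MEMORY.put(0x2000L, 42L);
    }

    // Analogy of a safe fetch: returns the stored word, or errValue
    // if the address cannot be read.
    static long safeFetch(long addr, long errValue) {
        Long v = MEMORY.get(addr);
        return v == null ? errValue : v;
    }

    // Flawed check: a readable location that happens to hold 0
    // is indistinguishable from an unreadable one.
    static boolean readableFlawed(long addr) {
        return safeFetch(addr, 0) != 0;
    }

    // Probe twice with two different error values. Only an unreadable
    // address can return the error value on both probes.
    static boolean readable(long addr) {
        final long a = 0xcafebabeL;
        final long b = ~a;
        return safeFetch(addr, a) != a || safeFetch(addr, b) != b;
    }

    public static void main(String[] args) {
        System.out.println(readableFlawed(0x1000L));
        System.out.println(readable(0x1000L));
        System.out.println(readable(0x3000L));
    }
}
```

With a single probe, the error value and a legitimately stored value can collide. Probing with two distinct error values removes that ambiguity, at the cost of a second read.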

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From david.holmes at oracle.com  Wed Feb  2 13:39:38 2022
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Feb 2022 23:39:38 +1000
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <v6UK9yUd-VyBTrKZRmlSUOllRI2fwhjV5VAzBu7yQcM=.fea1c545-df0a-478a-850d-d5e16d088b2b@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
 <WMhTzudyyKkS8ntlOY_BMQs4vT-0BpxV5DbDw5hO1vM=.2a690c7b-6778-4716-92ad-2189121c5291@github.com>
 <v6UK9yUd-VyBTrKZRmlSUOllRI2fwhjV5VAzBu7yQcM=.fea1c545-df0a-478a-850d-d5e16d088b2b@github.com>
Message-ID: <a3300b0a-369a-69bc-afc5-b866cbb20122@oracle.com>

On 2/02/2022 10:46 pm, Pavel Rappo wrote:
> On Wed, 2 Feb 2022 12:06:39 GMT, David Holmes <dholmes at openjdk.org> wrote:
> 
>>> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
>>>
>>> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.
>>
>> src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java line 1793:
>>
>>> 1791:         }
>>> 1792:
>>> 1793:         // Error if static method overrides instance method (JLS 8.4.8.2).
>>
>> "overrides" should be "hides"
> 
> Although you seem to be correct, the error messages and the code around operate using the term "override":

Ah yes, I can see now that "overrides" is (incorrectly) used all through 
this code and even in the error messages. It is a subtle distinction.
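
The difference between hiding and overriding can be sketched in a few lines of Java. The classes below (`Base`, `Derived`, `HidingDemo`) are hypothetical and written only for illustration; they are not taken from javac. A static method in a subclass hides a static method with the same signature in the superclass, and the call is resolved from the declared type. An instance method overrides, and the call is dispatched on the runtime type.

```java
class Base {
    // Static methods can only be hidden, never overridden.
    static String kind() { return "Base.kind"; }

    // Instance methods can be overridden.
    String name() { return "Base.name"; }
}

class Derived extends Base {
    // Hides Base.kind (JLS 8.4.8.2).
    static String kind() { return "Derived.kind"; }

    // Overrides Base.name (JLS 8.4.8.1).
    @Override
    String name() { return "Derived.name"; }
}

class HidingDemo {
    // Static call through a Base-typed reference: resolved at compile
    // time from the declared type, so the hidden Base method runs.
    static String staticCallViaBaseReference() {
        Base b = new Derived();
        return b.kind();
    }

    // Instance call through the same reference: dispatched on the
    // runtime type, so the overriding Derived method runs.
    static String instanceCallViaBaseReference() {
        Base b = new Derived();
        return b.name();
    }

    public static void main(String[] args) {
        System.out.println(staticCallViaBaseReference());
        System.out.println(instanceCallViaBaseReference());
    }
}
```

Mixing the two kinds is what the checks quoted in this thread reject: declaring `static String name()` in `Derived` would be a static method hiding an instance method (a compile-time error under JLS 8.4.8.2), and declaring a non-static `kind()` in `Derived` would be an instance method overriding a static one (a compile-time error under JLS 8.4.8.1).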

Cheers,
David
-----

>          // Error if static method overrides instance method (JLS 8.4.8.2).
>          if ((m.flags() & STATIC) != 0 &&
>                     (other.flags() & STATIC) == 0) {
>              log.error(TreeInfo.diagnosticPositionFor(m, tree),
>                        Errors.OverrideStatic(cannotOverride(m, other)));
>              m.flags_field |= BAD_OVERRIDE;
>              return;
>          }
> 
>          // Error if instance method overrides static or final
>          // method (JLS 8.4.8.1).
>          if ((other.flags() & FINAL) != 0 ||
>                   (m.flags() & STATIC) == 0 &&
>                   (other.flags() & STATIC) != 0) {
>              log.error(TreeInfo.diagnosticPositionFor(m, tree),
>                        Errors.OverrideMeth(cannotOverride(m, other),
>                                            asFlagSet(other.flags() & (FINAL | STATIC))));
>              m.flags_field |= BAD_OVERRIDE;
>              return;
>          }
> 
> 
>          /**
>           * compiler.err.override.static=\
>           *    {0}\n\
>           *    overriding method is static
>           */
>          public static Error OverrideStatic(Fragment arg0) {
>              return new Error("compiler", "override.static", arg0);
>          }
> 
> Compiler folk, what do you think?
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/7311

From duke at openjdk.java.net  Wed Feb  2 13:55:42 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 2 Feb 2022 13:55:42 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v16]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <QE5OEfpY-HbXrbkbfUkOrr8ZQoJKv91RGp66VoGM9hE=.c81de446-7f0c-4976-a335-299befb8595a@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Change pac-ret defaults on non PAC machines

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/b7925614..78da1bd0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=15
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=14-15

  Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From mcimadamore at openjdk.java.net  Wed Feb  2 14:41:11 2022
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Wed, 2 Feb 2022 14:41:11 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <UN-kOvJYC1Yl2zzhgYtR8pCtGYWxxAfAxbYvCc63BmM=.886726eb-b75a-4ef3-8e1f-da7f025f219a@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
 <WImLWgTq8Bu-7ukhDMgsUiOAomHBV1AwNDjmNgoEENM=.f54a96c0-6f3f-4e90-a9f4-ad90ea27f629@github.com>
 <UN-kOvJYC1Yl2zzhgYtR8pCtGYWxxAfAxbYvCc63BmM=.886726eb-b75a-4ef3-8e1f-da7f025f219a@github.com>
Message-ID: <aJ00U1l0eLSxbcwgCGbBPIHQR1sLBEspPokv8YGfv5A=.809830d0-87b4-4b19-8887-a2e64d9eb94e@github.com>

On Wed, 2 Feb 2022 12:31:04 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

>> src/jdk.compiler/share/classes/com/sun/tools/javac/code/Symbol.java line 670:
>> 
>>> 668:      *  modifier is ignored for this test.
>>> 669:      *
>>> 670:      *  See JLS 8.4.8.1 (without transitivity) and 8.4.8.4
>> 
>> Any idea what the "(without transitivity)" is referring to here and elsewhere?
>
> My guess is that "transitivity" here refers to the subclass relationship being the transitive closure of the direct subclass relationship. Could it also be that the "quirk" paragraph starting at com/sun/tools/javac/code/Symbol.java:2057 is relevant here? @mcimadamore?

First, this class contains many references to 8.4.6.x - which should really be 8.4.8.x - not just this one.

I'm not 100% sure about the "without transitivity" comment, but if I had to guess I'd say that it refers to the fact that the checks described in 8.4.8.3 are missing from this routine. More specifically, this section:


It is a compile-time error if a class or interface C has a member method m1 and there exists a method m2 declared in C or a superclass or superinterface of C, A, such that all of the following are true:
* m1 and m2 have the same name.
* m2 is accessible (§6.6) from C.
* The signature of m1 is not a subsignature (§8.4.2) of the signature of m2 as a member of the supertype of C that names A.
* The declared signature of m1 or some method m1 overrides (directly or indirectly) has the same erasure as the declared signature of m2 or some method m2 overrides (directly or indirectly). <----------

As you can see, the last bullet introduces some sort of global requirement across the inheritance chain; this constraint was necessary after Java 5, as generics require the introduction of bridge methods, and it is possible, in some extreme cases, for a subclass to override (accidentally) a bridge method. The JLS doesn't say the word "bridge method" anywhere, but this is what this check morally does.

Now, in early versions of the Java compiler (5 and 6, IIRC), we used to check for clashes with bridge methods at code generation time. So, the checks in the compiler frontend, such as Symbol::overrides, did not really have to concern themselves with that expensive side of the override check.

But, as the implementation matured, it became clearer that (a) the code-generation clash check was not enough to detect all problematic cases and (b) detecting clashes at code generation time was *too late*, especially for clients of the compiler API which might only run the "analyze" step. For these reasons, starting from Java 7, the frontend also has a more expensive check which supports 8.4.8.3 in full (Check::checkOverrideClashes).

Of course, not being the author of that comment, this is only my best guess as to what that could mean :-)
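
To make the bridge-method point concrete, here is a minimal Java sketch. It is an illustration only, not javac code: the names BridgeDemo, ByLength and countBridges are made up for this example. It uses reflection to show the synthetic bridge method the compiler generates for a generic override, and the trailing comment shows the kind of declaration that the last bullet of 8.4.8.3 is expected to reject.

```java
import java.lang.reflect.Method;
import java.util.Comparator;

// Hypothetical demo class; none of these names come from javac.
class BridgeDemo {
    // Overrides Comparator<String>.compare. After erasure the interface method is
    // compare(Object, Object), so the compiler adds a synthetic bridge with that signature.
    static class ByLength implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    }

    // Counts the compiler-generated bridge methods declared directly in c.
    static int countBridges(Class<?> c) {
        int n = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (m.isBridge()) {
                n++;
            }
        }
        return n;
    }

    // By contrast, a pair like the following is expected to be rejected as a name clash,
    // because B.m(Object) has the same erasure as A.m(T) but does not override it:
    //
    //   class A<T> { void m(T t) {} }
    //   class B extends A<String> { void m(Object o) {} }

    public static void main(String[] args) {
        System.out.println("bridge methods in ByLength: " + countBridges(ByLength.class));
    }
}
```

The reflection check only makes the bridge visible; the clash detection itself happens inside the compiler and is not reproduced here.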

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From prappo at openjdk.java.net  Wed Feb  2 15:44:11 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Wed, 2 Feb 2022 15:44:11 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <aJ00U1l0eLSxbcwgCGbBPIHQR1sLBEspPokv8YGfv5A=.809830d0-87b4-4b19-8887-a2e64d9eb94e@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
 <WImLWgTq8Bu-7ukhDMgsUiOAomHBV1AwNDjmNgoEENM=.f54a96c0-6f3f-4e90-a9f4-ad90ea27f629@github.com>
 <UN-kOvJYC1Yl2zzhgYtR8pCtGYWxxAfAxbYvCc63BmM=.886726eb-b75a-4ef3-8e1f-da7f025f219a@github.com>
 <aJ00U1l0eLSxbcwgCGbBPIHQR1sLBEspPokv8YGfv5A=.809830d0-87b4-4b19-8887-a2e64d9eb94e@github.com>
Message-ID: <UUu5q8Dn5IYiFIbNbHPHnsxYPQzYeH9k7uTDs0AXHQI=.68cbd3f7-c69a-4561-949b-f2b478f228d0@github.com>

On Wed, 2 Feb 2022 14:37:39 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> My guess is that "transitivity" here refers to the subclass relationship being the transitive closure of the direct subclass relationship. Could it also be that the "quirk" paragraph starting at com/sun/tools/javac/code/Symbol.java:2057 is relevant here? @mcimadamore?
>
> First, this class contains many references to 8.4.6.x - which should really be 8.4.8.x - not just this one.
> 
> I'm not 100% sure about the "without transitivity" comment, but if I had to guess I'd say that it refers to the fact that the checks described in 8.4.8.3 are missing from this routine. More specifically, this section:
> 
> 
> It is a compile-time error if a class or interface C has a member method m1 and there exists a method m2 declared in C or a superclass or superinterface of C, A, such that all of the following are true:
> * m1 and m2 have the same name.
> * m2 is accessible (§6.6) from C.
> * The signature of m1 is not a subsignature (§8.4.2) of the signature of m2 as a member of the supertype of C that names A.
> * The declared signature of m1 or some method m1 overrides (directly or indirectly) has the same erasure as the declared signature of m2 or some method m2 overrides (directly or indirectly). <----------
> 
> As you can see, the last bullet introduces some sort of global requirement across the inheritance chain; this constraint was necessary after Java 5, as generics require the introduction of bridge methods, and it is possible, in some extreme cases, for a subclass to override (accidentally) a bridge method. The JLS doesn't say the word "bridge method" anywhere, but this is what this check morally does.
> 
> Now, in early versions of the Java compiler (5 and 6, IIRC), we used to check for clashes with bridge methods at code generation time. So, the checks in the compiler frontend, such as Symbol::overrides, did not really have to concern themselves with that expensive side of the override check.
> 
> But, as the implementation matured, it became clearer that (a) the code-generation clash check was not enough to detect all problematic cases and (b) detecting clashes at code generation time was *too late*, especially for clients of the compiler API which might only run the "analyze" step. For these reasons, starting from Java 7, the frontend also has a more expensive check which supports 8.4.8.3 in full (Check::checkOverrideClashes).
> 
> Of course, not being the author of that comment, this is only my best guess as to what that could mean :-)

FWIW, I found a related bug: https://bugs.openjdk.java.net/browse/JDK-4362349. It might be responsible for that "(without transitivity)" caveat.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From duke at openjdk.java.net  Wed Feb  2 16:03:48 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 2 Feb 2022 16:03:48 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v17]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Update copyrights to 2022

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/78da1bd0..6255d4c8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=16
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=15-16

  Stats: 16 lines in 16 files changed: 0 ins; 0 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From hseigel at openjdk.java.net  Wed Feb  2 18:20:49 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Wed, 2 Feb 2022 18:20:49 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v3]
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <GSkrSQ0nzL_c_4EB_XoHiPOjZuFbPwEkwMFezPH-TRI=.8428cf61-df5e-4767-a0a2-6303b242186f@github.com>

> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas that generate compilation errors, when using gcc, if code calls a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because that is cleaner than adding pragmas and, in some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so that it does not conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  Add warnings for ftruncate64 and lseek64

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7248/files
  - new: https://git.openjdk.java.net/jdk/pull/7248/files/ca2097e4..dd1820eb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=01-02

  Stats: 12 lines in 3 files changed: 12 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7248.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248

PR: https://git.openjdk.java.net/jdk/pull/7248

From phh at openjdk.java.net  Wed Feb  2 20:58:02 2022
From: phh at openjdk.java.net (Paul Hohensee)
Date: Wed, 2 Feb 2022 20:58:02 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
Message-ID: <rJI4EXWBsl_njvhrUXhTSUKKS0by-eYv0qmiJL4THy8=.010047db-b9b8-49ea-a4ad-85067c109884@github.com>

On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert "Use jlong instead of int64_t"

Hi, David, I stand corrected. Is there a document somewhere about the policy, and has anyone gone through Hotspot to remove improper use of jlong?

So, belay my jlong suggestion, but now compileBroker.* should use int64_t. I think my gc_globals.hpp comment still stands.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From phh at openjdk.java.net  Wed Feb  2 21:20:10 2022
From: phh at openjdk.java.net (Paul Hohensee)
Date: Wed, 2 Feb 2022 21:20:10 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
Message-ID: <J_qbkgDJcq7HuNTjiQFrsWeq8LRmOgIilsk_iKMatQ8=.4b0103a1-c5cb-486e-a3bd-76d1cb5983a6@github.com>

On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert "Use jlong instead of int64_t"

Marked as reviewed by phh (Reviewer).

The traversal mark type is signed right now, so I'd leave it signed for this PR and file another one if we want to change it to unsigned. There are quite a few places where signed types are used for values that are intuitively unsigned. One reason I can think of to keep using signed types is that it's easy to detect overflow/wrap-around: just check the sign bit. Allows a bit of time to check for overflow/wrap-around without keeping an old value around.
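
As a rough illustration of the sign-bit check, here is a small Java sketch. The class name TraversalMark and the int width are assumptions made for this example, not the HotSpot declaration. Note that Java defines int overflow as two's-complement wrap-around, whereas signed overflow in C++ is undefined behaviour, so this only shows the idea, not how the VM code would have to be written.

```java
// Hypothetical counter; the name and the int width are assumptions for illustration.
class TraversalMark {
    private int value;

    TraversalMark(int initial) {
        value = initial;
    }

    // In Java, Integer.MAX_VALUE + 1 wraps to Integer.MIN_VALUE.
    void increment() {
        value++;
    }

    // For a counter that only counts up from a non-negative start, a negative value
    // means it has wrapped. It stays negative for the next 2^31 increments, so the
    // check does not have to run after every single increment.
    boolean hasWrapped() {
        return value < 0;
    }

    int value() {
        return value;
    }
}
```

An unsigned counter would need the previous value kept around to detect the same wrap, which is the trade-off mentioned above.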

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From rrich at openjdk.java.net  Wed Feb  2 21:22:10 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Wed, 2 Feb 2022 21:22:10 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v2]
In-Reply-To: <vWtcRXoZWW6LVMdFrcIJ26yfq_t0l7AYcYAC4OauHBk=.8e040689-3a2c-48ac-acad-9c44797fdd26@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
 <vWtcRXoZWW6LVMdFrcIJ26yfq_t0l7AYcYAC4OauHBk=.8e040689-3a2c-48ac-acad-9c44797fdd26@github.com>
Message-ID: <oJSjtCAZVWpIQ1gAghuyEKVkr2XfG7ubGDYyRj9TWaE=.33543c01-da92-4a0c-b838-a09048ad172b@github.com>

On Tue, 1 Feb 2022 15:16:50 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of JDK-8277180.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Shorter and better readable recursions increment sequence.

Hi Martin,

the change looks good. Have you tested it with a quick micro benchmark?

The copyright needs to be updated.

Cheers, Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7305

From duke at openjdk.java.net  Wed Feb  2 21:49:15 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Wed, 2 Feb 2022 21:49:15 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
Message-ID: <MucgZQWNcWfdbfYng1XzM_VH9ZJ5UBqNaa-tKzeyXUc=.527b7152-27e4-424b-b425-cddaf64723a8@github.com>

On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert "Use jlong instead of int64_t"

Other missed places to change:

jvmci/jvmciEnv.hpp:  long get_long_at(JVMCIPrimitiveArray array, int index);
services/memReporter.hpp:  inline long diff_in_current_scale(size_t s1, size_t s2) const {
services/memReporter.hpp:    long amount = (long)(s1 - s2);
services/memReporter.hpp:    long scale = (long)_scale;
services/memReporter.cpp:  long amount_diff = diff_in_current_scale(current_amount, early_amount);
services/memReporter.cpp:  long reserved_diff = diff_in_current_scale(current_reserved, early_reserved);
services/memReporter.cpp:  long committed_diff = diff_in_current_scale(current_committed, early_committed);
services/memReporter.cpp:      long overhead_diff = diff_in_current_scale(_current_baseline.malloc_tracking_overhead(),
services/memReporter.cpp:  long diff_used = diff_in_current_scale(current_stats.used(),
services/memReporter.cpp:  long diff_waste = diff_in_current_scale(current_waste, early_waste);
runtime/vmThread.cpp:  long interval_ms = SafepointTracing::time_since_last_safepoint_ms();

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From david.holmes at oracle.com  Wed Feb  2 22:13:34 2022
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 3 Feb 2022 08:13:34 +1000
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <rJI4EXWBsl_njvhrUXhTSUKKS0by-eYv0qmiJL4THy8=.010047db-b9b8-49ea-a4ad-85067c109884@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
 <rJI4EXWBsl_njvhrUXhTSUKKS0by-eYv0qmiJL4THy8=.010047db-b9b8-49ea-a4ad-85067c109884@github.com>
Message-ID: <e1c5a6bf-f967-747c-c0a4-f930c5cc525c@oracle.com>

On 3/02/2022 6:58 am, Paul Hohensee wrote:
> On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:
> 
>>> 8251505: Use of types in compiler shared code should be consistent.
>>
>> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
>>
>>    Revert "Use jlong instead of int64_t"
> 
> Hi, David, I stand corrected. Is there a document somewhere about the policy, and has anyone gone through Hotspot to remove improper use of jlong?

Hi Paul,

Sorry no documented policy, it is just something that a number of folk 
have raised in "recent" years about Java type pollution (mainly jlong) 
in various places in the VM. People have been making the switch 
piecemeal as different areas get worked on.

Cheers,
David

> So, belay my jlong suggestion, but now compileBroker.* should use int64_t. I think my gc_globals.hpp comment still stands.
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/7294

From david.holmes at oracle.com  Wed Feb  2 22:14:57 2022
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 3 Feb 2022 08:14:57 +1000
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v7]
In-Reply-To: <MucgZQWNcWfdbfYng1XzM_VH9ZJ5UBqNaa-tKzeyXUc=.527b7152-27e4-424b-b425-cddaf64723a8@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <ELqPJgm9ZkmKqxPE60lkrkivbplyZpCzB_NnyEL8zpE=.33725cd0-6679-4fd6-8acd-28a1f77dab35@github.com>
 <MucgZQWNcWfdbfYng1XzM_VH9ZJ5UBqNaa-tKzeyXUc=.527b7152-27e4-424b-b425-cddaf64723a8@github.com>
Message-ID: <2c0f5a4e-6c60-100b-f0d3-a0a047926aef@oracle.com>

On 3/02/2022 7:49 am, Evgeny Astigeevich wrote:
> On Wed, 2 Feb 2022 01:52:46 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:
> 
>>> 8251505: Use of types in compiler shared code should be consistent.
>>
>> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
>>
>>    Revert "Use jlong instead of int64_t"
> 
> Other missed places to change:
> 
> jvmci/jvmciEnv.hpp:  long get_long_at(JVMCIPrimitiveArray array, int index);
> services/memReporter.hpp:  inline long diff_in_current_scale(size_t s1, size_t s2) const {
> services/memReporter.hpp:    long amount = (long)(s1 - s2);
> services/memReporter.hpp:    long scale = (long)_scale;
> services/memReporter.cpp:  long amount_diff = diff_in_current_scale(current_amount, early_amount);
> services/memReporter.cpp:  long reserved_diff = diff_in_current_scale(current_reserved, early_reserved);
> services/memReporter.cpp:  long committed_diff = diff_in_current_scale(current_committed, early_committed);
> services/memReporter.cpp:      long overhead_diff = diff_in_current_scale(_current_baseline.malloc_tracking_overhead(),
> services/memReporter.cpp:  long diff_used = diff_in_current_scale(current_stats.used(),
> services/memReporter.cpp:  long diff_waste = diff_in_current_scale(current_waste, early_waste);
> runtime/vmThread.cpp:  long interval_ms = SafepointTracing::time_since_last_safepoint_ms();

Other than jvmci these are not "compiler shared code" - other cleanups 
in other areas will need their own RFE.

Cheers,
David

> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/7294

From cjplummer at openjdk.java.net  Wed Feb  2 22:48:12 2022
From: cjplummer at openjdk.java.net (Chris Plummer)
Date: Wed, 2 Feb 2022 22:48:12 GMT
Subject: RFR: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <vD7wBS8OeTqUV4QtcgDRwINKV4IltyDdzqeF4Zfgd9o=.4066e2c1-8d5b-4b71-88ea-e58da9b6c8d3@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

`com/sun/jdi/ReferenceType.java` changes look good.

-------------

Marked as reviewed by cjplummer (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7311

From duke at openjdk.java.net  Thu Feb  3 00:03:53 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Thu, 3 Feb 2022 00:03:53 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v8]
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>

> 8251505: Use of types in compiler shared code should be consistent.

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Fix JVMCIEnv::get_long_at

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7294/files
  - new: https://git.openjdk.java.net/jdk/pull/7294/files/01f3b1f2..2c0eb15f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7294&range=06-07

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7294.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7294/head:pull/7294

PR: https://git.openjdk.java.net/jdk/pull/7294

From phh at openjdk.java.net  Thu Feb  3 00:08:09 2022
From: phh at openjdk.java.net (Paul Hohensee)
Date: Thu, 3 Feb 2022 00:08:09 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v8]
In-Reply-To: <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
Message-ID: <z3hcLkwSu-nLabBB6ftX2msyWxzfN55oLEI674qZYXQ=.159ac6fd-2150-459e-b200-64ca82cf12b6@github.com>

On Thu, 3 Feb 2022 00:03:53 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix JVMCIEnv::get_long_at

Lgtm.

-------------

Marked as reviewed by phh (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7294

From dholmes at openjdk.java.net  Thu Feb  3 01:56:15 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 3 Feb 2022 01:56:15 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v14]
In-Reply-To: <Y73LvR7JcG4SUXZllgPQLmDeElg1XgEyzZCNLLWglw0=.43ea398e-ace9-45e1-97e3-d020adad51fb@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <B0Uq8FlB1tzZYgPGJuiKFlBwYtLOyfZDvKg-c92S7ss=.726cf230-373a-4afe-b37b-8fe977f9d8b3@github.com>
 <cZrItxwKFe8rg7-UPKcdI0IN4LbCe9EjVoRg_mli4v4=.ed7993a7-9ba1-40e3-84fa-b9b671352dc5@github.com>
 <CMLktPRTckFYGJ2T5qnG4bMZ1Nz_zx87X-CvW5OBd8U=.8effa526-d736-4d81-9111-b607bfd91c33@github.com>
 <NBdOT35ZQo_rb5t8H3OSHle_OGQEMlR_nsu2quY3PZA=.390b67c7-c532-41fa-9c12-87221d311c3e@github.com>
 <K0ktVZTFjc9Gpij2CBsFjqzxIYLoRy2YRxYmzbGjV_o=.0731f464-b456-4160-b090-d6432038ecdd@github.com>
 <L0_01FsEw9Xxj4aEz-9CTmfrE5D9VVElO8vhzdu7pQA=.7504284f-5c17-4d8d-849c-3ba24bfc6b92@github.com>
 <oiqgcS6gDWPTBxG8KmAaDjswmAQ0Jl5rEuVtoQJKlRo=.fdf20545-05bb-45fb-bad8-d2ca244880e8@github.com>
 <Y73LvR7JcG4SUXZllgPQLmDeElg1XgEyzZCNLLWglw0=.43ea398e-ace9-45e1-97e3-d020adad51fb@github.com>
Message-ID: <hVRM-YVODZ6u6cs-_Y4CRjxbcLehu-a_6M5tVArNd3I=.b0180f0a-2ba6-42cd-adc8-dc492ae09f0f@github.com>

On Wed, 2 Feb 2022 10:18:38 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> And this change will keep ROP protection enabled if we fall into the "this VM was built without ROP-protection support.". In that case we'll be protecting generated code, but the VM itself won't be protected. This will run without crashing.
>
>> And this change will keep ROP protection enabled if we fall into the "this VM was built without ROP-protection support.". In that case we'll be protecting generated code, but the VM itself won't be protected. This will run without crashing.
> 
> That's perfect.

Okay, that makes sense. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From minqi at openjdk.java.net  Thu Feb  3 02:51:13 2022
From: minqi at openjdk.java.net (Yumin Qi)
Date: Thu, 3 Feb 2022 02:51:13 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during
 JNI_CreateJavaVM call
In-Reply-To: <VDKQApKfLuLb84ncVs16N5X8qPv6D3zksRpL7S755p4=.776bb459-8f11-4994-97f5-c0240cf22828@github.com>
References: <VDKQApKfLuLb84ncVs16N5X8qPv6D3zksRpL7S755p4=.776bb459-8f11-4994-97f5-c0240cf22828@github.com>
Message-ID: <Cspufl9xVZBHZfxHDcBB1yq5UzR7RgyNGtUc7rUNXvw=.f179e42f-1040-4cb5-8ec4-418ca8d3439e@github.com>

On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi <minqi at openjdk.org> wrote:

> Please review,
>   When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll because zip.dll is not on the PATH. The failure occurs after GetModuleHandle("zip.dll") returns NULL; the subsequent LoadLibrary("zip.dll") call then fails in the same way.
>   The fix is to call ClassLoader's load_zip_library first: if the zip library is already loaded, it just returns the cached handle for subsequent use; if not, it loads the zip library and caches the handle.
> 
>   Tests: tier1,4,7 in test
>    Manually tested user case, and checked output of jimage list <modules> for jlinked files using --compress=2.
> 
> Thanks
> Yumin

Since there have been no further updates, I will integrate tomorrow.
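
The "return the cached handle if already loaded, otherwise load and cache" pattern from the description can be sketched in Java as follows. This is a generic illustration under stated assumptions: CachedLibrary and its members are hypothetical names, and the Supplier stands in for whatever actually loads the library. It is not the HotSpot ClassLoader code.

```java
import java.util.function.Supplier;

// Hypothetical "load once, then reuse the handle" helper.
final class CachedLibrary<H> {
    private final Supplier<H> loader; // performs the real load; assumed to return null on failure
    private H handle;                 // null until the first successful load
    private int loadAttempts;         // kept only so the caching is observable

    CachedLibrary(Supplier<H> loader) {
        this.loader = loader;
    }

    // Returns the cached handle if present; otherwise loads the library and caches the result.
    synchronized H get() {
        if (handle == null) {
            loadAttempts++;
            handle = loader.get();
        }
        return handle;
    }

    synchronized int loadAttempts() {
        return loadAttempts;
    }
}
```

Every caller goes through get(), so there is a single place that knows how the library is located and loaded.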

-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From dlong at openjdk.java.net  Thu Feb  3 04:20:41 2022
From: dlong at openjdk.java.net (Dean Long)
Date: Thu, 3 Feb 2022 04:20:41 GMT
Subject: RFR: 8271055: Crash during deoptimization with
 "assert(bb->is_reachable()) failed: getting result from unreachable
 basicblock" with -XX:+VerifyStack
Message-ID: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>

Reproduced the problem with a new JASM test rather than relying on idiosyncrasies of javac.
The fix is to not look at the next instruction (which might be the beginning of an unreachable block) if the current instruction doesn't fall through (like "goto"!).
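
The idea of the fix can be sketched in Java like this. It is a hypothetical helper, not the HotSpot change itself: the names FallThrough, canFallThrough and peekNext are invented for illustration, and the caller is assumed to supply the bci of the following instruction. The opcode values are the ones defined by the JVM specification.

```java
// Hypothetical helper, not HotSpot code. Opcode values are from the JVM specification.
class FallThrough {
    static final int NOP = 0x00;
    static final int GOTO = 0xa7, JSR = 0xa8, RET = 0xa9;
    static final int TABLESWITCH = 0xaa, LOOKUPSWITCH = 0xab;
    static final int IRETURN = 0xac, RETURN = 0xb1; // first and last of the *return opcodes
    static final int ATHROW = 0xbf, GOTO_W = 0xc8, JSR_W = 0xc9;

    // True if execution can continue at the instruction that follows `opcode` in the code array.
    static boolean canFallThrough(int opcode) {
        if (opcode >= IRETURN && opcode <= RETURN) {
            return false; // ireturn, lreturn, freturn, dreturn, areturn, return
        }
        switch (opcode) {
            case GOTO: case GOTO_W: case JSR: case JSR_W: case RET:
            case TABLESWITCH: case LOOKUPSWITCH: case ATHROW:
                return false;
            default:
                return true;
        }
    }

    // Returns the opcode at nextBci, or -1 when it must not be inspected because the
    // instruction at bci does not fall through (or nextBci is past the end of the code).
    static int peekNext(byte[] code, int bci, int nextBci) {
        int opcode = code[bci] & 0xff;
        if (!canFallThrough(opcode) || nextBci >= code.length) {
            return -1;
        }
        return code[nextBci] & 0xff;
    }
}
```

With this guard, the byte after a "goto" is never examined, so it does not matter whether it starts an unreachable block.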

-------------

Commit messages:
 - Don't look at next bytecode if the current bytecode doesn't fall through

Changes: https://git.openjdk.java.net/jdk/pull/7331/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7331&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8271055
  Stats: 121 lines in 3 files changed: 120 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7331.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7331/head:pull/7331

PR: https://git.openjdk.java.net/jdk/pull/7331

From stuefe at openjdk.java.net  Thu Feb  3 05:37:47 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 3 Feb 2022 05:37:47 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v2]
In-Reply-To: <vdcr53OK1o78ZM_jxSqLy-vGv2MBd1nPSg5uQ3PjWHQ=.99fd5cf3-3417-4524-ae82-f0616f906f5d@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <w50g1dEB_VjurAYZCBofEaFcwGr6ctEdSeW4OIY1pDI=.0b3b459e-81fc-4cdf-b36e-e7bc19f597f2@github.com>
 <vdcr53OK1o78ZM_jxSqLy-vGv2MBd1nPSg5uQ3PjWHQ=.99fd5cf3-3417-4524-ae82-f0616f906f5d@github.com>
Message-ID: <nCLdmx-4IUhCnXDUXQvVK9Q_7-vluFFxumv0JlCTgPQ=.7f98f5c6-63af-463d-9a5c-9789de555112@github.com>

On Wed, 2 Feb 2022 13:30:26 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Zhengyus remarks
>
> src/hotspot/share/services/mallocTracker.cpp line 301:
> 
>> 299: bool MallocTracker::print_pointer_information(const void* p, outputStream* st) {
>> 300:   assert(MemTracker::enabled(), "NMT must be enabled");
>> 301:   if (CanUseSafeFetchN() && os::is_readable_pointer(p)) {
> 
> `os::is_readable_pointer()` uses `CanUseSafeFetch32()`, you may want to check `CanUseSafeFetch32()` instead of `CanUseSafeFetchN()`.

Good point. Done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From stuefe at openjdk.java.net  Thu Feb  3 05:37:45 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 3 Feb 2022 05:37:45 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
Message-ID: <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>

> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
> 
> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
> 
> 
> (gdb) call pp(0x7ffff010b030)
> 
> "Executing pp"
> 
> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
> 180 assert(addr != NULL, "Invalid address");
> 
> 
> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
> 
> 
> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
> 
> 
> This patch:
> - fixes uses of SafeFetch
> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
> 
> Tests:
> - I tested manually that pp works with different levels of NMT (Linux x64)
> - GHAs in process
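
The SafeFetch problem described above can be illustrated with a small stand-alone sketch. It does not use the HotSpot API: `mock_safe_fetch` is a hypothetical stand-in that simulates a faulting read with a null pointer only, and the two-probe check is one possible way to avoid the ambiguity, shown here as an assumption rather than as the code in the patch.

```cpp
#include <cstdint>

// Stand-in for a SafeFetch-style primitive: returns *p, or `err_value` if the
// read would fault. In this sketch only a null pointer counts as "would fault".
std::intptr_t mock_safe_fetch(const std::intptr_t* p, std::intptr_t err_value) {
  return p == nullptr ? err_value : *p;
}

// Flawed check from the description: it cannot tell "unreadable" from "*p == 0".
bool readable_flawed(const std::intptr_t* p) {
  return mock_safe_fetch(p, 0) != 0;
}

// Two probes with different error values: a readable location can equal at
// most one of them, so at least one probe returns a non-error value.
bool readable_two_probes(const std::intptr_t* p) {
  const std::intptr_t a = 0x1111;
  const std::intptr_t b = 0x2222;
  return mock_safe_fetch(p, a) != a || mock_safe_fetch(p, b) != b;
}
```

For a readable location that happens to hold 0, `readable_flawed` reports it as unreadable while `readable_two_probes` does not, which is the case the description calls out (`*p==0`).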

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  Use CanSafeFetch32

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7297/files
  - new: https://git.openjdk.java.net/jdk/pull/7297/files/10dc66ed..10a24978

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7297&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7297&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7297.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7297/head:pull/7297

PR: https://git.openjdk.java.net/jdk/pull/7297

From vlivanov at openjdk.java.net  Thu Feb  3 06:52:11 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Thu, 3 Feb 2022 06:52:11 GMT
Subject: RFR: 8271055: Crash during deoptimization with
 "assert(bb->is_reachable()) failed: getting result from unreachable
 basicblock" with -XX:+VerifyStack
In-Reply-To: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
References: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
Message-ID: <-o_Q8_R9Mz3I70DbL_STe7RQfq-59Tca_Ke87HxgE8I=.e479c4bc-3166-4faf-969c-e03490479cae@github.com>

On Thu, 3 Feb 2022 04:11:38 GMT, Dean Long <dlong at openjdk.org> wrote:

> Reproduced the problem with a new JASM test rather than relying on idiosyncrasies of javac.
> The fix is to not look at the next instruction (which might be the beginning of an unreachable block) if the current instruction doesn't fall through (like "goto"!).

Looks good.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7331

From thartmann at openjdk.java.net  Thu Feb  3 07:02:12 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Thu, 3 Feb 2022 07:02:12 GMT
Subject: RFR: 8271055: Crash during deoptimization with
 "assert(bb->is_reachable()) failed: getting result from unreachable
 basicblock" with -XX:+VerifyStack
In-Reply-To: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
References: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
Message-ID: <ckbmN7iQKBWrg4CnYfX-0xzkxTX_wn1UsCsPoiGD4oQ=.34f6c76f-1906-4fd2-9e63-5d511d7b26ff@github.com>

On Thu, 3 Feb 2022 04:11:38 GMT, Dean Long <dlong at openjdk.org> wrote:

> Reproduced the problem with a new JASM test rather than relying on idiosyncrasies of javac.
> The fix is to not look at the next instruction (which might be the beginning of an unreachable block) if the current instruction doesn't fall through (like "goto"!).

Looks good to me too.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7331

From ioi.lam at oracle.com  Thu Feb  3 07:30:46 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 2 Feb 2022 23:30:46 -0800
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares causes
 underutilization
Message-ID: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>

Please see the bug report [1] for detailed description and test cases.

I'd like to have some discussion before we can decide what to do.

I discovered this issue when analyzing JDK-8279484 [2]. Under Kubernetes 
(minikube), Runtime.availableProcessors() returns 1, despite the fact 
that the machine has 32 CPUs, the Kubernetes node has a single 
deployment, and no CPU limits were set.

Specifically, I want to understand why the JDK is using 
CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the 
Java process.

In cgroup, there are other ways that are designed specifically for 
limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why is 
using cpu_quota() alone not enough? Why did we choose the current 
approach of considering both cpu_quota() and cpu_shares()?

My guess is that sometimes people don't limit the actual number of CPUs 
per container, but instead use CPU Shares to set the relative scheduling 
priority between containers.

I.e., they run "docker run --cpu-shares=1234" without using the "--cpus" 
flag.

If this is indeed the reason, I can understand the (good) intention, but 
the solution seems awfully insufficient.

CPU Shares is a *relative* number. How much CPU is allocated to you 
depends on

- how many other processes are actively running
- what their CPU Shares are

The above information can change dynamically, as other processes may be 
added or removed, and they can change between active and idle states.

However, the JVM treats CPU Shares as an *absolute/static* number, and 
sets the CPU quota of the current process using this very simplistic 
formula.

Value of /sys/fs/cgroup/cpu.shares -> cpu quota:

     1023 -> 1 CPU
     1024 -> no limit (huh??)
     2048 -> 2 CPUs
     4096 -> 4 CPUs

This seems just wrong to me. There's no way you can get a "correct" 
result without knowing anything about other processes that are running 
at the same time.
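
The mapping in the table above amounts to roughly the following. This is a sketch under stated assumptions, not the JVM source: the function name `cpus_from_shares` is hypothetical, -1 is used here to mean "no limit", and rounding up for values between the listed ones is an assumption that is consistent with the table.

```cpp
#include <cassert>

// Returns the CPU count derived from cpu.shares, or -1 for "no limit".
// PER_CPU_SHARES = 1024 is taken from the table; everything else is illustrative.
int cpus_from_shares(int shares) {
  const int PER_CPU_SHARES = 1024;
  assert(shares > 0);
  if (shares == PER_CPU_SHARES) {
    return -1;  // exactly 1024 is treated as "unset"
  }
  // Round up, so any value below 1024 maps to a single CPU.
  return (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES;
}
```

Note that the result depends only on the container's own shares value; nothing about the host's CPU count or other running containers enters the calculation.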

The net effect is that when Java is running under a container, more likely 
than not, the JVM will limit itself to a single CPU. This seems really 
inefficient to me.

What should we do?

Thanks
- Ioi

[1] https://bugs.openjdk.java.net/browse/JDK-8281181
[2] https://bugs.openjdk.java.net/browse/JDK-8279484

From iklam at openjdk.java.net  Thu Feb  3 07:41:07 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 3 Feb 2022 07:41:07 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>
Message-ID: <NzAPUcW3mH4jrB21HsLDe4MivQz1mA1MGqqvPHoaFvE=.ce478753-7930-4da1-aa38-54ac0b066e59@github.com>

On Thu, 3 Feb 2022 05:37:45 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
>> 
>> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
>> 
>> 
>> (gdb) call pp(0x7ffff010b030)
>> 
>> "Executing pp"
>> 
>> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
>> 180 assert(addr != NULL, "Invalid address");
>> 
>> 
>> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
>> 
>> 
>> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
>> 
>> 
>> This patch:
>> - fixes uses of SafeFetch
>> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
>> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
>> 
>> Tests:
>> - I tested manually that pp works with different levels of NMT (Linux x64)
>> - GHAs in process
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use CanSafeFetch32

Marked as reviewed by iklam (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From iklam at openjdk.java.net  Thu Feb  3 07:41:08 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 3 Feb 2022 07:41:08 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <02M2JuFVSa-pZX75zgzYLEpBO7A_kihJZeSws5lrMCc=.e0f89c55-6f74-42b3-9307-d00449cd0730@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <tPpDiCW8Lne_aeFrYinpQUMXj7rtlfdnDJQALim1WlA=.69ccba88-a8a8-44eb-af97-0370fedeb32d@github.com>
 <02M2JuFVSa-pZX75zgzYLEpBO7A_kihJZeSws5lrMCc=.e0f89c55-6f74-42b3-9307-d00449cd0730@github.com>
Message-ID: <dSdAX1UWWudgBrrQttgwmp6C8PPrTfWGnaGcMVD_QUQ=.4ee87e67-ab4f-4c3b-85bb-32e6497ef506@github.com>

On Wed, 2 Feb 2022 08:24:10 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Note that I think @iklam was overcautious in the original discussion. We already risk signals in the debugging session by using SafeFetch to read the malloc header. Using a reference to a potentially dead ReservedMemoryRegion runs an additional - very low - risk of signals. So, if we really want to be cautious, we should not print out NMT information at all in pp. But I think NMT is very useful and worth the risk. All we risk is a slightly miffed debugger.

I wasn't trying to make life unnecessarily difficult, and I agree that printing out the NMT info is a great idea.

All I was suggesting was -- if there's a less risky way to do the printing and that's not overly complicated, we should do it that way. And the new code looks good to me.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From stuefe at openjdk.java.net  Thu Feb  3 07:48:08 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 3 Feb 2022 07:48:08 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <dSdAX1UWWudgBrrQttgwmp6C8PPrTfWGnaGcMVD_QUQ=.4ee87e67-ab4f-4c3b-85bb-32e6497ef506@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <tPpDiCW8Lne_aeFrYinpQUMXj7rtlfdnDJQALim1WlA=.69ccba88-a8a8-44eb-af97-0370fedeb32d@github.com>
 <02M2JuFVSa-pZX75zgzYLEpBO7A_kihJZeSws5lrMCc=.e0f89c55-6f74-42b3-9307-d00449cd0730@github.com>
 <dSdAX1UWWudgBrrQttgwmp6C8PPrTfWGnaGcMVD_QUQ=.4ee87e67-ab4f-4c3b-85bb-32e6497ef506@github.com>
Message-ID: <Jc2JDkRFGue6bKqkZLJ4aaI8h9DaDcMEEHzQVp59lb0=.58f7c84d-2261-4299-98d6-e08261aa1baf@github.com>

On Thu, 3 Feb 2022 07:37:40 GMT, Ioi Lam <iklam at openjdk.org> wrote:

> > Note that I think @iklam was overcautious in the original discussion. We already risk signals in the debugging session by using SafeFetch to read the malloc header. Using a reference to a potentially dead ReservedMemoryRegion runs an additional - very low - risk of signals. So, if we really want to be cautious, we should not print out NMT information at all in pp. But I think NMT is very useful and worth the risk. All we risk is a slightly miffed debugger.
> 
> I wasn't trying to make life unnecessarily difficult, and I agree that printing out the NMT info is a great idea.
> 
> All I was suggesting was -- if there's a less risky way to do the printing and that's not overly complicated, we should do it that way. And the new code looks good to me.

Thank you for the review!

If pp() were used outside of debugging, I'd agree with your original estimate.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From david.holmes at oracle.com  Thu Feb  3 09:19:10 2022
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 3 Feb 2022 19:19:10 +1000
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
Message-ID: <44ce9669-71cc-0c20-ecbf-265845626820@oracle.com>

Hi Ioi,

For the benefit of the mailing list discussion ...

On 3/02/2022 5:30 pm, Ioi Lam wrote:
> Please see the bug report [1] for detailed description and test cases.
> 
> I'd like to have some discussion before we can decide what to do.
> 
> I discovered this issue when analyzing JDK-8279484 [2]. Under Kubernetes 
> (minikube), Runtime.availableProcessors() returns 1, despite the fact 
> that the machine has 32 CPUs, the Kubernetes node has a single 
> deployment, and no CPU limits were set.
> 
> Specifically, I want to understand why the JDK is using 
> CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the 
> Java process.

Because we were asked to by customers deploying in containers.

> In cgroup, there are other ways that are designed specifically for 
> limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why is 
> using cpu_quota() alone not enough? Why did we choose the current 
> approach of considering both cpu_quota() and cpu_shares()?

Because people were using both (whether that made sense or not) and so 
we needed a policy on what to do if both were set.

> My guess is that sometimes people don't limit the actual number of CPUs 
> per container, but instead use CPU Shares to set the relative scheduling 
> priority between containers.
> 
> I.e., they run "docker run --cpu-shares=1234" without using the "--cpus" 
> flag.
> 
> If this is indeed the reason, I can understand the (good) intention, but 
> the solution seems awfully insufficient.
> 
> CPU Shares is a *relative* number. How much CPU is allocated to you 
> depends on
> 
> - how many other processes are actively running
> - what their CPU Shares are
> 
> The above information can change dynamically, as other processes may be 
> added or removed, and they can change between active and idle states.
> 
> However, the JVM treats CPU Shares as an *absolute/static* number, and 
> sets the CPU quota of the current process using this very simplistic 
> formula.

 From old discussion and the code I believe the thought was that share 
was relative to the per-cpu default shares of 1024. So we use that 
to determine the fraction of each CPU that should be assigned, and we 
should then use that to determine the available number of CPUs. But that 
isn't what we actually do - we only calculate the fraction and round it 
up to get the number of CPUs and that is wrong (and typically only gives 
1 cpu because shares < 1024). I speculate that what was intended was to 
map from having an X% share of each CPU, to instead having access to X% 
of the total CPUs (at 100% of each). Mathematically this has some basis 
but it actually makes no practical sense from a throughput or response 
time perspective. If I'm allowed 50% of the CPU per time period to do my 
calculations, I want 100% of each CPU for half of the period as that 
potentially minimises the elapsed time till I have a result.

> Value of /sys/fs/cgroup/cpu.shares -> cpu quota:
> 
>      1023 -> 1 CPU
>      1024 -> no limit (huh??)
>      2048 -> 2 CPUs
>      4096 -> 4 CPUs
> 
> This seems just wrong to me. There's no way you can get a "correct" 
> result without knowing anything about other processes that are running 
> at the same time.

As I said above and in the bug report I think this was an error and the 
intent was to then multiply by the number of actual processors.
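
One reading of that intent, as a sketch only: take the share fraction relative to 1024 and scale it by the host CPU count. The helper name `cpus_from_shares_scaled` is hypothetical, and clamping the result to the range [1, host_cpus] is an assumption added for this example; neither is taken from the JDK source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Scale the per-CPU share fraction by the number of host CPUs.
int cpus_from_shares_scaled(int shares, int host_cpus) {
  assert(shares > 0 && host_cpus > 0);
  const double fraction = static_cast<double>(shares) / 1024.0;
  const int cpus = static_cast<int>(std::ceil(fraction * host_cpus));
  // Assumption: never report fewer than 1 CPU or more than the host has.
  return std::min(std::max(cpus, 1), host_cpus);
}
```

Under this reading, shares of 512 on a 32-CPU host would yield 16 CPUs instead of 1.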

> The net effect is that when Java is running under a container, more likely 
> than not, the JVM will limit itself to a single CPU. This seems really 
> inefficient to me.

Yes.

> What should we do?

We could just adjust the calculation as I suggested.

Or, given that share aka weight is meaningless without knowing the total 
weight in the system we could just ignore it. The app then gets access 
to all cpu's and it is up to the container to track actual usage and 
impose any limits configured.

I've always thought that these cgroups mechanisms were fundamentally 
flawed and that if the intent was to define a resource limited 
environment, then the environment should report what resources were 
available by the normal APIs. They got this right with cpu-sets by 
integrating with sched_getaffinity; but for shares and quotas it has 
been left to the applications to try and figure out what that should 
mean - and that makes no sense to me.

Cheers,
David

> Thanks
> - Ioi
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8281181
> [2] https://bugs.openjdk.java.net/browse/JDK-8279484

From mdoerr at openjdk.java.net  Thu Feb  3 10:16:45 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 3 Feb 2022 10:16:45 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v3]
In-Reply-To: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
Message-ID: <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>

> PPC64 implementation of JDK-8277180.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  Update Copyright years.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7305/files
  - new: https://git.openjdk.java.net/jdk/pull/7305/files/1eec2373..88861201

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7305&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7305&range=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7305.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7305/head:pull/7305

PR: https://git.openjdk.java.net/jdk/pull/7305

From duke at openjdk.java.net  Thu Feb  3 10:35:11 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Thu, 3 Feb 2022 10:35:11 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v8]
In-Reply-To: <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
Message-ID: <F3oj2EzJkIYMV0QWwwydcMHwkB5E4WJGsdjd9FDjYAQ=.4041a0bf-9ee7-4806-82eb-9d1c29bf9f4e@github.com>

On Thu, 3 Feb 2022 00:03:53 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix JVMCIEnv::get_long_at

lgtm

-------------

Marked as reviewed by eastig at github.com (no known OpenJDK username).

PR: https://git.openjdk.java.net/jdk/pull/7294

From sgehwolf at redhat.com  Thu Feb  3 11:29:46 2022
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 03 Feb 2022 12:29:46 +0100
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
Message-ID: <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>

Hi Ioi,

On Wed, 2022-02-02 at 23:30 -0800, Ioi Lam wrote:
> Please see the bug report [1] for detailed description and test cases.
> 
> I'd like to have some discussion before we can decide what to do.
> 
> I discovered this issue when analyzing JDK-8279484 [2]. Under Kubernetes 
> (minikube), Runtime.availableProcessors() returns 1, despite the fact
> that the machine has 32 CPUs, the Kubernetes node has a single 
> deployment, and no CPU limits were set.

Looking at the bug, it would be good to know why a cpu.weight value
of 1 is being observed. The default is 100. I.e. if it is really unset:

$ sudo docker run --rm -v $(pwd)/jdk17:/opt/jdk:z fedora:35 /opt/jdk/bin/java -Xlog:os+container=trace --version
[0.000s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected cgroups v2 unified hierarchy
[0.001s][trace][os,container] Path to /memory.max is /sys/fs/cgroup//memory.max
[0.001s][trace][os,container] Raw value for memory limit is: max
[0.001s][trace][os,container] Memory Limit is: Unlimited
[0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup//cpu.max
[0.001s][trace][os,container] Raw value for CPU quota is: max
[0.001s][trace][os,container] CPU Quota is: -1
[0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup//cpu.max
[0.001s][trace][os,container] CPU Period is: 100000
[0.001s][trace][os,container] Path to /cpu.weight is /sys/fs/cgroup//cpu.weight
[0.001s][trace][os,container] Raw value for CPU shares is: 100
[0.001s][debug][os,container] CPU Shares is: -1
[0.001s][trace][os,container] OSContainer::active_processor_count: 4
[0.001s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
[0.001s][debug][os,container] container memory limit unlimited: -1, using host value
[0.001s][debug][os,container] container memory limit unlimited: -1, using host value
[0.002s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
[0.007s][debug][os,container] container memory limit unlimited: -1, using host value
[0.014s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
[0.022s][trace][os,container] Path to /memory.max is /sys/fs/cgroup//memory.max
[0.022s][trace][os,container] Raw value for memory limit is: max
[0.022s][trace][os,container] Memory Limit is: Unlimited
[0.022s][debug][os,container] container memory limit unlimited: -1, using host value
openjdk 17.0.2-internal 2022-01-18
OpenJDK Runtime Environment (build 17.0.2-internal+0-adhoc.sgehwolf.jdk17u)
OpenJDK 64-Bit Server VM (build 17.0.2-internal+0-adhoc.sgehwolf.jdk17u, mixed mode, sharing)

> Specifically, I want to understand why the JDK is using 
> CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the
> Java process.

TLDR: Kubernetes and/or other container orchestration frameworks? That
was back in the day of cgroups v1, though.

> In cgroup, there are other ways that are designed specifically for 
> limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why is 
> using cpu_quota() alone not enough? Why did we choose the current 
> approach of considering both cpu_quota() and cpu_shares()?

Kubernetes has a concept of "cpu requests" and "cpu limit". It maps (or
mapped?) those values to cpu shares and cpu quota in cgroups.

> My guess is that sometimes people don't limit the actual number of CPUs 
> per container, but instead use CPU Shares to set the relative scheduling 
> priority between containers.
> 
> I.e., they run "docker run --cpu-shares=1234" without using the "--cpus" 
> flag.
> 
> If this is indeed the reason, I can understand the (good) intention, but 
> the solution seems awfully insufficient.
> 
> CPU Shares is a *relative* number. How much CPU is allocated to you 
> depends on
> 
> - how many other processes are actively running
> - what their CPU Shares are
> 
> The above information can change dynamically, as other processes may be 
> added or removed, and they can change between active and idle states.
> 
> However, the JVM treats CPU Shares as an *absolute/static* number, and 
> sets the CPU quota of the current process using this very simplistic 
> formula.
> 
> Value of /sys/fs/cgroup/cpu.shares -> cpu quota:
> 
>      1023 -> 1 CPU
>      1024 -> no limit (huh??)
>      2048 -> 2 CPUs
>      4096 -> 4 CPUs
> 
> This seems just wrong to me. There's no way you can get a "correct" 
> result without knowing anything about other processes that are running 
> at the same time.
> 
> The net effect is that when Java is running under a container, more likely
> than not, the JVM will limit itself to a single CPU. This seems really 
> inefficient to me.

I believe the point is that popular container orchestration frameworks
use the cpu requests feature to map to cpu.shares. A similar question
regarding this was asked by myself a while ago. See JDK-8216366.

Here is what Bob Vandette had to say at the time:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036093.html

Thanks,
Severin

> 
> What should we do?
> 
> Thanks
> - Ioi
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8281181
> [2] https://bugs.openjdk.java.net/browse/JDK-8279484
> 


From duke at openjdk.java.net  Thu Feb  3 11:34:11 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Thu, 3 Feb 2022 11:34:11 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent.
In-Reply-To: <2c0f5a4e-6c60-100b-f0d3-a0a047926aef@oracle.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <2c0f5a4e-6c60-100b-f0d3-a0a047926aef@oracle.com>
Message-ID: <4FQ-YEzWaDvXTIqe6IYjQUlJCSS7vaoywDQfu7DNX0Y=.3de04253-c875-4bb4-9238-07d48d713183@github.com>

On Wed, 2 Feb 2022 22:16:31 GMT, David Holmes <david.holmes at oracle.com> wrote:

> > Other missed places to change:
> > jvmci/jvmciEnv.hpp:  long get_long_at(JVMCIPrimitiveArray array, int index);
> > services/memReporter.hpp:  inline long diff_in_current_scale(size_t s1, size_t s2) const {
> > services/memReporter.hpp:    long amount = (long)(s1 - s2);
> > services/memReporter.hpp:    long scale = (long)_scale;
> > services/memReporter.cpp:  long amount_diff = diff_in_current_scale(current_amount, early_amount);
> > services/memReporter.cpp:  long reserved_diff = diff_in_current_scale(current_reserved, early_reserved);
> > services/memReporter.cpp:  long committed_diff = diff_in_current_scale(current_committed, early_committed);
> > services/memReporter.cpp:      long overhead_diff = diff_in_current_scale(_current_baseline.malloc_tracking_overhead(),
> > services/memReporter.cpp:  long diff_used = diff_in_current_scale(current_stats.used(),
> > services/memReporter.cpp:  long diff_waste = diff_in_current_scale(current_waste, early_waste);
> > runtime/vmThread.cpp:  long interval_ms = SafepointTracing::time_since_last_safepoint_ms();
> 
> Other than jvmci these are not "compiler shared code" - other cleanups in other areas will need their own RFE.
> 
> Cheers, David

Created:
https://bugs.openjdk.java.net/browse/JDK-8281213
https://bugs.openjdk.java.net/browse/JDK-8281214

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From duke at openjdk.java.net  Thu Feb  3 12:14:18 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 3 Feb 2022 12:14:18 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v17]
In-Reply-To: <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
Message-ID: <HaDwxAQNR0HLLvJCkKM0opQ85gzYUtSTZPOWjdDzddY=.53a2527b-6b8f-43bb-b4bf-2e328e313384@github.com>

On Wed, 2 Feb 2022 16:03:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update copyrights to 2022

As mentioned on the CSR, the JEP is being dropped - unless anyone has any objections.
JDK-8277204 will become a normal RFE.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Thu Feb  3 13:05:12 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 3 Feb 2022 13:05:12 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v17]
In-Reply-To: <HaDwxAQNR0HLLvJCkKM0opQ85gzYUtSTZPOWjdDzddY=.53a2527b-6b8f-43bb-b4bf-2e328e313384@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
 <HaDwxAQNR0HLLvJCkKM0opQ85gzYUtSTZPOWjdDzddY=.53a2527b-6b8f-43bb-b4bf-2e328e313384@github.com>
Message-ID: <_Bbro08HLFKOtrG9jBdy9s3W6FOgeqZkh0_Sttkm8EM=.a7080cfc-2c89-434b-9898-4f9e1ceb4817@github.com>

On Thu, 3 Feb 2022 12:11:16 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> As mentioned on the CSR, the JEP is being dropped - unless anyone has any objections. JDK-8277204 will become a normal RFE.

Good decision.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From mdoerr at openjdk.java.net  Thu Feb  3 13:16:05 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 3 Feb 2022 13:16:05 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v3]
In-Reply-To: <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
 <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
Message-ID: <87IrdT-fI5RIWDXfr5YY5lZ27U1v9XT30A-moZz3Mn4=.22447a22-affa-48c8-b947-a787f6570bcd@github.com>

On Thu, 3 Feb 2022 10:16:45 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of JDK-8277180.
>> 
>> `java -Xms4g -Xmx4g -jar dacapo-9.12-bach.jar h2 -s huge -t 1 -n 1 --max-iterations=35 --variance=5 --verbose --converge`
>> 
>> Before this patch (2 runs):
>> `===== DaCapo 9.12 h2 PASSED in 309753 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 300755 msec =====`
>> 
>> After:
>> `===== DaCapo 9.12 h2 PASSED in 285144 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 288255 msec =====`
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update Copyright years.

Thanks for the review! Copyright updated and benchmark results added above.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7305

From zgu at openjdk.java.net  Thu Feb  3 13:22:07 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 3 Feb 2022 13:22:07 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>
Message-ID: <3-lNY1lndmGQSA0Lo4d4mWVqOXnKHw7rReENOOBHzZs=.3d95213a-4a75-4b80-adff-151de90b3339@github.com>

On Thu, 3 Feb 2022 05:37:45 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
>> 
>> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
>> 
>> 
>> (gdb) call pp(0x7ffff010b030)
>> 
>> "Executing pp"
>> 
>> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
>> 180 assert(addr != NULL, "Invalid address");
>> 
>> 
>> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
>> 
>> 
>> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
>> 
>> 
>> This patch:
>> - fixes uses of SafeFetch
>> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
>> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
>> 
>> Tests:
>> - I tested manually that pp works with different levels of NMT (Linux x64)
>> - GHAs in process
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use CanSafeFetch32

Marked as reviewed by zgu (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From rrich at openjdk.java.net  Thu Feb  3 14:05:10 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 3 Feb 2022 14:05:10 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v3]
In-Reply-To: <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
 <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
Message-ID: <nnOfME-8plh6GRuzvlS4KzfN0U4JO5v36msZvCF7SuA=.23b2a876-f76c-481e-839b-6844361e8fca@github.com>

On Thu, 3 Feb 2022 10:16:45 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of JDK-8277180.
>> 
>> `java -Xms4g -Xmx4g -jar dacapo-9.12-bach.jar h2 -s huge -t 1 -n 1 --max-iterations=35 --variance=5 --verbose --converge`
>> 
>> Before this patch (2 runs):
>> `===== DaCapo 9.12 h2 PASSED in 309753 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 300755 msec =====`
>> 
>> After:
>> `===== DaCapo 9.12 h2 PASSED in 285144 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 288255 msec =====`
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update Copyright years.

Looks good.

Thanks, Richard.

-------------

Marked as reviewed by rrich (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7305

From stuefe at openjdk.java.net  Thu Feb  3 14:15:15 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 3 Feb 2022 14:15:15 GMT
Subject: RFR: JDK-8281023: NMT integration into pp debug command does not
 work [v3]
In-Reply-To: <3-lNY1lndmGQSA0Lo4d4mWVqOXnKHw7rReENOOBHzZs=.3d95213a-4a75-4b80-adff-151de90b3339@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
 <H7ccLquGF86Bte9vH4-D3ZtNkz31w65TxZ7Z3hTLPIY=.ac1473ea-a2b7-43d9-bec0-076c6ac44610@github.com>
 <3-lNY1lndmGQSA0Lo4d4mWVqOXnKHw7rReENOOBHzZs=.3d95213a-4a75-4b80-adff-151de90b3339@github.com>
Message-ID: <K9aJygh61WusjmkbTTk9wWktFcQWo5NgNDx4iF0gcWI=.8b9dd6b0-d1f3-4374-bba7-2e53dadbf1ae@github.com>

On Thu, 3 Feb 2022 13:18:49 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Use CanSafeFetch32
>
> Marked as reviewed by zgu (Reviewer).

Thanks @zhengyu123 and @iklam.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From stuefe at openjdk.java.net  Thu Feb  3 14:15:15 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 3 Feb 2022 14:15:15 GMT
Subject: Integrated: JDK-8281023: NMT integration into pp debug command does
 not work
In-Reply-To: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
References: <F6QOUZoDf81M8Oh3BtaSBz1lZu6daKURnfgC91qPky0=.176b4031-048a-4b85-930e-e92cd66b8137@github.com>
Message-ID: <0BG4ktv-tf6tZ2V4J7aM-2n405mpnTugPT3PqK3uE9o=.fe33f34f-e79d-4151-8a53-ec93e1fd99eb@github.com>

On Tue, 1 Feb 2022 08:36:42 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> JDK-8280289 enhanced the debug pp() command to use NMT if enabled, and to print NMT related info. That is useful, but there are some issues.
> 
> On debug, it just asserts, since the empty reserved region we create to hold the output of the mmap-search is created with address=NULL:
> 
> 
> (gdb) call pp(0x7ffff010b030)
> 
> "Executing pp"
> 
> Thread 2 "java" received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6721a71 in VirtualMemoryRegion::VirtualMemoryRegion (this=this at entry=0x7ffff5bb2620, addr=addr at entry=0x0, size=size at entry=0) at /shared/projects/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.hpp:180
> 180 assert(addr != NULL, "Invalid address");
> 
> 
> On release we don't assert and get further, but the use of SafeFetch is slightly wrong. It will deny us any NMT data about p if *p==0:
> 
> 
> if (CanUseSafeFetchN() && SafeFetchN((intptr_t*)p, 0) != 0) {
> 
> 
> This patch:
> - fixes uses of SafeFetch
> - changes the mmap-region-search-code to not require an empty ReservedMemoryRegion in order to avoid triggering the assert in virtualMemoryTracker.hpp:180
> - adds a comment about the safe use of pp() in gdb (one needs to switch off signal handling of SIGSEGV for this to work)
> 
> Tests:
> - I tested manually that pp works with different levels of NMT (Linux x64)
> - GHAs in process
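
The SafeFetch pitfall described above can be sketched in isolation. This is a minimal illustration, not the code from the patch: `mock_safe_fetch32` is a hypothetical stand-in for HotSpot's SafeFetch32 that models "unreadable" memory as a null pointer only, and both sentinel values are arbitrary assumptions. Probing with a single error value of 0 cannot tell a fault apart from a readable word that happens to contain 0; probing twice with two different sentinels can.

```cpp
#include <cstdint>

// Hypothetical stand-in for SafeFetch32: returns *p when p is "readable",
// otherwise the caller-supplied error value. Unreadable memory is modeled
// here as a null pointer only; the real function catches the fault instead.
static int32_t mock_safe_fetch32(const int32_t* p, int32_t err_value) {
  return p != nullptr ? *p : err_value;
}

// Flawed check: a readable location that contains 0 is reported as unreadable.
static bool is_readable_flawed(const int32_t* p) {
  return mock_safe_fetch32(p, 0) != 0;
}

// Probe twice with two different sentinels (arbitrary values). The location
// is treated as unreadable only if both probes return their own sentinel,
// which cannot happen for readable memory because *p cannot equal both.
static bool is_readable(const int32_t* p) {
  const int32_t sentinel_a = 0x12345678;
  const int32_t sentinel_b = 0x0BADF00D;
  return mock_safe_fetch32(p, sentinel_a) != sentinel_a ||
         mock_safe_fetch32(p, sentinel_b) != sentinel_b;
}
```

With `int32_t zero = 0;`, `is_readable_flawed(&zero)` is false even though the word is readable, while `is_readable(&zero)` is true and `is_readable(nullptr)` is false.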

This pull request has now been integrated.

Changeset: 010965c8
Author:    Thomas Stuefe <stuefe at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/010965c86ab39260b882df807c4f5d6420b20ca9
Stats:     97 lines in 5 files changed: 54 ins; 25 del; 18 mod

8281023: NMT integration into pp debug command does not work

Reviewed-by: zgu, iklam

-------------

PR: https://git.openjdk.java.net/jdk/pull/7297

From prappo at openjdk.java.net  Thu Feb  3 14:58:12 2022
From: prappo at openjdk.java.net (Pavel Rappo)
Date: Thu, 3 Feb 2022 14:58:12 GMT
Subject: Integrated: 8281057: Fix doc references to overriding in JLS
In-Reply-To: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
References: <eTfoamYgcWCfM1yA45i3JtSQmJsUpmUE1EAnMEhe1C4=.027df332-9792-4622-9ea6-4ea8244d731c@github.com>
Message-ID: <pu7_A1KlRaS_vbJFVDHmA-IoeVwGvkO3wd3jKVvmcpg=.3d192c8d-7e17-4da1-8f87-8ebcbc4f6e0a@github.com>

On Tue, 1 Feb 2022 16:19:01 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

> While looking into guts of javadoc comment inheritance, I noticed that a number of places in JDK seem to confuse JLS 8.4.6.** with JLS 8.4.8.**.
> 
> Granted, "8.4.6 Method Throws" tangentially addresses overriding. However, I believe that the real target should be "8.4.8. Inheritance, Overriding, and Hiding" and its subsections.

This pull request has now been integrated.

Changeset: 1f926609
Author:    Pavel Rappo <prappo at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/1f926609372c9b80dde831a014310a3729768c92
Stats:     18 lines in 5 files changed: 0 ins; 0 del; 18 mod

8281057: Fix doc references to overriding in JLS

Reviewed-by: darcy, iris, dholmes, cjplummer

-------------

PR: https://git.openjdk.java.net/jdk/pull/7311

From duke at openjdk.java.net  Thu Feb  3 16:51:50 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 3 Feb 2022 16:51:50 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v17]
In-Reply-To: <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
Message-ID: <Dok0bE7QDan5hdeiCnuctzvNFc4KC1bAw7uegPTpb7s=.92e247fe-ec36-4a1c-9055-7d1986514d86@github.com>

On Wed, 2 Feb 2022 16:03:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs
>> and entry points are stored in memory as signed pointers. These changes
>> are simple to make (they can be reduced to a type change in common code
>> and a few additional sign/auth calls in the backend), but there are a lot
>> of them and the total code change is fairly large. I'm happy to provide
>> a few work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update copyrights to 2022

As requested in the RFE, added the new flag to the man page. Also updated the building.md instructions.

However, I'm not sure how to add to the release notes - I can't find any files or a process.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Thu Feb  3 16:51:48 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 3 Feb 2022 16:51:48 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v18]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs
> and entry points are stored in memory as signed pointers. These changes
> are simple to make (they can be reduced to a type change in common code
> and a few additional sign/auth calls in the backend), but there are a lot
> of them and the total code change is fairly large. I'm happy to provide
> a few work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.
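
The sign-on-store, authenticate-on-load idea in the description can be illustrated with a toy software model. Everything here is an assumption made for illustration: the 48-bit address width, the 16-bit code kept in the upper pointer bits, the mixing function and the function names. Real PAC computes the code in hardware with keys that user code cannot read, so this sketch only shows the control flow, not the security properties.

```cpp
#include <cstdint>

// Assumed layout for this toy model: low 48 bits hold the address,
// high 16 bits hold the authentication code.
constexpr uint64_t kAddrMask = (1ULL << 48) - 1;

// Toy keyed mixing function. NOT a real MAC; it only makes the code depend
// on the address, the modifier (here the stack pointer) and the key.
static uint64_t toy_code(uint64_t addr, uint64_t modifier, uint64_t key) {
  uint64_t x = (addr ^ key) * 0x9E3779B97F4A7C15ULL;
  x ^= modifier + (x >> 29);
  x *= 0xBF58476D1CE4E5B9ULL;
  return x >> 48;  // keep 16 bits
}

// "Sign" a return address before it is stored on the stack.
static uint64_t sign_return_address(uint64_t lr, uint64_t sp, uint64_t key) {
  const uint64_t addr = lr & kAddrMask;
  return addr | (toy_code(addr, sp, key) << 48);
}

// "Authenticate" a value loaded back from the stack. Returns false if the
// stored code does not match; hardware would fault on the return instead.
static bool authenticate_return_address(uint64_t signed_lr, uint64_t sp,
                                        uint64_t key, uint64_t* lr_out) {
  const uint64_t addr = signed_lr & kAddrMask;
  if ((signed_lr >> 48) != toy_code(addr, sp, key)) {
    return false;
  }
  *lr_out = addr;
  return true;
}
```

A value produced by `sign_return_address` authenticates with the same stack pointer and key, whereas a value whose code bits were altered on the stack is rejected.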

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Documentation updates

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/6255d4c8..d97883b5

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=17
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=16-17

  Stats: 34 lines in 2 files changed: 33 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From lucy at openjdk.java.net  Thu Feb  3 17:42:09 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Thu, 3 Feb 2022 17:42:09 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v3]
In-Reply-To: <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
 <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
Message-ID: <0UtRVkT3nknIO6XWwzOhMs1SSZPNHfyeaNGkWY8qQkE=.6addc46e-77f2-4b19-8390-bc0497092a53@github.com>

On Thu, 3 Feb 2022 10:16:45 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of JDK-8277180.
>> 
>> `java -Xms4g -Xmx4g -jar dacapo-9.12-bach.jar h2 -s huge -t 1 -n 1 --max-iterations=35 --variance=5 --verbose --converge`
>> 
>> Before this patch (2 runs):
>> `===== DaCapo 9.12 h2 PASSED in 309753 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 300755 msec =====`
>> 
>> After:
>> `===== DaCapo 9.12 h2 PASSED in 285144 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 288255 msec =====`
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update Copyright years.

Changes look good to me.
Nice performance gain for such a small change!

-------------

Marked as reviewed by lucy (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7305

From minqi at openjdk.java.net  Thu Feb  3 18:07:12 2022
From: minqi at openjdk.java.net (Yumin Qi)
Date: Thu, 3 Feb 2022 18:07:12 GMT
Subject: Integrated: 8278753: Runtime crashes with access violation during
 JNI_CreateJavaVM call
In-Reply-To: <VDKQApKfLuLb84ncVs16N5X8qPv6D3zksRpL7S755p4=.776bb459-8f11-4994-97f5-c0240cf22828@github.com>
References: <VDKQApKfLuLb84ncVs16N5X8qPv6D3zksRpL7S755p4=.776bb459-8f11-4994-97f5-c0240cf22828@github.com>
Message-ID: <3dNFLZSlmP3oTlqhbEvQrTbVPUmh43zxnRHqXoUxCR8=.da8cdf20-6b08-4c58-ba75-cb4981cd80ad@github.com>

On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi <minqi at openjdk.org> wrote:

> Please review,
>   When jlink is run with --compress=2, zip is used to compress the files while copying. The user's case failed to load zip.dll because zip.dll is not on the PATH. The failure occurs after GetModuleHandle("zip.dll") returns NULL; the subsequent LoadLibrary("zip.dll") then fails in the same way.
>   The fix is to call load_zip_library of ClassLoader first: if the zip library is already loaded, it just returns the cached handle for subsequent use; if not, it loads the zip library and caches the handle.
> 
>   Tests: tier1,4,7 in test
>    Manually tested user case, and checked output of jimage list <modules> for jlinked files using --compress=2.
> 
> Thanks
> Yumin
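
The load-once-and-cache behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: `fake_os_load`, `get_zip_handle`, the handle type and the globals are hypothetical names, not HotSpot's or the Win32 API, and the sketch ignores thread safety.

```cpp
#include <string>

// Hypothetical handle type and bookkeeping for this sketch.
using LibHandle = const void*;
static int g_load_calls = 0;               // counts calls to the fake loader
static LibHandle g_cached_zip = nullptr;   // cached handle, null until loaded

// Hypothetical stand-in for the OS loader: succeeds for any non-empty path.
static LibHandle fake_os_load(const std::string& path) {
  static const int token = 0;
  ++g_load_calls;
  return path.empty() ? nullptr : &token;
}

// Return the cached handle if the library was loaded before; otherwise
// load it once and remember the handle for later callers.
static LibHandle get_zip_handle(const std::string& path) {
  if (g_cached_zip == nullptr) {
    g_cached_zip = fake_os_load(path);
  }
  return g_cached_zip;
}
```

Calling `get_zip_handle` twice with the same path returns the same handle and invokes the loader only once, so later callers never need to search the PATH again.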

This pull request has now been integrated.

Changeset: cda9c301
Author:    Yumin Qi <minqi at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/cda9c3011beeec8df68e78e096132e712255ce1b
Stats:     49 lines in 6 files changed: 18 ins; 14 del; 17 mod

8278753: Runtime crashes with access violation during JNI_CreateJavaVM call

Reviewed-by: dholmes, stuefe

-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From duke at openjdk.java.net  Thu Feb  3 19:38:18 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Thu, 3 Feb 2022 19:38:18 GMT
Subject: Integrated: 8251505: Use of types in compiler shared code should be
 consistent.
In-Reply-To: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
Message-ID: <fYpwXe2hq98P1F3joiZWcnzM7BFDm90qfVQcR4WFlGY=.96e6087c-3322-412e-89b5-f6fa50859650@github.com>

On Tue, 1 Feb 2022 03:35:13 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

> 8251505: Use of types in compiler shared code should be consistent.

This pull request has now been integrated.

Changeset: b6935dfb
Author:    Yi-Fan Tsai <yftsai at amazon.com>
Committer: Paul Hohensee <phh at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/b6935dfb86a1c011355d2dfb2140be26ec536351
Stats:     33 lines in 10 files changed: 2 ins; 0 del; 31 mod

8251505: Use of types in compiler shared code should be consistent.

Reviewed-by: phh

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From lucy at openjdk.java.net  Thu Feb  3 21:43:07 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Thu, 3 Feb 2022 21:43:07 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <4a2kmV7FQ6RflwAUiV4gyzghvRreLpLiF2DranH1LJI=.1032f99a-08eb-44b9-b932-34b68e745ac4@github.com>

On Tue, 1 Feb 2022 17:22:57 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

Changes look good to me. Is there a chance to "officially" run some JFR jtreg tests?

-------------

Changes requested by lucy (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7312

From dlong at openjdk.java.net  Thu Feb  3 22:01:10 2022
From: dlong at openjdk.java.net (Dean Long)
Date: Thu, 3 Feb 2022 22:01:10 GMT
Subject: RFR: 8271055: Crash during deoptimization with
 "assert(bb->is_reachable()) failed: getting result from unreachable
 basicblock" with -XX:+VerifyStack
In-Reply-To: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
References: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
Message-ID: <Bd3HCCm5swVb85OU-l0PyEFJ-gceUmlYe06s78HVrDo=.0f5efb06-fc5d-4221-bd1d-952b0185082b@github.com>

On Thu, 3 Feb 2022 04:11:38 GMT, Dean Long <dlong at openjdk.org> wrote:

> Reproduced the problem with a new JASM test rather than relying on idiosyncrasies of javac.
> The fix is to not look at the next instruction (which might be the beginning of an unreachable block) if the current instruction doesn't fall through (like "goto"!).

Thanks Tobias and Vladimir!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7331

From dlong at openjdk.java.net  Thu Feb  3 22:14:18 2022
From: dlong at openjdk.java.net (Dean Long)
Date: Thu, 3 Feb 2022 22:14:18 GMT
Subject: Integrated: 8271055: Crash during deoptimization with
 "assert(bb->is_reachable()) failed: getting result from unreachable
 basicblock" with -XX:+VerifyStack
In-Reply-To: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
References: <VbH0r2-S-sOJIhbh5fejoiIEM3OGLTIoOzsJHgdOPqs=.800bd723-7054-4a14-9c50-646c3a238354@github.com>
Message-ID: <Vw4T047-tcFlEGHmjtyvGa1KBfOos4w29jj1i7ej-3I=.991bbe7e-244b-487d-8019-5139ee5ebc7b@github.com>

On Thu, 3 Feb 2022 04:11:38 GMT, Dean Long <dlong at openjdk.org> wrote:

> Reproduced the problem with a new JASM test rather than relying on idiosyncrasies of javac.
> The fix is to not look at the next instruction (which might be the beginning of an unreachable block) if the current instruction doesn't fall through (like "goto"!).
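
The rule stated in the fix can be sketched with a simplified instruction stream. This is not the HotSpot change itself: the `Op` enum, the helper names and the one-instruction-per-index layout are assumptions for illustration (real bytecodes have variable length). The point is that the following instruction is only inspected when the current one can fall through to it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, heavily reduced instruction set for this sketch.
enum class Op { Iload, Iadd, Goto, Ireturn, Athrow };

// Instructions that transfer control unconditionally never fall through.
static bool can_fall_through(Op op) {
  switch (op) {
    case Op::Goto:
    case Op::Ireturn:
    case Op::Athrow:
      return false;
    default:
      return true;
  }
}

// Returns the index of the next instruction if it is safe to inspect it,
// or -1 if the current instruction does not fall through (the next one may
// start an unreachable block) or there is no next instruction.
static int next_inspectable(const std::vector<Op>& code, std::size_t i) {
  if (i >= code.size() || !can_fall_through(code[i])) {
    return -1;
  }
  return (i + 1 < code.size()) ? static_cast<int>(i + 1) : -1;
}
```

For the stream `{Iload, Goto, Iadd, Ireturn}`, index 0 yields 1, while index 1 (the `Goto`) yields -1, so the possibly unreachable instruction after it is never examined.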

This pull request has now been integrated.

Changeset: e44dc638
Author:    Dean Long <dlong at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/e44dc638b8936b1b76ca9ddf9ece0c5c4705a19c
Stats:     121 lines in 3 files changed: 120 ins; 0 del; 1 mod

8271055: Crash during deoptimization with "assert(bb->is_reachable()) failed: getting result from unreachable basicblock" with -XX:+VerifyStack

Co-authored-by: Yi Yang <yyang at openjdk.org>
Co-authored-by: Yi Yang <qingfeng.yy at alibaba-inc.com>
Reviewed-by: vlivanov, thartmann

-------------

PR: https://git.openjdk.java.net/jdk/pull/7331

From dholmes at openjdk.java.net  Thu Feb  3 22:19:10 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 3 Feb 2022 22:19:10 GMT
Subject: RFR: 8251505: Use of types in compiler shared code should be
 consistent. [v8]
In-Reply-To: <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
References: <oUUE69vJX22HOlvWAoNrNQuAPRan5XFmorCc-nG9B8Y=.726d5c9f-da84-4a17-8c50-aee35de6dca5@github.com>
 <jdfdBN0PQqNis6ODxAg-6d1P5Gbdtc_lkGLzc8jN0D8=.c75ad7e2-e751-4ed6-b492-8fe9d0be3117@github.com>
Message-ID: <FVRPQ60lItiz6QVvivS2HWdRl0Fp27HBJZdykSoenpM=.432772ec-3a4e-4b62-a353-9a4c08742f6b@github.com>

On Thu, 3 Feb 2022 00:03:53 GMT, Yi-Fan Tsai <duke at openjdk.java.net> wrote:

>> 8251505: Use of types in compiler shared code should be consistent.
>
> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix JVMCIEnv::get_long_at

These changes also appear okay to me. Where we have changed from 32-bit to 64-bit types we will need to watch for issues with the 32-bit builds.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7294

From mdoerr at openjdk.java.net  Thu Feb  3 23:01:45 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 3 Feb 2022 23:01:45 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v2]
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <GQznoNmH1Rg23zvfvsjxTMlChknL5tY2mNAzoTlmI3w=.c064fdaa-9df8-4e55-880c-3b9088847d27@github.com>

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  Fix istate in stack range check.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7312/files
  - new: https://git.openjdk.java.net/jdk/pull/7312/files/f491da86..934e13c0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7312.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7312/head:pull/7312

PR: https://git.openjdk.java.net/jdk/pull/7312

From mdoerr at openjdk.java.net  Thu Feb  3 23:01:46 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 3 Feb 2022 23:01:46 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <UWxNjHMGLPWU-qKRJnFzR5oX_XSiuJvQjEvDQZwmCZE=.2bf670c9-dc6e-4d35-ba82-cd30b336d807@github.com>

On Tue, 1 Feb 2022 17:22:57 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

I had run a couple of JFR jtreg tests, but obviously not enough. I'll try more. "Officially", we don't test on s390 any more.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From mdoerr at openjdk.java.net  Fri Feb  4 09:18:10 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 09:18:10 GMT
Subject: RFR: 8281043: Intrinsify recursive ObjectMonitor locking for PPC64
 [v3]
In-Reply-To: <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
 <7ErBIaYMp6HAZqIyG-r8_B9EI3sw4hu3VzZ54SPYaKk=.3db6e7d9-cb10-4532-b1c9-19a6a3239f58@github.com>
Message-ID: <KaLNzIB9UUPXUEErC9nSOnU4unJlfh7_TSAMgz_Zdw4=.655cf3c6-beb0-448c-89c5-33516fa6c70a@github.com>

On Thu, 3 Feb 2022 10:16:45 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of JDK-8277180.
>> 
>> `java -Xms4g -Xmx4g -jar dacapo-9.12-bach.jar h2 -s huge -t 1 -n 1 --max-iterations=35 --variance=5 --verbose --converge`
>> 
>> Before this patch (2 runs):
>> `===== DaCapo 9.12 h2 PASSED in 309753 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 300755 msec =====`
>> 
>> After:
>> `===== DaCapo 9.12 h2 PASSED in 285144 msec =====`
>> `===== DaCapo 9.12 h2 PASSED in 288255 msec =====`
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update Copyright years.

Thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7305

From mdoerr at openjdk.java.net  Fri Feb  4 09:18:11 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 09:18:11 GMT
Subject: Integrated: 8281043: Intrinsify recursive ObjectMonitor locking for
 PPC64
In-Reply-To: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
References: <ZEAwJIcUomKQXX6YIAarYqikbAmj4P05MeR6do0DmQo=.9e35eb16-bb0c-427b-9700-6c3205723ea6@github.com>
Message-ID: <y8oCaswZTgTaHyjpzVE89anEp-3WxgQEfcAtviW-5mU=.3621796a-1a2d-4c2c-a305-1848e30001ed@github.com>

On Tue, 1 Feb 2022 13:23:42 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> PPC64 implementation of JDK-8277180.
> 
> `java -Xms4g -Xmx4g -jar dacapo-9.12-bach.jar h2 -s huge -t 1 -n 1 --max-iterations=35 --variance=5 --verbose --converge`
> 
> Before this patch (2 runs):
> `===== DaCapo 9.12 h2 PASSED in 309753 msec =====`
> `===== DaCapo 9.12 h2 PASSED in 300755 msec =====`
> 
> After:
> `===== DaCapo 9.12 h2 PASSED in 285144 msec =====`
> `===== DaCapo 9.12 h2 PASSED in 288255 msec =====`

This pull request has now been integrated.

Changeset: 46c6c6f3
Author:    Martin Doerr <mdoerr at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/46c6c6f308b5ec0ec3b762df4b76de555287474c
Stats:     23 lines in 1 file changed: 8 ins; 3 del; 12 mod

8281043: Intrinsify recursive ObjectMonitor locking for PPC64

Reviewed-by: rrich, lucy

-------------

PR: https://git.openjdk.java.net/jdk/pull/7305

From duke at openjdk.java.net  Fri Feb  4 09:19:15 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Fri, 4 Feb 2022 09:19:15 GMT
Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection
 for Linux/AArch64 [v17]
In-Reply-To: <Dok0bE7QDan5hdeiCnuctzvNFc4KC1bAw7uegPTpb7s=.92e247fe-ec36-4a1c-9055-7d1986514d86@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <52o8K8q5wBP4HgBI3AljysgeR6tbogiOtQYu0VhWOAA=.80d5b306-f67f-4a87-836f-44bdbb0713f1@github.com>
 <Dok0bE7QDan5hdeiCnuctzvNFc4KC1bAw7uegPTpb7s=.92e247fe-ec36-4a1c-9055-7d1986514d86@github.com>
Message-ID: <rzaJ13bQtHco6JmgxkgovaoFb6Q7IYRBdJ3oCFrYX9U=.e250fb9d-23a0-447a-ba15-9f7d35e623c6@github.com>

On Thu, 3 Feb 2022 16:49:08 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> However, I'm not sure how to add to the release notes - I can't find any files or a process.

OK, this part I understand now :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From shade at openjdk.java.net  Fri Feb  4 11:21:27 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 4 Feb 2022 11:21:27 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
Message-ID: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>

This is an old issue; I submitted the first RFE about this back in 2015. It shows up every time I benchmark interpreter-only code. Most recently, it showed up in my work to get the `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.

The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
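
To illustrate the watermark idea described above, here is a minimal, self-contained sketch. It is a toy model and not the code from this patch: `StackModel`, its fields, and the choice to measure the stack pointer in whole pages are all assumptions made for illustration. The stack grows downwards, the growth watermark records the lowest SP whose shadow zone has already been banged, and the safe limit is the lowest SP below which a full shadow zone no longer fits.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a downward-growing thread stack, with SP measured in pages.
// All names are hypothetical and are not taken from the HotSpot sources.
struct StackModel {
  std::int64_t safe_limit;        // lowest SP from which a full shadow zone still fits
  std::int64_t growth_watermark;  // lowest SP whose shadow zone was already banged
  std::int64_t shadow_pages;      // shadow zone size, in pages
  std::uint64_t pages_touched = 0;

  StackModel(std::int64_t base, std::int64_t usable_pages, std::int64_t shadow)
    : safe_limit(base - usable_pages + shadow),
      growth_watermark(base),
      shadow_pages(shadow) {
    assert(shadow > 0 && usable_pages > shadow);
  }

  // Behaviour before the change: touch every shadow page on every method entry.
  void enter_rebang_always() {
    pages_touched += static_cast<std::uint64_t>(shadow_pages);
  }

  // Watermark behaviour: bang only when SP drops below everything banged so far.
  // Returns true if the shadow zone was banged on this entry.
  bool enter_with_watermark(std::int64_t sp) {
    if (sp >= growth_watermark) {
      return false;  // this depth was already covered by an earlier bang
    }
    pages_touched += static_cast<std::uint64_t>(shadow_pages);
    // Only record the new watermark while SP is above the safe limit
    // (an assumption of this sketch). Below it, keep banging on every entry
    // so that a real stack would run into its guard pages.
    if (sp > safe_limit) {
      growth_watermark = sp;
    }
    return true;
  }
};
```

In this model, repeated calls at a depth that was already reached cost a single comparison, and the shadow zone is only touched again when the stack grows deeper than ever before or gets close to its limit.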

This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why the `native_call` argument persists, even though it is no longer used in the x86 case. There is also a new test group that I found useful when debugging on Windows; that group is going to go away before integration.

I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.

I think it is fairly complete, and so would like to solicit more feedback and testing here.

Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:


 compiler.compiler: +77%
 compiler.sunflow: +69%
 compress: +166%
 crypto.rsa: +15%
 crypto.signverify: +70%
 mpegaudio: +8%
 serial: +50%
 sunflow: +57%
 xml.transform: +61%
 xml.validation: +43%


My new `java.lang.invoke` benchmarks improve a lot as well:


Benchmark              Mode  Cnt    Score    Error  Units

# Mainline
MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op

# This WIP
MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
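
For a rough sense of scale, the ratio between the mainline and WIP scores above can be computed with a small helper like the one below. The `speedup` function and the `Row` type are hypothetical names used only for this illustration; the numbers are copied from the table, and the error bars are ignored.

```cpp
#include <cassert>

// Hypothetical helper: ratio of mainline time to patched time for one benchmark.
// Values above 1.0 mean the patched build needs less time per operation.
double speedup(double mainline_ns, double patched_ns) {
  assert(mainline_ns > 0.0 && patched_ns > 0.0);
  return mainline_ns / patched_ns;
}

// Scores (ns/op) copied from the benchmark table above.
struct Row {
  const char* name;
  double mainline_ns;
  double patched_ns;
};

const Row kRows[] = {
  {"MHInvoke.methodHandle", 799.671, 240.456},
  {"MHInvoke.plain",        261.947,  70.851},
  {"VHGet.plain",           231.372,  52.506},
  {"VHGet.varHandle",       924.880, 335.785},
};
```

With these scores the ratios come out at roughly 3.3, 3.7, 4.4 and 2.8 respectively.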


It also palpably improves startup even on small HelloWorld, _even when compilers are present_:


$ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null

 Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):

             22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
                96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
                 7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
             2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
        78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
         2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
         2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
        67,296,528      instructions              #    0.85  insn per cycle         
                                                  #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
        12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
           384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)

         0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )

$ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null

 Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):

             21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
                98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
                 7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
             2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
        77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
         2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
         2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
        66,742,892      instructions              #    0.86  insn per cycle         
                                                  #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
        12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
           386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)

         0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )


Additional testing:
 - [x] Linux x86_64 fastdebug, `tier1`
 - [ ] Linux x86_64 fastdebug, `tier2`
 - [ ] Linux x86_64 fastdebug, `tier3`
 - [x] Linux x86_32 fastdebug, `tier1`
 - [ ] Linux x86_32 fastdebug, `tier2`
 - [ ] Linux x86_32 fastdebug, `tier3`

-------------

Commit messages:
 - Initial fairly complete fix

Changes: https://git.openjdk.java.net/jdk/pull/7247/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8072070
  Stats: 190 lines in 6 files changed: 169 ins; 4 del; 17 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7247.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7247/head:pull/7247

PR: https://git.openjdk.java.net/jdk/pull/7247

From mdoerr at openjdk.java.net  Fri Feb  4 13:57:10 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 13:57:10 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <a5yTPrRU0h9-QcfnLJbBhi0o6cDYFb8UOd7QGaB_QwI=.424d036f-7826-493f-9d68-2ed0fc22b863@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [ ] Linux x86_64 fastdebug, `tier2`
>  - [ ] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [ ] Linux x86_32 fastdebug, `tier2`
>  - [ ] Linux x86_32 fastdebug, `tier3`

Hi Aleksey, thanks for working on the stack banging code. I have wanted to do so for a long time, but haven't managed to yet. The results are impressive!

A quick question. Why can't we just use something like the following on linux?

  __ cmpptr(rsp, Address(r15_thread, JavaThread::stack_overflow_limit_offset()));
  __ jump_cc(Assembler::belowEqual, ExternalAddress(Interpreter::_throw_StackOverflowError_entry));

Is banging the shadow area strictly required on linux?
Could be that it is needed on some OSes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From mdoerr at openjdk.java.net  Fri Feb  4 15:45:48 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 15:45:48 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  Fix sender_sp.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7312/files
  - new: https://git.openjdk.java.net/jdk/pull/7312/files/934e13c0..6d9446a8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=01-02

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7312.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7312/head:pull/7312

PR: https://git.openjdk.java.net/jdk/pull/7312

From lucy at openjdk.java.net  Fri Feb  4 16:02:12 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Fri, 4 Feb 2022 16:02:12 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <GS87PmfdiH8mZVOjakoaFvU5uvreYmDkLQOqKQPvYUw=.2354b362-29d3-4532-9427-99448fb3d0c8@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Yes, I know. I was hoping for support from IBM / Red Hat. Tyler Steele had been helpful in the past. He seems to have changed his user name; "backwaterred" is not known anymore.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From shade at openjdk.java.net  Fri Feb  4 16:02:13 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 4 Feb 2022 16:02:13 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <a5yTPrRU0h9-QcfnLJbBhi0o6cDYFb8UOd7QGaB_QwI=.424d036f-7826-493f-9d68-2ed0fc22b863@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <a5yTPrRU0h9-QcfnLJbBhi0o6cDYFb8UOd7QGaB_QwI=.424d036f-7826-493f-9d68-2ed0fc22b863@github.com>
Message-ID: <seKbLMboppe4usjUvdX151ipQy6JnJcIOYubCzvfIo0=.cde60b8d-e289-48c4-8b37-31fc6d6fa34e@github.com>

On Fri, 4 Feb 2022 13:54:02 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> A quick question. Why can't we just use something like the following on linux?
> 
> ```
>   __ cmpptr(rsp, Address(r15_thread, JavaThread::stack_overflow_limit_offset()));
>   __ jump_cc(Assembler::belowEqual, ExternalAddress(Interpreter::_throw_StackOverflowError_entry));
> ```
> 
> Is banging the shadow area strictly required on linux? Could be that it is needed on some OSes.

(There is a large comment in `stackOverflow.hpp` -- do you see blind spots there?)

My early patches were something like that. But the deeper I got into this, the more I realized it is safer to keep banging in order to cooperate with the rest of stack overflow machinery. For example, I am not at all sure that throwing the SOE when below `stack_overflow_limit` works well with reserved zone handling. It was probably okay when we only had the yellow+red zones.

AFAIU, the only OS that needs to bang page by page to commit stacks is Windows; got some funky GHA failures without it. But, given how the watermark code effectively bangs each part of the stack once, I don't see a reason to bother with OS-specific code here. We can keep "overbanging" on Linux, and pay little cost for it. Same with `native_call`-s.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From mdoerr at openjdk.java.net  Fri Feb  4 16:27:09 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 16:27:09 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <xHpYgsaMlG2CJRZ70Z1mdlZtCiI-zXCEmRSdnh9Yr1g=.ea3b596e-6d0c-4760-b6d1-b1feeb6ed3d9@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

@backwaterred is still there. But now with picture and real name. I've got a lot of the JFR jtreg tests passing, but I'll try to run more over the weekend. Additional testing would still be appreciated, though.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From mdoerr at openjdk.java.net  Fri Feb  4 17:28:09 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 4 Feb 2022 17:28:09 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <MXNoqBB1812kfMQzckOxVRuIjVba1DX3S7pPJLb1lFY=.c90c557e-2445-40d5-87b3-8a745e61adec@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [ ] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

I think it would be interesting to figure out whether we can let the Linux kernel do all the stack management work for us and avoid stack banging, protected pages, etc. inside HotSpot completely. But that may be beyond the scope of your PR. (Windows is a different story.) I hope that I can find time to figure it out at some point.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Fri Feb  4 18:25:13 2022
From: duke at openjdk.java.net (Tyler Steele)
Date: Fri, 4 Feb 2022 18:25:13 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <mpEjBwDeQbmQqVLfLPQSc7fqseuQAvdTCfHARuE07uE=.1f1c2267-303e-4d01-94b9-2886a0c575f1@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Hello. I am happy to do some official testing. I have set up a run for tier1 and JFR tests on our s390 machines. I will let you know what I find.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From harold.seigel at oracle.com  Fri Feb  4 20:24:42 2022
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 4 Feb 2022 15:24:42 -0500
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
Message-ID: <c131039f-47ed-d085-393b-3fc99302bd8c@oracle.com>

Information on how the number of CPUs is calculated can be found here:
https://bugs.openjdk.java.net/browse/JDK-8197867

Harold

On 2/3/2022 2:30 AM, Ioi Lam wrote:
> Please see the bug report [1] for detailed description and test cases.
>
> I'd like to have some discussion before we can decide what to do.
>
> I discovered this issue when analyzing JDK-8279484 [2]. Under 
> Kubernetes (minikube), Runtime.availableProcessors() returns 1, 
> despite the fact that the machine has 32 CPUs, the Kubernetes node has 
> a single deployment, and no CPU limits were set.
>
> Specifically, I want to understand why the JDK is using 
> CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the 
> Java process.
>
> In cgroup, there are other ways that are designed specifically for 
> limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why 
> is using cpu_quota() alone not enough? Why did we choose the current 
> approach of considering both cpu_quota() and cpu_shares()?
>
> My guess is that sometimes people don't limit the actual number of 
> CPUs per container, but instead use CPU Shares to set the relative 
> scheduling priority between containers.
>
> I.e., they run "docker run --cpu-shares=1234" without using the 
> "--cpus" flag.
>
> If this is indeed the reason, I can understand the (good) intention, 
> but the solution seems awfully insufficient.
>
> CPU Shares is a *relative* number. How much CPU is allocated to you 
> depends on
>
> - how many other processes are actively running
> - what their CPU Shares are
>
> The above information can change dynamically, as other processes may 
> be added or removed, and they can change between active and idle states.
>
> However, the JVM treats CPU Shares as an *absolute/static* number, and 
> sets the CPU quota of the current process using this very simplistic 
> formula.
>
> Value of /sys/fs/cgroup/cpu.shares -> cpu quota:
>
>     1023 -> 1 CPU
>     1024 -> no limit (huh??)
>     2048 -> 2 CPUs
>     4096 -> 4 CPUs
>
> This seems just wrong to me. There's no way you can get a "correct" 
> result without knowing anything about other processes that are running 
> at the same time.
>
> The net effect is that when Java is running under a container, more likely 
> than not, the JVM will limit itself to a single CPU. This seems really 
> inefficient to me.
>
> What should we do?
>
> Thanks
> - Ioi
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8281181
> [2] https://bugs.openjdk.java.net/browse/JDK-8279484
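For readers following the table quoted above, the mapping from cpu.shares to a CPU count can be sketched as below. This is only an illustrative model, not the HotSpot code: the names `cpus_from_shares`, `kSharesPerCpu` and `kNoLimit` are hypothetical, and rounding up for values between the listed points is an assumption that happens to match the four rows of the table.

```cpp
#include <cassert>

// Hypothetical constants, chosen to match the quoted table.
constexpr int kSharesPerCpu = 1024;  // one "CPU" per 1024 shares
constexpr int kNoLimit = -1;         // marker for "no limit"

// Illustrative mapping from a cpu.shares value to a CPU count.
// Assumption: exactly 1024 means "no limit"; everything else is
// rounded up to a whole number of CPUs.
int cpus_from_shares(int shares) {
  assert(shares > 0);
  if (shares == kSharesPerCpu) {
    return kNoLimit;
  }
  return (shares + kSharesPerCpu - 1) / kSharesPerCpu;
}
```

Note that nothing in this calculation looks at other containers or their share values, which is the concern raised in the quoted message.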

From xliu at openjdk.java.net  Sat Feb  5 01:26:07 2022
From: xliu at openjdk.java.net (Xin Liu)
Date: Sat, 5 Feb 2022 01:26:07 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <2ycwqbNL3ATeqhegUjGcUmLVYi4-2IOhATj8QK5qQNw=.86c91bd8-9d24-42ff-aa1e-8391742850c5@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue; I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get the `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [ ] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 715:

> 713: }
> 714: 
> 715: void TemplateInterpreterGenerator::bang_stack_shadow_pages(bool native_call) {

The watermark algorithm should also work on other architectures such as aarch64, right?
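To make the question concrete, here is a rough, architecture-neutral model of the watermark idea as I understand it from the PR text. It is only a sketch under assumptions: `StackModel`, `bang_on_entry` and the field names are hypothetical, addresses are plain integers, and the exact comparisons in the real x86 code may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative per-thread state; names are hypothetical, not HotSpot's.
struct StackModel {
  std::uintptr_t safe_limit;        // at or below this sp, the shadow zone reaches the guard pages
  std::uintptr_t growth_watermark;  // lowest sp whose shadow zone has already been banged
  std::size_t shadow_pages;         // number of shadow pages to bang
};

// Returns how many pages a method entry at stack pointer `sp` would touch.
// The stack grows towards lower addresses.
std::size_t bang_on_entry(StackModel& t, std::uintptr_t sp) {
  if (sp >= t.growth_watermark) {
    // The shadow zone below this sp was banged by an earlier, deeper entry.
    return 0;
  }
  // Deeper than ever before: bang the whole shadow zone.
  std::size_t touched = t.shadow_pages;
  if (sp > t.safe_limit) {
    // Still clear of the guard zones, so remember this depth.
    t.growth_watermark = sp;
  }
  // At or below the safe limit the watermark is left alone, so every
  // such entry keeps banging and can still trap on the guard pages.
  return touched;
}
```

In this model the common case is a single compare against the watermark, which is where the reported interpreter speedups would come from. Nothing in it is specific to x86, but whether each port can express it as cheaply is a separate question.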

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From xliu at openjdk.java.net  Sat Feb  5 01:32:09 2022
From: xliu at openjdk.java.net (Xin Liu)
Date: Sat, 5 Feb 2022 01:32:09 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [ ] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

Since this PR touches `stackOverflow.hpp`, could you also take a look at this?
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stackOverflow.cpp#L66

We actually get the page size from the OS. Why do we need alignment = 4K?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Sat Feb  5 07:31:40 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Sat, 5 Feb 2022 07:31:40 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>
Message-ID: <Uzm608EYxhqmhpzYhKRiv5ecj75iov26SuxUogz9G9s=.43c1a92c-ae0d-480e-a25f-3f116c758134@github.com>

On Sat, 5 Feb 2022 01:28:47 GMT, Xin Liu <xliu at openjdk.org> wrote:

> Since this PR touches `stackOverflow.hpp`, could you also take a look at this? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stackOverflow.cpp#L66
> 
> We actually get the page size from the OS. Why do we need alignment = 4K?

Look here: https://github.com/openjdk/jdk/blob/48523b090886f7b24ed4009f0c150efaa6f7b056/src/hotspot/share/runtime/stackOverflow.cpp#L42-L45 -- the `StackYellowPages`, `StackRedPages`, `StackShadowPages` are defined in 4K pages. It should probably be called `unit`, not `alignment`. I'd like to avoid scope creep for this PR, so that's for another day.
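A minimal sketch of that point, assuming the zone sizes are computed as "page count times a 4K unit, rounded up to the OS page size". The names `kUnit`, `align_up` and `zone_size` here are illustrative stand-ins, not the actual code at the link:

```cpp
#include <cstddef>

// Assumption: the Stack*Pages flags count fixed 4K units,
// independent of the page size the OS reports.
constexpr std::size_t kUnit = 4 * 1024;

// Round `value` up to the next multiple of `alignment` (alignment > 0).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Size in bytes of a zone configured as `pages` 4K units on a system
// whose page size is `os_page_size`.
constexpr std::size_t zone_size(std::size_t pages, std::size_t os_page_size) {
  return align_up(pages * kUnit, os_page_size);
}
```

With 4K OS pages the rounding is a no-op. With 64K OS pages, a 20-unit zone (80K) would round up to 128K, so the 4K value acts as a unit of the flag rather than as an alignment.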

> src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 715:
> 
>> 713: }
>> 714: 
>> 715: void TemplateInterpreterGenerator::bang_stack_shadow_pages(bool native_call) {
> 
> The watermark algorithm should also work on other architectures such as aarch64, right?

Yes, as I stated in the PR text: "This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later."

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From xliu at openjdk.java.net  Sat Feb  5 08:17:21 2022
From: xliu at openjdk.java.net (Xin Liu)
Date: Sat, 5 Feb 2022 08:17:21 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <vuNbXFRcZFVsBcpU1O3V9rFDKTDMXhAxJ93urmz62xQ=.30cee780-6e1c-458c-80cd-e5c21cd3e626@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

src/hotspot/share/runtime/stackOverflow.hpp line 166:

> 164:   //     into adjacent thread stack, or even into other readable memory. This would potentially
> 165:   //     pass the check by accident.
> 166:   //  c) Allow for incremental stack growth by handling traps from not yet committed thread

I failed to understand why we have to do "incremental stack growth" here. Why can't we just touch the last page?

__ bang_stack_with_offset(n_shadow_pages*page_size);


The entire shadow zone is mapped. Touching it causes a commit, a page fault, or a SEGV. The first two events are transparent to the userspace process.

HotSpot will trap into the signal handler if `bang_stack_shadow_pages` crosses `shadow_zone_safe_limit()`. `rsp + n_shadow_pages * page_size` then falls into one of two cases:
1. The red zone: the program is about to die anyway.
2. The yellow or reserved zones: both are recoverable.

I feel it's not necessary to touch pages 1 to n_shadow_pages-1. The side effect is the same as touching the last page directly.

PS: I tried this [idea](https://github.com/navyxliu/jdk/runs/5075962312?check_suite_focus=true). Two failures were found on Windows. I guess the premise that the shadow zone is mapped is false on Windows.

compiler/interpreter/cr7116216/StackOverflow.java 
compiler/uncommontrap/UncommonTrapStackBang.java

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From xliu at openjdk.java.net  Sat Feb  5 09:21:09 2022
From: xliu at openjdk.java.net (Xin Liu)
Date: Sat, 5 Feb 2022 09:21:09 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <vuNbXFRcZFVsBcpU1O3V9rFDKTDMXhAxJ93urmz62xQ=.30cee780-6e1c-458c-80cd-e5c21cd3e626@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <vuNbXFRcZFVsBcpU1O3V9rFDKTDMXhAxJ93urmz62xQ=.30cee780-6e1c-458c-80cd-e5c21cd3e626@github.com>
Message-ID: <tnS19jzIEJMFr_NW2pPxhwp3kcjytPX6x4pAhHG6-38=.207affaf-f797-4726-a699-f3401f8718b1@github.com>

On Sat, 5 Feb 2022 08:13:34 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> src/hotspot/share/runtime/stackOverflow.hpp line 166:
> 
>> 164:   //     into adjacent thread stack, or even into other readable memory. This would potentially
>> 165:   //     pass the check by accident.
>> 166:   //  c) Allow for incremental stack growth by handling traps from not yet committed thread
> 
> I failed to understand why we have to do "incremental stack growth" here. Why can't we just touch the last page?
> 
> __ bang_stack_with_offset(n_shadow_pages*page_size);
> 
> 
> The entire shadow zone is mapped. Touching it causes a commit, a page fault, or a SEGV. The first two events are transparent to the userspace process.
> 
> HotSpot will trap into the signal handler if `bang_stack_shadow_pages` crosses `shadow_zone_safe_limit()`. `rsp + n_shadow_pages * page_size` then falls into one of two cases:
> 1. The red zone: the program is about to die anyway.
> 2. The yellow or reserved zones: both are recoverable.
> 
> I feel it's not necessary to touch pages 1 to n_shadow_pages-1. The side effect is the same as touching the last page directly.
> 
> PS: I tried this [idea](https://github.com/navyxliu/jdk/runs/5075962312?check_suite_focus=true). Two failures were found on Windows. I guess the premise that the shadow zone is mapped is false on Windows.
> 
> compiler/interpreter/cr7116216/StackOverflow.java 
> compiler/uncommontrap/UncommonTrapStackBang.java

I read this blog post and I need to take back my comment.
https://pangin.pro/posts/stack-overflow-handling

Now I think the interpreter has to do linear probing to make sure HotSpot executes Java programs correctly. reserve_zone has a special meaning. Further, if rsp is very close to `shadow_zone_safe_limit()`, the so-called last page may land beyond the red zone.
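A toy model of that last point, to show why a single far probe can miss the guard zones entirely. Everything here is an assumption made for illustration: the `StackLayout` type, the function names, the zone order (red lowest, then yellow, then reserved), and treating addresses as plain integers.

```cpp
#include <cstddef>
#include <cstdint>

enum class Zone { Outside, Red, Yellow, Reserved, Usable };

// Hypothetical description of one thread stack occupying [end, base).
// The stack grows down from `base`; the guard zones sit at the low end.
struct StackLayout {
  std::uintptr_t end;
  std::uintptr_t base;
  std::size_t page_size;
  std::size_t red_pages;
  std::size_t yellow_pages;
  std::size_t reserved_pages;
  std::size_t shadow_pages;
};

// Which zone does `addr` fall into?
Zone classify(const StackLayout& s, std::uintptr_t addr) {
  if (addr < s.end || addr >= s.base) return Zone::Outside;
  std::size_t page = (addr - s.end) / s.page_size;
  if (page < s.red_pages) return Zone::Red;
  if (page < s.red_pages + s.yellow_pages) return Zone::Yellow;
  if (page < s.red_pages + s.yellow_pages + s.reserved_pages) return Zone::Reserved;
  return Zone::Usable;
}

// Touch only the page at the far end of the shadow zone.
Zone probe_last_page_only(const StackLayout& s, std::uintptr_t sp) {
  return classify(s, sp - s.shadow_pages * s.page_size);
}

// Touch every shadow page in order and stop at the first guard page.
Zone probe_linearly(const StackLayout& s, std::uintptr_t sp) {
  for (std::size_t p = 1; p <= s.shadow_pages; p++) {
    Zone z = classify(s, sp - p * s.page_size);
    if (z != Zone::Usable) return z;
  }
  return Zone::Usable;
}
```

For a stack at [0x100000, 0x200000) with 4K pages, 1 red, 2 yellow, 1 reserved and 20 shadow pages, an entry at sp = 0x106000 hits the reserved zone when probing linearly, while the single far probe lands below the stack altogether, where it could fault unrecoverably or silently touch unrelated memory.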

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Sat Feb  5 15:40:33 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sat, 5 Feb 2022 15:40:33 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts
Message-ID: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>

Hi,

This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.

Thank you very much.

-------------

Commit messages:
 - unsigned cast intrinsics

Changes: https://git.openjdk.java.net/jdk/pull/7358/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8278173
  Stats: 494 lines in 16 files changed: 435 ins; 24 del; 35 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7358.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7358/head:pull/7358

PR: https://git.openjdk.java.net/jdk/pull/7358

From shade at openjdk.java.net  Sun Feb  6 07:28:08 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Sun, 6 Feb 2022 07:28:08 GMT
Subject: RFR: 8072070: Improve interpreter stack banging
In-Reply-To: <tnS19jzIEJMFr_NW2pPxhwp3kcjytPX6x4pAhHG6-38=.207affaf-f797-4726-a699-f3401f8718b1@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <vuNbXFRcZFVsBcpU1O3V9rFDKTDMXhAxJ93urmz62xQ=.30cee780-6e1c-458c-80cd-e5c21cd3e626@github.com>
 <tnS19jzIEJMFr_NW2pPxhwp3kcjytPX6x4pAhHG6-38=.207affaf-f797-4726-a699-f3401f8718b1@github.com>
Message-ID: <IPveZ-6JcDVhPvHpc0RUE5X3rE8jVMvIbgaVh8GFYpE=.2e390992-3930-4097-8a10-1d6b58eebab6@github.com>

On Sat, 5 Feb 2022 09:18:17 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> src/hotspot/share/runtime/stackOverflow.hpp line 166:
>> 
>>> 164:   //     into adjacent thread stack, or even into other readable memory. This would potentially
>>> 165:   //     pass the check by accident.
>>> 166:   //  c) Allow for incremental stack growth by handling traps from not yet committed thread
>> 
>> I fail to understand why we have to do "incremental stack growth" here. Why can't we just touch the last page?
>> 
>> __ bang_stack_with_offset(n_shadow_pages*page_size);
>> 
>> 
>> The entire shadow zone is mapped. Touching it causes a commit, a page fault, or a SEGV. The first two events are transparent to the userspace process.
>> 
>> HotSpot will trap into the signal handler if `bang_stack_shadow_pages` crosses shadow_zone_safe_limit(). `rsp + n_shadow_pages * page_size` then falls into one of two cases:
>> 1. red zone: the program is about to die anyway.
>> 2. yellow or reserved zone: both are recoverable.
>> 
>> I feel it's not necessary to touch pages 1 to n_shadow_pages-1. The side effect is the same as touching the last page directly.
>> 
>> PS: I tried this [idea](https://github.com/navyxliu/jdk/runs/5075962312?check_suite_focus=true). Two tests fail on Windows, so I guess the premise that the shadow zone is mapped is false on Windows:
>> 
>> compiler/interpreter/cr7116216/StackOverflow.java 
>> compiler/uncommontrap/UncommonTrapStackBang.java
>
> I read this blog post and I need to take back my comment:
> https://pangin.pro/posts/stack-overflow-handling
> 
> Now I think the interpreter has to do linear probing to make sure HotSpot executes Java programs correctly. reserve_zone has a special meaning. Further, if rsp is very close to shadow_zone_safe_limit(), the so-called last page may land beyond the red zone.

Yes, that's exactly what point "(c)" is about.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Sun Feb  6 08:03:39 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Sun, 6 Feb 2022 08:03:39 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <cqdhFR25gT5pA4vw6ZYjrcABpLUixsu16zmHwdhPNL4=.e66854cf-49bb-468d-b542-eeba25c33d3f@github.com>

> This is an old issue: I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to make the `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why the `native_call` argument persists, even though it is not used in the x86 case anymore. There is also a new test group that I found useful when debugging on Windows; that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on a small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`
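
The "bang only when needed" idea from the description above can be illustrated with a toy model. This is a sketch under stated assumptions, not the code from the patch: the class, its fields, the 20-page shadow zone, and the rule "record the watermark only while above the safe limit" are all hypothetical simplifications.

```python
PAGE = 4096
SHADOW_PAGES = 20  # assumed shadow zone size, in pages


class StackBangModel:
    """Counts page touches on method entry for a downward-growing stack."""

    def __init__(self, stack_base, safe_limit):
        # Lowest SP whose shadow zone is already known to be banged (hypothetical field).
        self.growth_watermark = stack_base
        # Lowest SP whose shadow zone does not reach into the guard zones (hypothetical field).
        self.safe_limit = safe_limit
        self.page_touches = 0

    def method_entry(self, sp):
        if sp >= self.growth_watermark:
            return  # shadow zone below this SP was banged earlier; nothing to do
        self.page_touches += SHADOW_PAGES  # bang the whole shadow zone
        if sp > self.safe_limit:
            # Remember the new depth so later entries at or above it skip the bang.
            self.growth_watermark = sp


model = StackBangModel(stack_base=1000 * PAGE, safe_limit=100 * PAGE)
entries = [990, 995, 990, 980, 985]  # SP at each method entry, in pages
for pages in entries:
    model.method_entry(pages * PAGE)

rebang_every_entry = len(entries) * SHADOW_PAGES
print(model.page_touches, "touches with watermark vs", rebang_every_entry, "without")
```

The model leaves the watermark unchanged once SP drops to the safe limit or below, so entries near the guard zones keep banging; that is one plausible way to make sure an overflow is still detected there.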

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Rectify comment "(c)"

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7247/files
  - new: https://git.openjdk.java.net/jdk/pull/7247/files/b1ed28f8..c3983819

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=00-01

  Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7247.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7247/head:pull/7247

PR: https://git.openjdk.java.net/jdk/pull/7247

From david.holmes at oracle.com  Mon Feb  7 01:12:21 2022
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 7 Feb 2022 11:12:21 +1000
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <44ce9669-71cc-0c20-ecbf-265845626820@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <44ce9669-71cc-0c20-ecbf-265845626820@oracle.com>
Message-ID: <be727ee0-6df9-5971-fe37-1f96fb49b608@oracle.com>

Just for the record ...

On 3/02/2022 7:19 pm, David Holmes wrote:
> Hi Ioi,
> 
> For the benefit of the mailing list discussion ...
> 
> On 3/02/2022 5:30 pm, Ioi Lam wrote:
>> Please see the bug report [1] for detailed description and test cases.
>>
>> I'd like to have some discussion before we can decide what to do.
>>
>> I discovered this issue when analyzing JDK-8279484 [2]. Under 
>> Kubernetes (minikube), Runtime.availableProcessors() returns 1, 
>> despite the fact that the machine has 32 CPUs, the Kubernetes node has 
>> a single deployment, and no CPU limits were set.
>>
>> Specifically, I want to understand why the JDK is using 
>> CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the 
>> Java process.
> 
> Because we were asked to by customers deploying in containers.
> 
>> In cgroup, there are other ways that are designed specifically for 
>> limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why 
>> is using cpu_quota() alone not enough? Why did we choose the current 
>> approach of considering both cpu_quota() and cpu_shares()?
> 
> Because people were using both (whether that made sense or not) and so 
> we needed a policy on what to do if both were set.
> 
>> My guess is that sometimes people don't limit the actual number of 
>> CPUs per container, but instead use CPU Shares to set the relative 
>> scheduling priority between containers.
>>
>> I.e., they run "docker run --cpu-shares=1234" without using the 
>> "--cpus" flag.
>>
>> If this is indeed the reason, I can understand the (good) intention, 
>> but the solution seems awfully insufficient.
>>
>> CPU Shares is a *relative* number. How much CPU is allocated to you 
>> depends on
>>
>> - how many other processes are actively running
>> - what their CPU Shares are
>>
>> The above information can change dynamically, as other processes may 
>> be added or removed, and they can change between active and idle states.
>>
>> However, the JVM treats CPU Shares as an *absolute/static* number, and 
>> sets the CPU quota of the current process using this very simplistic 
>> formula.
> 
> From old discussions and the code, I believe the thought was that share 
> was relative to the per-cpu default shares of 1024. So we use that 
> to determine the fraction of each CPU that should be assigned, and we 
> should then use that to determine the available number of CPUs. But that 
> isn't what we actually do - we only calculate the fraction and round it 
> up to get the number of CPUs and that is wrong (and typically only gives 
> 1 cpu because shares < 1024). I speculate that what was intended was to 
> map from having an X% share of each CPU, to instead having access to X% 
> of the total CPUs (at 100% of each). Mathematically this has some basis 
> but it actually makes no practical sense from a throughput or response 
> time perspective. If I'm allowed 50% of the CPU per time period to do my 
> calculations, I want 100% of each CPU for half of the period as that 
> potentially minimises the elapsed time till I have a result.
> 
>> Value of /sys/fs/cgroup/cpu.shares -> cpu quota:
>>
>>     1023 -> 1 CPU
>>     1024 -> no limit (huh??)
>>     2048 -> 2 CPUs
>>     4096 -> 4 CPUs
>>
>> This seems just wrong to me. There's no way you can get a "correct" 
>> result without knowing anything about other processes that are running 
>> at the same time.
> 
> As I said above and in the bug report I think this was an error and the 
> intent was to then multiply by the number of actual processors.

No, it was not an error. See the discussion Severin referenced:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036093.html

David
-----

>> The net effect is when Java is running under a container, more likely 
>> than not, the JVM will limit itself to a single CPU. This seems really 
>> inefficient to me.
> 
> Yes.
> 
>> What should we do?
> 
> We could just adjust the calculation as I suggested.
> 
> Or, given that share aka weight is meaningless without knowing the total 
> weight in the system we could just ignore it. The app then gets access 
> to all CPUs and it is up to the container to track actual usage and 
> impose any limits configured.
> 
> I've always thought that these cgroups mechanisms were fundamentally 
> flawed and that if the intent was to define a resource limited 
> environment, then the environment should report what resources were 
> available by the normal APIs. They got this right with cpu-sets by 
> integrating with sched_getaffinity; but for shares and quotas it has 
> been left to the applications to try and figure out what that should 
> mean - and that makes no sense to me.
> 
> Cheers,
> David
> 
>> Thanks
>> - Ioi
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8281181
>> [2] https://bugs.openjdk.java.net/browse/JDK-8279484

From ioi.lam at oracle.com  Mon Feb  7 04:16:30 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 6 Feb 2022 20:16:30 -0800
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
Message-ID: <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>

On 2/3/2022 3:29 AM, Severin Gehwolf wrote:
> Hi Ioi,
>
> On Wed, 2022-02-02 at 23:30 -0800, Ioi Lam wrote:
>> Please see the bug report [1] for detailed description and test cases.
>>
>> I'd like to have some discussion before we can decide what to do.
>>
>> I discovered this issue when analyzing JDK-8279484 [2]. Under Kubernetes
>> (minikube), Runtime.availableProcessors() returns 1, despite the
>> fact that the machine has 32 CPUs, the Kubernetes node has a single
>> deployment, and no CPU limits were set.
> From looking at the bug it would be good to know why a cpu.weight value
> of 1 is being observed. The default is 100. I.e. if it is really unset:
>
> $ sudo docker run --rm -v $(pwd)/jdk17:/opt/jdk:z fedora:35 /opt/jdk/bin/java -Xlog:os+container=trace --version
> [0.000s][trace][os,container] OSContainer::init: Initializing Container Support
> [0.001s][debug][os,container] Detected cgroups v2 unified hierarchy
> [0.001s][trace][os,container] Path to /memory.max is /sys/fs/cgroup//memory.max
> [0.001s][trace][os,container] Raw value for memory limit is: max
> [0.001s][trace][os,container] Memory Limit is: Unlimited
> [0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup//cpu.max
> [0.001s][trace][os,container] Raw value for CPU quota is: max
> [0.001s][trace][os,container] CPU Quota is: -1
> [0.001s][trace][os,container] Path to /cpu.max is /sys/fs/cgroup//cpu.max
> [0.001s][trace][os,container] CPU Period is: 100000
> [0.001s][trace][os,container] Path to /cpu.weight is /sys/fs/cgroup//cpu.weight
> [0.001s][trace][os,container] Raw value for CPU shares is: 100
> [0.001s][debug][os,container] CPU Shares is: -1
> [0.001s][trace][os,container] OSContainer::active_processor_count: 4
> [0.001s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
> [0.001s][debug][os,container] container memory limit unlimited: -1, using host value
> [0.001s][debug][os,container] container memory limit unlimited: -1, using host value
> [0.002s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
> [0.007s][debug][os,container] container memory limit unlimited: -1, using host value
> [0.014s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 4
> [0.022s][trace][os,container] Path to /memory.max is /sys/fs/cgroup//memory.max
> [0.022s][trace][os,container] Raw value for memory limit is: max
> [0.022s][trace][os,container] Memory Limit is: Unlimited
> [0.022s][debug][os,container] container memory limit unlimited: -1, using host value
> openjdk 17.0.2-internal 2022-01-18
> OpenJDK Runtime Environment (build 17.0.2-internal+0-adhoc.sgehwolf.jdk17u)
> OpenJDK 64-Bit Server VM (build 17.0.2-internal+0-adhoc.sgehwolf.jdk17u, mixed mode, sharing)

In JDK-8279484, the JVM is launched by Kubernetes, which manages CPU 
resources with the concept of "request" and "limit".

https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/


Quote: "Containers cannot use more CPU than the configured limit.
        Provided the system has CPU time free, a container
        is guaranteed to be allocated as much CPU as it requests."


So "CPU request" is a guaranteed minimum. For example, if you have a 
container that requests 6 CPUs, but all the hosts in your cluster have 
no more than 4 CPUs each, then this container will never be deployed by 
Kubernetes, because the minimum of 6 CPUs cannot be guaranteed.


Consider the following 4 cases:

(1) You specify both "cpu request" and "cpu limit"

(2) You specify only "cpu limit" -> Kubernetes will set the
    "cpu request" to be the same as the limit.

(3) If you specify only "cpu request", Kubernetes will set
    the "cpu limit" to a default value that's not smaller
    than the request.

(4) Neither "cpu request" nor "cpu limit" is set

(For details about the defaults, see 
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/ 
)


In the first 3 cases, the JVM (in cgroupv1) will see that both 
cpu.cfs_quota_us and cpu.shares are set. The cpu.shares will be ignored 
(due to the PreferContainerQuotaForCPUCount flag. See JDK-8197867).

Case (4) is the cause of the bug in JDK-8279484.

Kubernetes sets cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2. 
This means:

- This container is guaranteed a minimum amount of CPU resources
- If no other containers are executing, this container can use as
  much CPU as available on the host
- If other containers are executing, the amount of CPU available
  to this container is (2 / (sum of cpu.shares of all active
  containers))
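
As a rough illustration of the last bullet, the share of CPU under contention can be computed as below. This is only a sketch: the function name is made up, and it assumes the sum runs over every active container including this one.

```python
def cpu_fraction_under_contention(my_shares, active_shares):
    """Fraction of host CPU time this container gets while everything is busy.

    active_shares lists cpu.shares of all active containers, this one included.
    """
    return my_shares / sum(active_shares)


# Alone on the host: the container may use everything.
print(cpu_fraction_under_contention(2, [2]))
# Competing with one busy container that has cpu.shares = 1024.
print(cpu_fraction_under_contention(2, [2, 1024]))
```

The result changes whenever other containers start, stop, or go idle, which is why a static reading of cpu.shares cannot capture it.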


The fundamental problem with the current JVM implementation is that it 
treats "CPU request" as a maximum value, the opposite of what Kubernetes 
does. Because of this, in case (4), the JVM artificially limits itself 
to a single CPU. This leads to CPU underutilization.
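
For reference, the behaviour discussed in this thread can be modeled roughly as follows. This is a simplified sketch, not the HotSpot source: the function name and parameters are made up, a non-positive quota is treated as "no limit", and 1024 shares is treated as "unset", matching the table quoted below.

```python
import math

PER_CPU_SHARES = 1024


def modeled_processor_count(host_cpus, cpu_shares, quota_us=-1,
                            period_us=100_000, prefer_quota=True):
    """Simplified model of how cpu.shares and the CFS quota cap the CPU count."""
    quota_count = math.ceil(quota_us / period_us) if quota_us > 0 and period_us > 0 else 0
    if cpu_shares <= 0 or cpu_shares == PER_CPU_SHARES:
        share_count = 0  # treated as "no shares configured"
    else:
        share_count = math.ceil(cpu_shares / PER_CPU_SHARES)

    if quota_count and share_count:
        # Mirrors the effect described for PreferContainerQuotaForCPUCount.
        limit = quota_count if prefer_quota else min(quota_count, share_count)
    else:
        limit = quota_count or share_count or host_cpus
    return min(host_cpus, limit)


# Case (4): no quota, cpu.shares = 2, 32-CPU host.
print(modeled_processor_count(32, 2))
# Cases (1)-(3): a quota of 4 CPUs wins over the shares value.
print(modeled_processor_count(32, 2, quota_us=400_000))
```

In this model, any shares value below 1024 rounds up to a single CPU regardless of how idle the host is, which is the underutilization described above.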


>> Specifically, I want to understand why the JDK is using
>> CgroupSubsystem::cpu_shares() to limit the number of CPUs used by the
>> Java process.
> TLDR: Kubernetes and/or other container orchestration frameworks? That
> was back in the day of cgroups v1, though.
>
>> In cgroup, there are other ways that are designed specifically for
>> limiting the number of CPUs, i.e., CgroupSubsystem::cpu_quota(). Why is
>> using cpu_quota() alone not enough? Why did we choose the current
>> approach of considering both cpu_quota() and cpu_shares()?
> Kubernetes has a concept of "cpu requests" and "cpu limit". It maps (or
> mapped?) those values to cpu shares and cpu quota in cgroups.
>
>> My guess is that sometimes people don't limit the actual number of CPUs
>> per container, but instead use CPU Shares to set the relative scheduling
>> priority between containers.
>>
>> I.e., they run "docker run --cpu-shares=1234" without using the "--cpus"
>> flag.
>>
>> If this is indeed the reason, I can understand the (good) intention, but
>> the solution seems awfully insufficient.
>>
>> CPU Shares is a *relative* number. How much CPU is allocated to you
>> depends on
>>
>> - how many other processes are actively running
>> - what their CPU Shares are
>>
>> The above information can change dynamically, as other processes may be
>> added or removed, and they can change between active and idle states.
>>
>> However, the JVM treats CPU Shares as an *absolute/static* number, and
>> sets the CPU quota of the current process using this very simplistic
>> formula.
>>
>> Value of /sys/fs/cgroup/cpu.shares -> cpu quota:
>>
>>      1023 -> 1 CPU
>>      1024 -> no limit (huh??)
>>      2048 -> 2 CPUs
>>      4096 -> 4 CPUs
>>
>> This seems just wrong to me. There's no way you can get a "correct"
>> result without knowing anything about other processes that are running
>> at the same time.
>>
>> The net effect is when Java is running under a container, more likely
>> than not, the JVM will limit itself to a single CPU. This seems really
>> inefficient to me.
> I believe the point is that popular container orchestration frameworks
> use the cpu requests feature to map to cpu.shares. A similar question
> regarding this was asked by myself a while ago. See JDK-8216366.
>
> Here is what Bob Vandette had to say at the time:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036093.html

To quote Bob's reply from the above e-mail:

     Although the value for cpu-shares can be set to
     any of the values that you mention, we decided to
     follow the convention set by Kubernetes and other container
     orchestration products that use 1024 as the unit for
     cpu shares.  Ignoring the cpu shares in this case is
     not what users of this popular technology
     want.

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu

     - The spec.containers[].resources.requests.cpu is converted to
       its core value, which is potentially fractional, and multiplied
       by 1024. The greater of this number or 2 is used as the value
       of the --cpu-shares flag in the docker run command.
     - The spec.containers[].resources.limits.cpu is converted
       to its millicore value and multiplied by 100. The resulting
       value is the total amount of CPU time that a container can use
       every 100ms. A container cannot use more than its share of
       CPU time during this interval.
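
The two conversions quoted above amount to the following arithmetic. This is a minimal sketch of the quoted rules only; the function names are made up and the truncation of fractional shares is an assumption.

```python
def k8s_cpu_shares(request_cores):
    """requests.cpu (in cores) -> value passed as --cpu-shares."""
    return max(2, int(request_cores * 1024))


def k8s_cfs_quota_us(limit_cores):
    """limits.cpu (in cores) -> CPU time in microseconds allowed per 100ms period."""
    millicores = round(limit_cores * 1000)
    return millicores * 100


print(k8s_cpu_shares(1.0))    # a request of one full CPU
print(k8s_cpu_shares(0))      # no request: the floor of 2 applies
print(k8s_cfs_quota_us(0.5))  # half a CPU per 100ms period
```

The floor of 2 in the first rule is where the cpu.shares value of 2 in case (4) comes from.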

As I mentioned above, Bob's conclusion that cpu.shares should be used
as an upper limit value was probably based on a misunderstanding
of what resources.requests.cpu means in Kubernetes.

With resources.requests.cpu = 1.0, docker runs with --cpu-shares=1024

This means "I need at least 1 CPU to execute".

However, the JVM incorrectly treats this as "I promise I will not use
more than 1 CPU".


Thanks
- Ioi


> Thanks,
> Severin
>
>> What should we do?
>>
>> Thanks
>> - Ioi
>>
>> [1]https://bugs.openjdk.java.net/browse/JDK-8281181
>> [2]https://bugs.openjdk.java.net/browse/JDK-8279484
>>

From shade at openjdk.java.net  Mon Feb  7 07:31:10 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 7 Feb 2022 07:31:10 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <Uzm608EYxhqmhpzYhKRiv5ecj75iov26SuxUogz9G9s=.43c1a92c-ae0d-480e-a25f-3f116c758134@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>
 <Uzm608EYxhqmhpzYhKRiv5ecj75iov26SuxUogz9G9s=.43c1a92c-ae0d-480e-a25f-3f116c758134@github.com>
Message-ID: <Zm18xeCkseqIKo3dnYrNB25uyFpjHCMFounlerKYsng=.0f7de078-f9d8-4e00-9908-71345e018085@github.com>

On Sat, 5 Feb 2022 05:50:34 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> > Since this PR touches stackOverflow.hpp, could you also take a look at this? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stackOverflow.cpp#L66
> > We actually get the page size from the OS. Why do we need alignment = 4K?
> 
> Look here:
> 
> https://github.com/openjdk/jdk/blob/48523b090886f7b24ed4009f0c150efaa6f7b056/src/hotspot/share/runtime/stackOverflow.cpp#L42-L45
> -- the `StackYellowPages`, `StackRedPages`, `StackShadowPages` are defined in units of 4K pages. It should probably be called `unit`, not `alignment`. I'd like to avoid scope creep for this PR, so that's for another day.

Done in #7362.
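
For readers following along, the relationship between the 4K unit and the OS page size can be sketched like this. It is an illustration under assumptions, not the HotSpot code: the helper names are made up, and it assumes each zone is the configured count of 4K units rounded up to a whole number of OS pages.

```python
UNIT = 4 * 1024  # the Stack*Pages options count 4K units


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def zone_size_bytes(pages_in_4k_units, os_page_size):
    """Size of a stack zone after adapting the 4K-based count to the OS page size."""
    return align_up(pages_in_4k_units * UNIT, os_page_size)


print(zone_size_bytes(2, 4096))    # two 4K units on a 4K-page system
print(zone_size_bytes(1, 65536))   # one 4K unit still occupies a whole 64K page
print(zone_size_bytes(20, 65536))  # 80K of shadow zone rounds up to two 64K pages
```

This is why the constant reads better as a unit than as an alignment: the alignment actually applied is the OS page size.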

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From mdoerr at openjdk.java.net  Mon Feb  7 10:18:10 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Mon, 7 Feb 2022 10:18:10 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <YzXalgzruUvkAErEOmB5_-F-Jfsm_cJXze7fhtIg5B4=.1de70f12-d4ca-44dd-8caf-e838422efd56@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Awesome! Thanks a lot for testing! I have also run all the JFR jtreg tests I found and they have passed. So, I will need reviews to proceed. I'll backport to 17u and 11u (together with the other JFR related fixes).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From aph at openjdk.java.net  Mon Feb  7 10:53:14 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 10:53:14 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
Message-ID: <-FQjRIxxiyiMqo8kEwUVP6XzCEfHIiRdbmlmGaPfXmA=.c56eee79-194a-425b-a645-775deb963d7b@github.com>

On Thu, 3 Feb 2022 16:51:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to the JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Documentation updates

doc/building.md line 141:

> 139: 
> 140: In order to use Branch Protection features in the VM, `--enable-branch-protection`
> 141: must be provided. This requires compiler support (GCC 9.1.0+ or Clang 10+). The

Suggestion:

must be used. This option requires C++ compiler support (GCC 9.1.0+ or Clang 10+). The

doc/building.md line 143:

> 141: must be provided. This requires compiler support (GCC 9.1.0+ or Clang 10+). The
> 142: resulting build can be run on both machines with and without support for branch
> 143: protection in hardware. This is only supported for Linux targets.

Suggestion:

protection in hardware. Branch Protection is only supported for Linux targets.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:02:11 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:02:11 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
Message-ID: <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>

On Mon, 7 Feb 2022 10:55:52 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Documentation updates
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5293:
> 
>> 5291: // Create an additional frame for a function.
>> 5292: void MacroAssembler::enter_subframe() {
>> 5293:   // Addresses can only be signed once, so strip it first. PAC safe because the value is not
> 
> This needs a more descriptive name. `enter_and_sign()` ? No, that's not right either. How do we come up with a name that's more descriptive?

Because enter always enters a subframe. That's what it's for.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:02:11 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:02:11 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
Message-ID: <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>

On Thu, 3 Feb 2022 16:51:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Documentation updates

src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 1163:

> 1161: #undef INSN
> 1162: 
> 1163:   // PAC branch instructions (with register modifier)

This section title is wrong. According to DDI0487G, the correct section title is "Unconditional branch (register)". All of the instructions in each section of this file should be grouped in the same way that they are in the Arm ARM.

src/hotspot/cpu/aarch64/frame_aarch64.cpp line 275:

> 273:   if (TracePcPatching) {
> 274:     tty->print_cr("patch_pc at address " INTPTR_FORMAT " [" INTPTR_FORMAT " -> " INTPTR_FORMAT "]",
> 275:                   p2i(pc_addr), p2i(*pc_addr), p2i(signed_pc));

Let's see both pc and signed pc here.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5293:

> 5291: // Create an additional frame for a function.
> 5292: void MacroAssembler::enter_subframe() {
> 5293:   // Addresses can only be signed once, so strip it first. PAC safe because the value is not

This needs a more descriptive name. `enter_and_sign()` ? No, that's not right either. How do we come up with a name that's more descriptive?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:09:14 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:09:14 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
Message-ID: <z94AJk6GHQRxUcbNQdGd3K_1M2XUJoFE9ZcgoydG0h8=.2d7fe08f-475e-4625-84f6-6503d52d42b9@github.com>

On Mon, 7 Feb 2022 10:58:59 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Documentation updates
>
> src/hotspot/cpu/aarch64/frame_aarch64.cpp line 275:
> 
>> 273:   if (TracePcPatching) {
>> 274:     tty->print_cr("patch_pc at address " INTPTR_FORMAT " [" INTPTR_FORMAT " -> " INTPTR_FORMAT "]",
>> 275:                   p2i(pc_addr), p2i(*pc_addr), p2i(signed_pc));
> 
> Let's see both pc and signed pc here.

> Let's see both pc and signed pc here, if they are different.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:09:14 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:09:14 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
Message-ID: <RzxoYr-y8OWsLLQtdUnZeT2E3P3ScmXQ_Kr8Rp-NQyc=.4458672a-1b51-424c-84c8-abd7b6cf4518@github.com>

On Thu, 3 Feb 2022 16:51:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Documentation updates

src/hotspot/os_cpu/linux_aarch64/pauth_linux_aarch64.inline.hpp line 57:

> 55:     register address r17 __asm("r17") = ret_addr;
> 56:     register address r16 __asm("r16") = sp;
> 57:     asm volatile (PACIA1716 : "+r"(r17) : "r"(r16));

I don't see the point of `volatile` here, any more than you'd use volatile on an addition. `volatile` is when you have a side effect you care about.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:17:18 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:17:18 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
Message-ID: <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>

On Thu, 3 Feb 2022 16:51:48 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Documentation updates

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5328:

> 5326: // Uses the FP from the start of the function as the modifier - which is stored at the address of
> 5327: // the current FP.
> 5328: //

Is it? C2 uses FP as a scratch register. I guess we know that this is never used in C2-generated code? I'm tempted to put an assertion here, just in case. Or does it not matter?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 11:32:15 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 11:32:15 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <z94AJk6GHQRxUcbNQdGd3K_1M2XUJoFE9ZcgoydG0h8=.2d7fe08f-475e-4625-84f6-6503d52d42b9@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <z94AJk6GHQRxUcbNQdGd3K_1M2XUJoFE9ZcgoydG0h8=.2d7fe08f-475e-4625-84f6-6503d52d42b9@github.com>
Message-ID: <NPNfy4lZLmfwsENiGsvDorKLZ_FeiAYzHougna3gwMs=.7b4e3251-9ff3-4886-98ea-75a6f15c2dbb@github.com>

On Mon, 7 Feb 2022 11:06:20 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/frame_aarch64.cpp line 275:
>> 
>>> 273:   if (TracePcPatching) {
>>> 274:     tty->print_cr("patch_pc at address " INTPTR_FORMAT " [" INTPTR_FORMAT " -> " INTPTR_FORMAT "]",
>>> 275:                   p2i(pc_addr), p2i(*pc_addr), p2i(signed_pc));
>> 
>> Let's see both pc and signed pc here.
>
>> Let's see both pc and signed pc here, if they are different.

Are you sure? At the moment with PAC we get:

patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc -> 0x00abffffed17b7f8]

With both signed and unsigned you'd have:

patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc (0x0000ffffed17b5fc) -> 0x00abffffed17b7f8 (0x0000ffffed17b7f8)]

I prefer the first - it's shorter and you can infer the address from the signed version. Happy to go with the longer version if you think the shorter version is confusing.
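To illustrate what "infer the address from the signed version" means for the values above, here is a minimal sketch. It assumes a user-space address with a 48-bit virtual address size, so the PAC sits entirely in bits 63:48 and those bits are zero in the unsigned address. The helper name `strip_pac_assuming_48bit_va` and the mask constant are hypothetical and not part of the patch; real code would use an XPAC instruction (for example XPACLRI), because the position of the PAC field depends on the configured address size.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not HotSpot code).
// Assumption: user-space address, 48-bit VA. Bits 63:48 then hold the
// PAC (and any tag) and are zero in the unsigned address.
constexpr std::uint64_t kAssumedVaMask = (std::uint64_t{1} << 48) - 1;

// Clears the assumed PAC bits, leaving the plain 48-bit address.
constexpr std::uint64_t strip_pac_assuming_48bit_va(std::uint64_t signed_pc) {
  return signed_pc & kAssumedVaMask;
}
```

Under that assumption, applying the helper to `0x0068ffffed17b5fc` gives `0x0000ffffed17b5fc`, which is the bracketed value in the longer log line.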

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:46:20 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:46:20 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <NPNfy4lZLmfwsENiGsvDorKLZ_FeiAYzHougna3gwMs=.7b4e3251-9ff3-4886-98ea-75a6f15c2dbb@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <z94AJk6GHQRxUcbNQdGd3K_1M2XUJoFE9ZcgoydG0h8=.2d7fe08f-475e-4625-84f6-6503d52d42b9@github.com>
 <NPNfy4lZLmfwsENiGsvDorKLZ_FeiAYzHougna3gwMs=.7b4e3251-9ff3-4886-98ea-75a6f15c2dbb@github.com>
Message-ID: <bUnkLEe5QvHNyxwlD4ILN57TrCyRvdj-n_r3G2AXTMc=.efec51c1-c062-4da4-8c93-7c601923d86d@github.com>

On Mon, 7 Feb 2022 11:28:30 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>>> Let's see both pc and signed pc here, if they are different.
>
> Are you sure? At the moment with PAC we get:
> 
> patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc -> 0x00abffffed17b7f8]
> 
> With both signed and unsigned you'd have:
> 
> patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc (0x0000ffffed17b5fc) -> 0x00abffffed17b7f8 (0x0000ffffed17b7f8)]
> 
> I prefer the first - it's shorter and you can infer the address from the signed version. Happy to go with the longer version if you think the shorter version is confusing.

You've been looking at PAC-signed addresses for a long time. 
Let's see "at address [prev true dest -> new true dest] [signed prev signed dest -> new signed dest]", but only show the signed dests if they're different. So it appears as 
`patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc -> 0x00abffffed17b7f8] [signed 0x0000ffffed17b5fc ->  0x0000ffffed17b7f8]` .
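A rough sketch of that formatting rule, in case it helps. It follows the wording of the description, with the true (stripped) addresses in the first bracket and the signed values in a second bracket that is only emitted when they differ. The function name `format_patch_pc` and its parameters are hypothetical; the real code would go through `tty->print_cr` with `INTPTR_FORMAT` rather than building a `std::string`.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only (not HotSpot code).
// Builds the trace line: true addresses first, signed values in a
// second bracket only when signing actually changed them.
std::string format_patch_pc(std::uint64_t pc_addr,
                            std::uint64_t old_pc, std::uint64_t new_pc,
                            std::uint64_t old_signed, std::uint64_t new_signed) {
  char buf[128];
  std::snprintf(buf, sizeof(buf),
                "patch_pc at address 0x%016" PRIx64
                " [0x%016" PRIx64 " -> 0x%016" PRIx64 "]",
                pc_addr, old_pc, new_pc);
  std::string out(buf);

  // Without PAC the signed and true values are identical, so skip the
  // second bracket and keep the line short.
  if (old_signed != old_pc || new_signed != new_pc) {
    std::snprintf(buf, sizeof(buf),
                  " [signed 0x%016" PRIx64 " -> 0x%016" PRIx64 "]",
                  old_signed, new_signed);
    out += buf;
  }
  return out;
}
```

On a machine without PAC the output is then the same single-bracket line as before.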

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 11:46:21 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 11:46:21 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
Message-ID: <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>

On Mon, 7 Feb 2022 10:57:15 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5293:
>> 
>>> 5291: // Create an additional frame for a function.
>>> 5292: void MacroAssembler::enter_subframe() {
>>> 5293:   // Addresses can only be signed once, so strip it first. PAC safe because the value is not
>> 
>> This needs a more descriptive name. `enter_and_sign()` ? No, that's not right either. How do we come up with a name that's more descriptive?
>
> Because enter always enters a subframe. That's what it's for.

enter_nested() ?
enter_inner() ?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 11:46:21 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 11:46:21 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
Message-ID: <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>

On Mon, 7 Feb 2022 11:11:02 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Documentation updates
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5328:
> 
>> 5326: // Uses the FP from the start of the function as the modifier - which is stored at the address of
>> 5327: // the current FP.
>> 5328: //
> 
> Is it? C2 uses FP as a scratch register. I guess we know that this is never used in C2-generated code? I'm tempted to put an assertion here, just in case. Or does it not matter?

Allocating FP is disabled for rop protection:

aarch64.md has:
// r29 is not allocatable when PreserveFramePointer or ROP protection is on
if (PreserveFramePointer || VM_Version::use_rop_protection()) {

I think that covers it.
What assertion would you want to check?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 11:57:12 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 11:57:12 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <bUnkLEe5QvHNyxwlD4ILN57TrCyRvdj-n_r3G2AXTMc=.efec51c1-c062-4da4-8c93-7c601923d86d@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <z94AJk6GHQRxUcbNQdGd3K_1M2XUJoFE9ZcgoydG0h8=.2d7fe08f-475e-4625-84f6-6503d52d42b9@github.com>
 <NPNfy4lZLmfwsENiGsvDorKLZ_FeiAYzHougna3gwMs=.7b4e3251-9ff3-4886-98ea-75a6f15c2dbb@github.com>
 <bUnkLEe5QvHNyxwlD4ILN57TrCyRvdj-n_r3G2AXTMc=.efec51c1-c062-4da4-8c93-7c601923d86d@github.com>
Message-ID: <RFD3_8xVu95FK1eppLHDTkL3-hdvVXNDOHfzzLd5sb0=.7725501b-cbe1-43a9-8f21-dfa83c9c22f2@github.com>

On Mon, 7 Feb 2022 11:43:13 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Are you sure? At the moment with PAC we get:
>> 
>> patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc -> 0x00abffffed17b7f8]
>> 
>> With both signed and unsigned you'd have:
>> 
>> patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc (0x0000ffffed17b5fc) -> 0x00abffffed17b7f8 (0x0000ffffed17b7f8)]
>> 
>> I prefer the first - it's shorter and you can infer the address from the signed version. Happy to go with the longer version if you think the shorter version is confusing.
>
> You've been looking at PAC-signed addresses for a long time. 
> Let's see "at address [prev true dest -> new true dest] [signed prev signed dest -> new signed dest]", but only show the signed dests if they're different. So it appears as 
> `patch_pc at address 0x0000fffff58edf98 [0x0068ffffed17b5fc -> 0x00abffffed17b7f8] [signed 0x0000ffffed17b5fc ->  0x0000ffffed17b7f8]` .

ok, that looks better than my longer version, I'll go with that

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 11:57:12 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 11:57:12 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
Message-ID: <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>

On Mon, 7 Feb 2022 11:41:57 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5328:
>> 
>>> 5326: // Uses the FP from the start of the function as the modifier - which is stored at the address of
>>> 5327: // the current FP.
>>> 5328: //
>> 
>> Is it? C2 uses FP as a scratch register. I guess we know that this is never used in C2-generated code? I'm tempted to put an assertion here, just in case. Or does it not matter?
>
> Allocating FP is disabled for rop protection:
> 
> aarch64.md has:
> // r29 is not allocatable when PreserveFramePointer or ROP protection is on
> if (PreserveFramePointer || VM_Version::use_rop_protection()) {
> 
> I think that covers it.
> What assertion would you want to check?

If `UseROPProtection` is on, is there any reason not to set `PreserveFramePointer`, and assert here that it is set? It is a crucial assumption, so let's assert it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 12:01:13 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 12:01:13 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
 <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>
Message-ID: <QbRR80JhYnzAnB9o9HMSJFU-lVpINvDXfh4AisP9VEM=.513ff0b8-d42e-4ef1-8c8b-88db1b72b772@github.com>

On Mon, 7 Feb 2022 11:42:43 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> Because enter always enters a subframe. That's what it's for.
>
> enter_nested() ?
> enter_inner() ?

Tell you what, first put a comment here that says when it should (and therefore, should not) be used. Once it's clear exactly what this is for, thinking of a name might be easier.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 12:04:16 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 12:04:16 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
Message-ID: <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>

On Mon, 7 Feb 2022 11:54:09 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Allocating FP is disabled for rop protection:
>> 
>> aarch64.md has:
>> // r29 is not allocatable when PreserveFramePointer or ROP protection is on
>> if (PreserveFramePointer || VM_Version::use_rop_protection()) {
>> 
>> I think that covers it.
>> What assertion would you want to check?
>
> If `UseROPProtection` is on, is there any reason not to set `PreserveFramePointer`, and assert here that it is set? It is a crucial assumption, so let's assert it.

PreserveFramePointer is doing some additional stuff. I'll give it a test to make sure everything still works with PreserveFramePointer fully set. It would make things easier just to force it set with rop protection on.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Mon Feb  7 12:29:21 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 12:29:21 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
Message-ID: <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>

On Mon, 7 Feb 2022 12:01:18 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> PreserveFramePointer is doing some additional stuff. I'll give it a test to make sure everything still works with PreserveFramePointer fully set. It would make things easier just to force it set with rop protection on.

Using PreserveFramePointer greatly simplifies the testing matrix, and has little adverse performance impact beyond disallowing C2 from allocating FP as a scratch register. It also simplifies this patch, which would be a very Good Thing. Let's do it.
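Roughly what I have in mind, as a sketch only. The `Flags` struct and both function names are hypothetical stand-ins for the VM flags, not the actual HotSpot flag-handling code: force `PreserveFramePointer` whenever ROP protection is requested, then assert the invariant at the point where FP is used as the modifier.

```cpp
#include <cassert>

// Hypothetical stand-ins for the VM flags discussed above (not HotSpot code).
struct Flags {
  bool use_rop_protection;
  bool preserve_frame_pointer;
};

// Force PreserveFramePointer on whenever ROP protection is requested.
constexpr Flags normalize(Flags f) {
  if (f.use_rop_protection) {
    f.preserve_frame_pointer = true;
  }
  return f;
}

// Guard for code that uses FP as the PAC modifier: with ROP protection
// on, FP must never be handed out as a scratch register.
inline void assert_fp_usable_as_modifier(const Flags& f) {
  assert((!f.use_rop_protection || f.preserve_frame_pointer) &&
         "ROP protection relies on PreserveFramePointer being set");
}
```

With the flag forced at startup, the assertion documents the assumption and should never fire.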

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From jiefu at openjdk.java.net  Mon Feb  7 12:43:02 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 7 Feb 2022 12:43:02 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <Zm18xeCkseqIKo3dnYrNB25uyFpjHCMFounlerKYsng=.0f7de078-f9d8-4e00-9908-71345e018085@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>
 <Uzm608EYxhqmhpzYhKRiv5ecj75iov26SuxUogz9G9s=.43c1a92c-ae0d-480e-a25f-3f116c758134@github.com>
 <Zm18xeCkseqIKo3dnYrNB25uyFpjHCMFounlerKYsng=.0f7de078-f9d8-4e00-9908-71345e018085@github.com>
Message-ID: <FK6UYDqovtJ7RjWmfiSQ6p006jD6QUB5PkNE4rH6GZc=.3a4355db-4986-405a-bdd8-04d8e58bd020@github.com>

On Mon, 7 Feb 2022 07:27:57 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>>> Since this PR touches stackoverflow.hpp, could you also take a look at this? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stackOverflow.cpp#L66
>>> 
>>> We actually get the page size from the OS. Why do we need alignment = 4K?
>> 
>> Look here: https://github.com/openjdk/jdk/blob/48523b090886f7b24ed4009f0c150efaa6f7b056/src/hotspot/share/runtime/stackOverflow.cpp#L42-L45 -- the `StackYellowPages`, `StackRedPages`, `StackShadowPages` are defined as 4K pages. It should probably be called `unit`, not `alignment`. I'd like to avoid scope creep for this PR, so that's for another day.
>
>> > Since this PR touches stackoverflow.hpp, could you also take a look at this? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/stackOverflow.cpp#L66
>> > We actually get the page size from the OS. Why do we need alignment = 4K?
>> 
>> Look here:
>> 
>> https://github.com/openjdk/jdk/blob/48523b090886f7b24ed4009f0c150efaa6f7b056/src/hotspot/share/runtime/stackOverflow.cpp#L42-L45
>> -- the `StackYellowPages`, `StackRedPages`, `StackShadowPages` are defined as 4K pages. It should probably be called `unit`, not `alignment`. I'd like to avoid scope creep for this PR, so that's for another day.
> 
> Done in #7362.

Hi @shipilev ,

Did you test the perf improvement based on the latest JDK?
I tried to test SPECjvm2008's `compiler.compiler` with jdk19, but it failed with:

  Benchmark:   compiler.compiler
  Run mode:    timed run
  Test type:   multi
  Threads:     8
  Warmup:      120s
  Iterations:  1
  Run length:  240s
Error in setup of Benchmark.
spec.harness.StopBenchmarkException: Error invoking bmSetupBenchmarkMethod
        at spec.harness.ProgramRunner.invokeBmSetupBenchmark(ProgramRunner.java:185)
        at spec.harness.ProgramRunner.runBenchmark(ProgramRunner.java:301)
        at spec.harness.ProgramRunner.run(ProgramRunner.java:98)
Caused by: java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:119)
        at java.base/java.lang.reflect.Method.invoke(Method.java:577)
        at spec.harness.ProgramRunner.invokeBmSetupBenchmark(ProgramRunner.java:183)
        ... 2 more
Caused by: java.lang.NoClassDefFoundError: com/sun/tools/javac/util/JavacFileManager
        at java.base/java.lang.ClassLoader.defineClass1(Native Method)
        at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1013)
        at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
        at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
        at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
        at spec.benchmarks.compiler.MainBase.preSetupBenchmark(MainBase.java:38)
        at spec.benchmarks.compiler.compiler.Main.setupBenchmark(Main.java:38)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
        ... 4 more
Caused by: java.lang.ClassNotFoundException: com.sun.tools.javac.util.JavacFileManager
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
        ... 16 more

Warmup (120s) begins: Mon Feb 07 20:35:40 CST 2022
Warmup (120s) ends:   Mon Feb 07 20:35:40 CST 2022
Warmup (120s) result:  **NOT VALID**


Am I missing something?
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Mon Feb  7 13:47:14 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 13:47:14 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <QbRR80JhYnzAnB9o9HMSJFU-lVpINvDXfh4AisP9VEM=.513ff0b8-d42e-4ef1-8c8b-88db1b72b772@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
 <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>
 <QbRR80JhYnzAnB9o9HMSJFU-lVpINvDXfh4AisP9VEM=.513ff0b8-d42e-4ef1-8c8b-88db1b72b772@github.com>
Message-ID: <-V7ptCS4QdcpFHOomMnTPPYvFtKSQ0nswzFNXQDoWLg=.2d72897f-ef45-4867-892f-64df085eca85@github.com>

On Mon, 7 Feb 2022 11:58:22 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> enter_nested() ?
>> enter_inner() ?
>
> Tell you what, first put a comment here that says when it should (and therefore, should not) be used. Once it's clear exactly what this is for, thinking of a name might be easier.

How about extending the existing enter() function: 

// Enter a new stack frame for the current method.
// nested:     Indicates a frame has already been entered (and not left) for the current method. 
void MacroAssembler::enter(bool nested=false) {
   if (nested) strip()
   protect()
   stp()
   mov()
}

This would add an additional bool check for every call of enter() - that's at code generation time, so probably not an issue.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb  7 13:58:51 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 7 Feb 2022 13:58:51 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v19]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <jLq9QGE621_DBir4higkvKirBRyAMjZ43S1A8Gn1lKQ=.7c184c9c-24f0-4e9f-aa28-c1ccaf9eb7af@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Review fixups

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/d97883b5..614a3262

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=18
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=17-18

  Stats: 20 lines in 4 files changed: 7 ins; 6 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From coleenp at openjdk.java.net  Mon Feb  7 14:41:06 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 7 Feb 2022 14:41:06 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <bcd3iyUdLhW_zzhY_qsjdi0YXqKvDV0REIFj8houDpk=.90fe0c25-d38e-4e7b-b68a-c1922f396ca7@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <cqdhFR25gT5pA4vw6ZYjrcABpLUixsu16zmHwdhPNL4=.e66854cf-49bb-468d-b542-eeba25c33d3f@github.com>
 <bcd3iyUdLhW_zzhY_qsjdi0YXqKvDV0REIFj8houDpk=.90fe0c25-d38e-4e7b-b68a-c1922f396ca7@github.com>
Message-ID: <Z77feUOqTsVXATzFbdS9oHDoU5SKID8vU6LHsThGnpc=.70376261-90b6-484f-b387-6986fb6f6c02@github.com>

On Mon, 7 Feb 2022 14:20:54 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Rectify comment "(c)"
>
> src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 758:
> 
>> 756:     }
>> 757: 
>> 758:     __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit()));
> 
> stack watermark starts at stack_base, increase to current rsp to optimize away stack banging for rsp greater than this.  Cannot increase watermark if esp < shadow_zone_safe_limit because of ...

I'm looking for a brief comment here explaining why.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From coleenp at openjdk.java.net  Mon Feb  7 14:41:06 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 7 Feb 2022 14:41:06 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <cqdhFR25gT5pA4vw6ZYjrcABpLUixsu16zmHwdhPNL4=.e66854cf-49bb-468d-b542-eeba25c33d3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <cqdhFR25gT5pA4vw6ZYjrcABpLUixsu16zmHwdhPNL4=.e66854cf-49bb-468d-b542-eeba25c33d3f@github.com>
Message-ID: <bcd3iyUdLhW_zzhY_qsjdi0YXqKvDV0REIFj8houDpk=.90fe0c25-d38e-4e7b-b68a-c1922f396ca7@github.com>

On Sun, 6 Feb 2022 08:03:39 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Rectify comment "(c)"

This looks like a nice optimization to me.  I initially thought that calling the new limit of where we can elide stack banging "watermark" had something to do with the GC stack watermark code but "watermark" is sort of the best word for this.  If we had another descriptive word that might be better, but I can't think of anything.
Thank you for fixing this! We didn't have tests showing the motivation ourselves so sorry that we ignored it for so long.

src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 758:

> 756:     }
> 757: 
> 758:     __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit()));

stack watermark starts at stack_base, increase to current rsp to optimize away stack banging for rsp greater than this.  Cannot increase watermark if esp < shadow_zone_safe_limit because of ...
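
A rough model of the watermark idea discussed above, as a sketch only. It assumes the stack grows toward lower addresses, that the watermark starts at the stack base, and that the watermark is only moved while the stack pointer is still above the safe limit. `ToyThreadStack` and `count_bangs` are hypothetical names for illustration, not HotSpot code, and the "bang" is just a counter:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of one thread's stack; addresses are plain integers and the
// stack grows toward lower addresses (assumption for this sketch).
struct ToyThreadStack {
  std::uintptr_t safe_limit;  // below this, a full shadow zone no longer fits
  std::uintptr_t watermark;   // deepest sp we have already banged from
  int bang_count = 0;         // stand-in for touching the shadow pages

  ToyThreadStack(std::uintptr_t base, std::uintptr_t limit)
    : safe_limit(limit), watermark(base) {}

  // Models one interpreter method entry at stack pointer `sp`.
  // Returns true if the shadow zone was banged on this entry.
  bool on_method_entry(std::uintptr_t sp) {
    if (sp > watermark) {
      return false;           // shallower than a previous bang: skip
    }
    ++bang_count;             // bang the whole shadow zone
    if (sp > safe_limit) {
      watermark = sp;         // remember how deep we have banged from
    }                         // at or below the safe limit: keep banging every time
    return true;
  }
};

// Replays a sequence of stack pointers and reports how many entries banged.
inline int count_bangs(std::uintptr_t base, std::uintptr_t limit,
                       const std::vector<std::uintptr_t>& sps) {
  ToyThreadStack stack(base, limit);
  for (std::uintptr_t sp : sps) {
    stack.on_method_entry(sp);
  }
  return stack.bang_count;
}
```

In this model, repeated entries at or above an already-banged depth cost only the comparison, which is the effect the benchmarks in the PR description are measuring.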

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7247

From chagedorn at openjdk.java.net  Mon Feb  7 15:04:10 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Mon, 7 Feb 2022 15:04:10 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v3]
In-Reply-To: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
Message-ID: <N3jsXjCt7lKMP99M8f_JY5ncSmdGGQYdCe1AAc0wo98=.1e21096c-812d-46a1-8d98-b644d2a78b6e@github.com>

> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
> 
> Stack: [0x00007f6e01739000,0x00007f6e0183a000],  sp=0x00007f6e01838110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f
> 
> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
> 
> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
> 
> Stack: [0x00007f34fca18000,0x00007f34fcb19000],  sp=0x00007f34fcb17110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec  (c1_Compiler.cpp:250)
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899  (compileBroker.cpp:2291)
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df  (compileBroker.cpp:1966)
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69  (compilerThread.cpp:59)
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d  (thread.cpp:1297)
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167  (thread.cpp:1280)
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180  (thread.cpp:358)
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f  (os_linux.cpp:705)
> 
> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC versions may soon generate DWARF 5 by default, in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
> 
> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. 
> 
> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.
> 
> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. 
> 
> **Testing:**
> Apart from manual testing, I've added two kinds of tests:
> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.
> 
> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.
> 
> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional  `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.
> 
> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
> 
> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`!
>  
> Thanks,
> Christian

Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:

  Change log_* to log_develop_* and log_warning to log_develop_info

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7126/files
  - new: https://git.openjdk.java.net/jdk/pull/7126/files/7ddb7737..698663b9

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=01-02

  Stats: 74 lines in 2 files changed: 0 ins; 0 del; 74 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7126/head:pull/7126

PR: https://git.openjdk.java.net/jdk/pull/7126

From aph at openjdk.java.net  Mon Feb  7 15:15:17 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 7 Feb 2022 15:15:17 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <-V7ptCS4QdcpFHOomMnTPPYvFtKSQ0nswzFNXQDoWLg=.2d72897f-ef45-4867-892f-64df085eca85@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
 <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>
 <QbRR80JhYnzAnB9o9HMSJFU-lVpINvDXfh4AisP9VEM=.513ff0b8-d42e-4ef1-8c8b-88db1b72b772@github.com>
 <-V7ptCS4QdcpFHOomMnTPPYvFtKSQ0nswzFNXQDoWLg=.2d72897f-ef45-4867-892f-64df085eca85@github.com>
Message-ID: <-nQf8_Gh666U_KH2wCMBEApxI3GFXre1cghHN41KoVg=.c0bc85fd-16ed-49f5-a595-73893facf6df@github.com>

On Mon, 7 Feb 2022 13:43:55 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> Tell you what, first put a comment here that says when it should (and therefore, should not) be used. Once it's clear exactly what this is for, thinking of a name might be easier.
>
> How about extending the existing enter() function: 
> 
> // Enter a new stack frame for the current method.
> // nested:     Indicates a frame has already been entered (and not left) for the current method. 
> void MacroAssembler::enter(bool nested=false) {
>    if (nested) strip()
>    protect()
>    stp()
>    mov()
> }
> 
> This would add an additional bool check for every call of enter() - that's at code generation time, so probably not an issue.

So, `nested` is true iff we are, say, pushing an extra frame for a runtime call in the middle of generated code, but for some mysterious reason the logic is inline instead of being implemented in the obvious way as a stub.

Please do this as:

` MacroAssembler::enter(bool strip_return_address=false)`

Please make sure that all calls are commented, as in

`__ enter(/*strip_return_address*/true);`

and I'll be happy.
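
For illustration, a minimal sketch of the shape being asked for. It assumes a toy assembler that records step labels instead of emitting AArch64 instructions; `ToyAssembler`, `emit_enter` and the labels "strip", "protect", "stp", "mov" are hypothetical and only mirror the pseudocode quoted above, not HotSpot's actual MacroAssembler:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an assembler: records the steps it would emit.
struct ToyAssembler {
  std::vector<std::string> emitted;

  // Enter a new stack frame for the current method.
  // strip_return_address: pass true when a frame has already been entered
  // (and not left) for the current method.
  void enter(bool strip_return_address = false) {
    if (strip_return_address) {
      emitted.push_back("strip");    // undo the earlier signing first
    }
    emitted.push_back("protect");    // sign the return address
    emitted.push_back("stp");        // save frame pointer and link register
    emitted.push_back("mov");        // set up the new frame pointer
  }
};

// Shows both call styles; the flagged call carries the parameter-name comment.
inline std::vector<std::string> emit_enter(bool nested) {
  ToyAssembler masm;
  if (nested) {
    masm.enter(/*strip_return_address*/true);
  } else {
    masm.enter();
  }
  return masm.emitted;
}
```

The check on the flag happens when the code is generated, not when the generated code runs, so the default-argument form adds nothing to the emitted sequence for ordinary calls.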

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From shade at openjdk.java.net  Mon Feb  7 15:28:08 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 7 Feb 2022 15:28:08 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <FK6UYDqovtJ7RjWmfiSQ6p006jD6QUB5PkNE4rH6GZc=.3a4355db-4986-405a-bdd8-04d8e58bd020@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <X43GMemeHr1OkAWD1lNO8FOQOcMff2zW2-ikz8v2mVI=.a4784b47-7a42-4fbd-830c-582ac8a8de52@github.com>
 <Uzm608EYxhqmhpzYhKRiv5ecj75iov26SuxUogz9G9s=.43c1a92c-ae0d-480e-a25f-3f116c758134@github.com>
 <Zm18xeCkseqIKo3dnYrNB25uyFpjHCMFounlerKYsng=.0f7de078-f9d8-4e00-9908-71345e018085@github.com>
 <FK6UYDqovtJ7RjWmfiSQ6p006jD6QUB5PkNE4rH6GZc=.3a4355db-4986-405a-bdd8-04d8e58bd020@github.com>
Message-ID: <Qf63BIbr2nvNCQfh_qRYGLFH1cXcK-hw3LsDxLlEMLc=.54525c1b-b1d4-499d-b168-98f090a4d0fc@github.com>

On Mon, 7 Feb 2022 12:40:24 GMT, Jie Fu <jiefu at openjdk.org> wrote:

> I tried to test SPECjvm2008's `compiler.compiler` with jdk19.

I don't think the currently public SPECjvm2008 works with JDK 19, due to missing dependencies. I have a hacky version that works with modern JDKs. I used that to estimate performance on JDK mainline.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Mon Feb  7 15:42:46 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 7 Feb 2022 15:42:46 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v3]
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  More comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7247/files
  - new: https://git.openjdk.java.net/jdk/pull/7247/files/c3983819..2c710882

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=01-02

  Stats: 4 lines in 2 files changed: 3 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7247.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7247/head:pull/7247

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Mon Feb  7 15:42:48 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 7 Feb 2022 15:42:48 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v2]
In-Reply-To: <Z77feUOqTsVXATzFbdS9oHDoU5SKID8vU6LHsThGnpc=.70376261-90b6-484f-b387-6986fb6f6c02@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <cqdhFR25gT5pA4vw6ZYjrcABpLUixsu16zmHwdhPNL4=.e66854cf-49bb-468d-b542-eeba25c33d3f@github.com>
 <bcd3iyUdLhW_zzhY_qsjdi0YXqKvDV0REIFj8houDpk=.90fe0c25-d38e-4e7b-b68a-c1922f396ca7@github.com>
 <Z77feUOqTsVXATzFbdS9oHDoU5SKID8vU6LHsThGnpc=.70376261-90b6-484f-b387-6986fb6f6c02@github.com>
Message-ID: <e-Wf3Lo9wS02RMi0Hfhkwmvzf29IYOUvpUhTYGly51s=.c756e043-0b23-4c68-afd7-05351e281f90@github.com>

On Mon, 7 Feb 2022 14:23:38 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 758:
>> 
>>> 756:     }
>>> 757: 
>>> 758:     __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit()));
>> 
>> stack watermark starts at stack_base, increase to current rsp to optimize away stack banging for rsp greater than this.  Cannot increase watermark if esp < shadow_zone_safe_limit because of ...
>
> looking for a brief comment why here.

See new commit. I realized I need to write down explicitly that the growth watermark is always above the safe limit, for the check to be safe.
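
For readers following along, here is a minimal sketch of the check being discussed, written as a plain C++ model rather than as generated interpreter code. It assumes stacks grow downwards, that the watermark starts at the stack base, and that the watermark only moves when the current stack pointer is above the safe limit. `ShadowZoneModel`, its fields, and `on_method_entry` are hypothetical names for illustration; this is not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the per-thread state discussed above.
// Stacks grow downwards, so "above" means a numerically larger address.
struct ShadowZoneModel {
  uintptr_t safe_limit;        // at or below this, the watermark must not move
  uintptr_t growth_watermark;  // lowest SP whose shadow zone was already banged

  ShadowZoneModel(uintptr_t stack_base, uintptr_t limit)
      : safe_limit(limit), growth_watermark(stack_base) {
    assert(stack_base > limit);
  }

  // Returns true if this method entry had to bang the shadow zone.
  bool on_method_entry(uintptr_t sp) {
    if (sp > growth_watermark) {
      return false;  // a deeper frame already banged everything we need
    }
    // ... the real interpreter would touch the shadow zone pages here ...
    if (sp > safe_limit) {
      growth_watermark = sp;  // remember how deep we have banged
    }
    // Invariant from the discussion: the watermark stays above the safe limit.
    assert(growth_watermark > safe_limit);
    return true;
  }
};
```

In this model, a frame entered below the safe limit bangs on every entry and never moves the watermark, which is what keeps the first comparison safe.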

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Mon Feb  7 15:50:11 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 7 Feb 2022 15:50:11 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v3]
In-Reply-To: <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>
Message-ID: <8OYBV_pnf17dKEOSLQPB7HucivwweoUv9apRxVd_Oik=.2495bcae-b6f9-46fe-a3bc-00d32deba2ae@github.com>

On Mon, 7 Feb 2022 15:42:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   More comments

Fiddly code, documentation update for the core subsystem, so:

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From sgehwolf at redhat.com  Mon Feb  7 18:36:25 2022
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 07 Feb 2022 19:36:25 +0100
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
Message-ID: <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>

On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
> Case (4) is the cause for the bug in JDK-8279484
> 
> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2. 
> This means:
> 
> - This container is guaranteed a minimum amount of CPU resources
> - If no other containers are executing, this container can use as
>   much CPU as available on the host
> - If other containers are executing, the amount of CPU available
>   to this container is (2 / (sum of cpu.shares of all active
>   containers))
> 
> 
> The fundamental problem with the current JVM implementation is that it 
> treats "CPU request" as a maximum value, the opposite of what Kubernetes 
> does. Because of this, in case (4), the JVM artificially limits itself 
> to a single CPU. This leads to CPU underutilization.

I agree with your analysis. Key point is that in such a setup
Kubernetes sets CPU shares value to 2. Though, it's a very specific
case.

In contrast to Kubernetes the JVM doesn't have insight into what other
containers are doing (or how they are configured). It would, perhaps,
be good to know what Kubernetes does for containers when the
environment (i.e. other containers) changes. Do they get restarted?
Restarted with different values for cpu shares?

Either way, what are our options to fix this? Does it need fixing?

 * Should we no longer take cpu shares as a means to limit CPU into
   account? It would be a significant change to how previous JDKs
   worked. Maybe that wouldn't be such a bad idea :)
 * How likely is CPU underutilization to happen in practice?
   Considering the container is not the only container on the node,
   then according to your formula, it'll get one CPU or less anyway.
   Underutilization would, thus, only happen when it's an idle node
   with no other containers running. That would suggest doing nothing
   and letting the user override it as they see fit.
 * Something else I'm missing?
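
To put rough numbers on the formula quoted above, here is a minimal sketch. It assumes that, when every active container is busy, CPU time is split in proportion to cpu.shares. `cpus_under_contention` is a hypothetical helper for illustration only; it is not part of the JDK or Kubernetes.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: CPUs available to one container when every active
// container is busy, using the proportional formula quoted above.
// `all_shares` must include the container's own shares.
double cpus_under_contention(int own_shares,
                             const std::vector<int>& all_shares,
                             int host_cpus) {
  long total = std::accumulate(all_shares.begin(), all_shares.end(), 0L);
  assert(own_shares > 0 && total >= own_shares && host_cpus > 0);
  return static_cast<double>(host_cpus) * own_shares / total;
}
```

With shares of 2 next to a single busy container at 1024 shares on a 16-CPU host, this gives well under one CPU; with no other active container it gives the whole host.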

Thanks,
Severin


From hseigel at openjdk.java.net  Mon Feb  7 21:36:30 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Mon, 7 Feb 2022 21:36:30 GMT
Subject: RFR: 8281400: Remove unused wcslen() function from
 globalDefinitions_gcc.hpp
Message-ID: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>

Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.

Thanks, Harold

-------------

Commit messages:
 - 8281400: Remove unused wcslen() function from globalDefinitions_gcc.hpp

Changes: https://git.openjdk.java.net/jdk/pull/7374/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7374&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281400
  Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7374.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7374/head:pull/7374

PR: https://git.openjdk.java.net/jdk/pull/7374

From dcubed at openjdk.java.net  Mon Feb  7 21:54:08 2022
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Mon, 7 Feb 2022 21:54:08 GMT
Subject: RFR: 8281400: Remove unused wcslen() function from
 globalDefinitions_gcc.hpp
In-Reply-To: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
Message-ID: <D4HvC2P6XM0y_flgw38oL47EoZnaLfGNg79XDWurIXw=.765ac2aa-92e5-46e1-8386-b16c0740eea4@github.com>

On Mon, 7 Feb 2022 21:30:32 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
> 
> Thanks, Harold

"grep" concurs that this function is only used in os_windows.cpp.
Thumbs up. This is a trivial fix.

Not your problem, but the XLC header has the same issue:
src/hotspot/share/utilities/globalDefinitions_xlc.hpp:inline int wcslen(const jchar* x) { return wcslen((const wchar_t*)x); }

-------------

Marked as reviewed by dcubed (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7374

From duke at openjdk.java.net  Tue Feb  8 01:29:10 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Tue, 8 Feb 2022 01:29:10 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <RbRbSSHs-0mMcE3rvfpCT2J50xia4gV2jxkSO8dlRqc=.b1bd375b-e180-47a2-a8ef-fba1ec1cdfe8@github.com>

On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
> because there is no space left on the device.
> Could you please review the fix?

Hi, JFR team

Could somebody please review this fix for 8280684?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From dholmes at openjdk.java.net  Tue Feb  8 02:38:10 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 8 Feb 2022 02:38:10 GMT
Subject: RFR: 8281400: Remove unused wcslen() function from
 globalDefinitions_gcc.hpp
In-Reply-To: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
Message-ID: <LqeKjedCRlYguKt87tBUjAh9yBf13X2KX-tXEzDc_Jk=.83e25e5b-9f94-4038-a1c3-8e86992dbe9e@github.com>

On Mon, 7 Feb 2022 21:30:32 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
> 
> Thanks, Harold

Please fix both files and update the JBS issue title etc.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7374

From jiefu at openjdk.java.net  Tue Feb  8 04:00:04 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Tue, 8 Feb 2022 04:00:04 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v3]
In-Reply-To: <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>
Message-ID: <LfBAwVzTXxY4AAqrVp_bJO50dHsv7CU4z0YlFKNKs7c=.cca9375e-729e-49d6-9f5d-dfbdbd491f65@github.com>

On Mon, 7 Feb 2022 15:42:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   More comments

src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 739:

> 737:   __ cmpptr(Address(thread, JavaThread::shadow_zone_safe_limit()), (int32_t)NULL_WORD);
> 738:   __ jcc(Assembler::notEqual, L_good_limit);
> 739:     __ stop("shadow zone safe limit is not initialized");

This indentation seems strange to me.
Also see lines {745, 762}.

And we'd better update the copyright year for all touched files.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From ioi.lam at oracle.com  Tue Feb  8 06:29:52 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 7 Feb 2022 22:29:52 -0800
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
Message-ID: <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>

On 2022/02/07 10:36, Severin Gehwolf wrote:
> On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
>> Case (4) is the cause for the bug in JDK-8279484
>>
>> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2.
>> This means:
>>
>> - This container is guaranteed a minimum amount of CPU resources
>> - If no other containers are executing, this container can use as
>>    much CPU as available on the host
>> - If other containers are executing, the amount of CPU available
>>    to this container is (2 / (sum of cpu.shares of all active
>>    containers))
>>
>>
>> The fundamental problem with the current JVM implementation is that it
>> treats "CPU request" as a maximum value, the opposite of what Kubernetes
>> does. Because of this, in case (4), the JVM artificially limits itself
>> to a single CPU. This leads to CPU underutilization.
> I agree with your analysis. Key point is that in such a setup
> Kubernetes sets CPU shares value to 2. Though, it's a very specific
> case.
>
> In contrast to Kubernetes the JVM doesn't have insight into what other
> containers are doing (or how they are configured). It would, perhaps,
> be good to know what Kubernetes does for containers when the
> environment (i.e. other containers) changes. Do they get restarted?
> Restarted with different values for cpu shares?

My understanding is that Kubernetes will try to do load balancing and 
may migrate the containers. According to this:

https://stackoverflow.com/questions/64891872/kubernetes-dynamic-configurationn-of-cpu-resource-limit

If you change the CPU limits, a currently running container will be shut 
down and restarted (using the new limit), and may be relocated to a 
different host if necessary.

I think this means that a JVM process doesn't need to worry about the 
CPU limit changing during its lifetime :-)

> Either way, what are our options to fix this? Does it need fixing?
>
>   * Should we no longer take cpu shares as a means to limit CPU into
>     account? It would be a significant change to how previous JDKs
>     worked. Maybe that wouldn't be such a bad idea :)

I think we should get rid of it. This feature was designed to work with 
Kubernetes, but has no effect in most cases. The only time it takes 
effect (when no resource limits are set) it does the opposite of what 
the user expects.

Also, the current implementation is really tied to specific behaviors of 
Kubernetes + docker (the 1024 and 100 constants). This will cause 
problems with other container/orchestration software that use different 
algorithms and constants.
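
For illustration, here is a minimal sketch of the kind of mapping under discussion. It assumes cpu.shares is turned into a CPU count by treating 1024 shares as one CPU, rounding up, and capping at the host's CPU count. The function name and the exact rounding and capping are assumptions made for this example, not a copy of the JVM source.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: the CPU count a runtime would assume if cpu.shares
// were treated as an upper bound, with 1024 shares standing for one CPU.
int cpus_from_shares(int shares, int host_cpus) {
  assert(shares > 0 && host_cpus > 0);
  const int per_cpu_shares = 1024;  // assumed constant, as mentioned above
  int wanted = (shares + per_cpu_shares - 1) / per_cpu_shares;  // round up
  return std::min(wanted, host_cpus);
}
```

Under these assumptions a container with cpu.shares of 2 is sized for a single CPU regardless of how many CPUs the host has, which is the underutilization in case (4).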

>   * How likely is CPU underutilization to happen in practice?
>     Considering the container is not the only container on the node,
>     then according to your formula, it'll get one CPU or less anyway.
>     Underutilization would, thus, only happen when it's an idle node
>     with no other containers running. That would suggest doing nothing
>     and letting the user override it as they see fit.

I think underutilization happens when the containers have a bursty
usage pattern. If other containers do not fully utilize their CPU
quotas, we should distribute the unused CPUs to the busy containers.

Thanks
- Ioi

>   * Something else I'm missing?
>
> Thanks,
> Severin
>


From shade at openjdk.java.net  Tue Feb  8 07:18:52 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Feb 2022 07:18:52 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v4]
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <qKekSNhm3jZ-DYIXCF0AmNtv2uiLAJhrVaCCbmkyYqU=.27f1344d-dae2-4afd-bf8a-8ff025af0816@github.com>

> This is an old issue; I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`
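
To illustrate the watermark idea from the description above, here is a
minimal toy model, not the code in the patch. It assumes a stack that
grows downwards and a fixed shadow zone size; the struct, the function
names and the constant 20 are hypothetical, and the "safe limit" part of
the scheme is left out entirely. It only counts how many page touches
each strategy would pay for.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model of a downward-growing thread stack. All names and constants
// here are hypothetical and are not taken from the patch.
struct ShadowZoneModel {
  static constexpr std::size_t kShadowPages = 20;  // assumed shadow zone size, in pages

  std::uintptr_t growth_watermark;  // lowest SP whose shadow zone was already banged
  std::size_t pages_touched = 0;    // page touches paid for so far

  explicit ShadowZoneModel(std::uintptr_t stack_base)
      : growth_watermark(stack_base) {}

  // Baseline behaviour: touch every shadow page on every method entry.
  void enter_always_bang(std::uintptr_t /*sp*/) { pages_touched += kShadowPages; }

  // Watermark behaviour: bang only when SP is deeper than ever before.
  void enter_with_watermark(std::uintptr_t sp) {
    if (sp >= growth_watermark) {
      return;  // already banged by an earlier entry at this depth or deeper
    }
    pages_touched += kShadowPages;
    growth_watermark = sp;
  }
};

// Simulates `calls` method entries that all happen at the same stack depth
// and returns the total number of page touches.
std::size_t simulate_entries(bool use_watermark, int calls) {
  const std::uintptr_t stack_base = 1u << 20;
  const std::uintptr_t sp = stack_base - 256;
  ShadowZoneModel model(stack_base);
  for (int i = 0; i < calls; ++i) {
    if (use_watermark) {
      model.enter_with_watermark(sp);
    } else {
      model.enter_always_bang(sp);
    }
  }
  return model.pages_touched;
}
```

In this model, repeated entries at one depth pay for the shadow zone once
with the watermark and on every call without it, which is the effect the
interpreter-only benchmarks above are measuring.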

Aleksey Shipilev has updated the pull request incrementally with three additional commits since the last revision:

 - Indents
 - Drop the test group definition
 - Update copyrights

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7247/files
  - new: https://git.openjdk.java.net/jdk/pull/7247/files/2c710882..ffd560ab

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=02-03

  Stats: 41 lines in 4 files changed: 0 ins; 28 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7247.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7247/head:pull/7247

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Tue Feb  8 07:18:54 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Feb 2022 07:18:54 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v3]
In-Reply-To: <LfBAwVzTXxY4AAqrVp_bJO50dHsv7CU4z0YlFKNKs7c=.cca9375e-729e-49d6-9f5d-dfbdbd491f65@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <L4GyGWL29f0cjIYWIFomwRJ-60bUQMb1fBVKGfkkdXA=.1794edc2-54e9-4b5c-9fe1-5fd3001087f0@github.com>
 <LfBAwVzTXxY4AAqrVp_bJO50dHsv7CU4z0YlFKNKs7c=.cca9375e-729e-49d6-9f5d-dfbdbd491f65@github.com>
Message-ID: <Dt5-DqRT2bDJrDpeZAOaMLFV2-8q8lvW4xvtLyUb6uM=.efa8c308-046d-4e51-a710-96b9f5e62fc2@github.com>

On Tue, 8 Feb 2022 03:57:07 GMT, Jie Fu <jiefu at openjdk.org> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   More comments
>
> src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp line 739:
> 
>> 737:   __ cmpptr(Address(thread, JavaThread::shadow_zone_safe_limit()), (int32_t)NULL_WORD);
>> 738:   __ jcc(Assembler::notEqual, L_good_limit);
>> 739:     __ stop("shadow zone safe limit is not initialized");
> 
> This indentation seems strange to me.
> Also see lines {745, 762}.
> 
> And we'd better update the copyright year for all touched files.

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From chagedorn at openjdk.java.net  Tue Feb  8 08:17:17 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Tue, 8 Feb 2022 08:17:17 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v4]
In-Reply-To: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
Message-ID: <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>

> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
> 
> Stack: [0x00007f6e01739000,0x00007f6e0183a000],  sp=0x00007f6e01838110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f
> 
> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
> 
> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
> 
> Stack: [0x00007f34fca18000,0x00007f34fcb19000],  sp=0x00007f34fcb17110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec  (c1_Compiler.cpp:250)
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899  (compileBroker.cpp:2291)
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df  (compileBroker.cpp:1966)
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69  (compilerThread.cpp:59)
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d  (thread.cpp:1297)
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167  (thread.cpp:1280)
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180  (thread.cpp:358)
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f  (os_linux.cpp:705)
> 
> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC versions may soon generate DWARF 5 by default, in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
> 
> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This makes it possible to follow the code without needing to dive deep into the spec.
> 
> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.
> 
> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. 
> 
> **Testing:**
> Apart from manual testing, I've added two kinds of tests:
> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (line numbers especially can change quickly). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.
> 
> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.
> 
> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional  `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.
> 
> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
> 
> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`!
>  
> Thanks,
> Christian

Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:

  Make dwarf tag NOT_PRODUCT

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7126/files
  - new: https://git.openjdk.java.net/jdk/pull/7126/files/698663b9..820f0da6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7126/head:pull/7126

PR: https://git.openjdk.java.net/jdk/pull/7126

From duke at openjdk.java.net  Tue Feb  8 09:26:14 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Tue, 8 Feb 2022 09:26:14 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
 <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
Message-ID: <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>

On Mon, 7 Feb 2022 12:25:44 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> PreserveFramePointer is doing some additional stuff. I'll give it a test to make sure everything still works with PreserveFramePointer fully set. It would make things easier just to force it set with rop protection on.
> 
> Using PreserveFramePointer greatly simplifies the testing matrix, and has little adverse performance impact beyond disallowing C2 from allocating FP as a scratch register. It also simplifies this patch, which would be a very Good Thing. Let's do it.

Doing this caused 7 failures across a full jtreg run, namely:

serviceability/sa/ClhsdbFindPC.java#xcomp-core
vmTestbase/jit/misctests/fpustack/GraphApplet.java
vmTestbase/nsk/jdi/MonitorWaitRequest/MonitorWaitRequest001/TestDescription.java
vmTestbase/nsk/jdi/MonitorWaitedRequest/MonitorWaitedRequest001/TestDescription.java
vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java
vmTestbase/nsk/jdwp/ThreadReference/OwnedMonitorsStackDepthInfo/ownedMonitorsStackDepthInfo002/ownedMonitorsStackDepthInfo002.java
vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine/TestDescription.java

I'll investigate.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Tue Feb  8 09:44:16 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Tue, 8 Feb 2022 09:44:16 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
 <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
 <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>
Message-ID: <GXUcAo55K4vReK737EJE8VWuNh5fUP0O01nxczF5fV8=.0bb1ca06-d8d7-4440-92bb-eaad4e22a169@github.com>

On Tue, 8 Feb 2022 09:22:39 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> Doing this caused 7 failures across a full jtreg run, namely:

I'm glad we caught that.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From kbarrett at openjdk.java.net  Tue Feb  8 10:09:48 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 10:09:48 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
Message-ID: <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>

> Please review this change to NonblockingQueue to improve invariants in the
> append operation by making a change in try_pop.
> 
> When taking the last entry in the queue, try_pop needs to do some cleanup of
> the queue fields, setting them to NULL.  The order of those cleanups doesn't
> matter for correctness.  However, setting first _head then _tail permits
> append to assert that _head is NULL when it finds _tail was NULL.  The current
> order (set _tail first, then _head) doesn't permit such an assertion.
> 
> Testing:
> mach5 tier1-3
> 
> I also did lots of testing with this change included while investigating
> JDK-8273383.
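
The ordering argument above can be sketched with a small toy queue. This
is not the NonblockingQueue implementation: the names ToyNode and
ToyQueue are hypothetical, the pop path uses plain stores where the real
try_pop has to deal with concurrent operations, and the sketch is only
meant to be run single-threaded. It shows nothing more than the order
"clear head first, then tail" and the assertion in append that this
order makes possible.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node and queue types, for illustration only.
struct ToyNode {
  int value = 0;
  ToyNode* next = nullptr;
};

struct ToyQueue {
  std::atomic<ToyNode*> head{nullptr};
  std::atomic<ToyNode*> tail{nullptr};

  void append(ToyNode* node) {
    node->next = nullptr;
    ToyNode* old_tail = tail.exchange(node);
    if (old_tail == nullptr) {
      // Queue was empty. Because pop() clears head before tail, seeing a
      // null tail means head has already been cleared as well.
      assert(head.load() == nullptr);
      head.store(node);
    } else {
      old_tail->next = node;
    }
  }

  ToyNode* pop() {
    ToyNode* node = head.load();
    if (node == nullptr) {
      return nullptr;  // empty queue
    }
    if (node->next != nullptr) {
      head.store(node->next);  // more entries remain
    } else {
      // Taking the last entry: clear head first, then tail.
      head.store(nullptr);
      tail.store(nullptr);
    }
    node->next = nullptr;
    return node;
  }
};
```

With the opposite order (tail first, then head), there would be a window
in which tail is already null while head still points at the popped
entry, so the assertion in append could not be stated.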

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Merge branch 'master' into append-invariant
 - minor comment fixes
 - append invariant

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7250/files
  - new: https://git.openjdk.java.net/jdk/pull/7250/files/4559ec8d..9648d183

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7250&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7250&range=00-01

  Stats: 9830 lines in 385 files changed: 6406 ins; 1963 del; 1461 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7250.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7250/head:pull/7250

PR: https://git.openjdk.java.net/jdk/pull/7250

From iwalulya at openjdk.java.net  Tue Feb  8 11:14:12 2022
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Tue, 8 Feb 2022 11:14:12 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
Message-ID: <B2yqrgl475eIZaMhJdswFpyeg86i-AuAIfE8NkgslMA=.0bb89daa-b736-4267-a534-fbcf89754c24@github.com>

On Tue, 8 Feb 2022 10:09:48 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Please review this change to NonblockingQueue to improve invariants in the
>> append operation by making a change in try_pop.
>> 
>> When taking the last entry in the queue, try_pop needs to do some cleanup of
>> the queue fields, setting them to NULL.  The order of those cleanups doesn't
>> matter for correctness.  However, setting first _head then _tail permits
>> append to assert that _head is NULL when it finds _tail was NULL.  The current
>> order (set _tail first, then _head) doesn't permit such an assertion.
>> 
>> Testing:
>> mach5 tier1-3
>> 
>> I also did lots of testing with this change included while investigating
>> JDK-8273383.
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge branch 'master' into append-invariant
>  - minor comment fixes
>  - append invariant

Lgtm!

-------------

Marked as reviewed by iwalulya (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7250

From sgehwolf at redhat.com  Tue Feb  8 11:32:07 2022
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 08 Feb 2022 12:32:07 +0100
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
 <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
Message-ID: <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>

On Mon, 2022-02-07 at 22:29 -0800, Ioi Lam wrote:
> On 2022/02/07 10:36, Severin Gehwolf wrote:
> > On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
> > > Case (4) is the cause for the bug in JDK-8279484
> > > 
> > > Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2.
> > > This means:
> > > 
> > > - This container is guaranteed a minimum amount of CPU resources
> > > - If no other containers are executing, this container can use as
> > >     much CPU as available on the host
> > > - If other containers are executing, the amount of CPU available
> > >     to this container is (2 / (sum of cpu.shares of all active
> > >     containers))
> > > 
> > > 
> > > The fundamental problem with the current JVM implementation is that it
> > > treats "CPU request" as a maximum value, the opposite of what Kubernetes
> > > does. Because of this, in case (4), the JVM artificially limits itself
> > > to a single CPU. This leads to CPU underutilization.
> > I agree with your analysis. Key point is that in such a setup
> > Kubernetes sets CPU shares value to 2. Though, it's a very specific
> > case.
> > 
> > In contrast to Kubernetes the JVM doesn't have insight into what other
> > containers are doing (or how they are configured). It would, perhaps,
> > be good to know what Kubernetes does for containers when the
> > environment (i.e. other containers) changes. Do they get restarted?
> > Restarted with different values for cpu shares?
> 
> My understanding is that Kubernetes will try to do load balancing and
> may migrate the containers. According to this:
> 
> https://stackoverflow.com/questions/64891872/kubernetes-dynamic-configurationn-of-cpu-resource-limit
> 
> If you change the CPU limits, a currently running container will be shut 
> down and restarted (using the new limit), and may be relocated to a 
> different host if necessary.
> 
> I think this means that a JVM process doesn't need to worry about the
> CPU limit changing during its lifetime :-)

> > Either way, what are our options to fix this? Does it need fixing?
> > 
> >   * Should we no longer take cpu shares as a means to limit CPU into
> >     account? It would be a significant change to how previous JDKs
> >     worked. Maybe that wouldn't be such a bad idea :)
> 
> I think we should get rid of it. This feature was designed to work with 
> Kubernetes, but has no effect in most cases. The only time it takes 
> effect (when no resource limits are set) it does the opposite of what
> the user expects.

I tend to agree. We should start with a CSR review of this, though, as
it would be a behavioural change as compared to previous versions of
the JDK.

> Also, the current implementation is really tied to specific behaviors of 
> Kubernetes + docker (the 1024 and 100 constants). This will cause 
> problems with other container/orchestration software that use different 
> algorithms and constants.

There are other container orchestration frameworks, like Mesos, which
behave in a similar way (the 1024 constant is used). The good news is
that Mesos seems to have moved to a hard-limit default. See:

https://mesosphere.github.io/field-notes/faqs/utilization.html
https://mesos.apache.org/documentation/latest/quota/#deprecated-quota-guarantees

> 
> >   * How likely is CPU underutilization to happen in practice?
> >     Considering the container is not the only container on the node,
> >     then according to your formula, it'll get one CPU or less anyway.
> >     Underutilization would, thus, only happen when it's an idle node
> >     with no other containers running. That would suggest doing nothing
> >     and let the user override it as they see fit.
> 
> I think underutilization happens when the containers have a bursty
> usage pattern. If other containers do not fully utilize their CPU 
> quotas, we should distribute the unused CPUs to the busy containers.

Right, but this isn't really something the JVM process should care
about. It's really a core feature of the orchestration framework to do
that. All we could do is to not limit CPU for those cases. On the other
hand, there is the risk of resource starvation too. Consider a node with
many cores, 50 say, and a very small cpu share setting via container
limits. The experience running a JVM application in such a setup would
be very mediocre, as the JVM thinks it can use 50 cores (100% of the
time), yet it would only get this when the rest of the
containers/universe is idle.

Thanks,
Severin
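
For readers following the thread, the behaviour under discussion can be
sketched roughly as below. This is a simplified illustration and not the
HotSpot code: the function name and the constant name are hypothetical,
the rounding rule (ceiling of shares / 1024) is an assumption about how
the shares value is turned into a CPU count, and quota handling is
ignored completely.

```cpp
#include <algorithm>

// Assumed number of cpu.shares that stands for one full CPU.
constexpr int kPerCpuShares = 1024;

// Hypothetical helper: derive a CPU count from cpu.shares, treating the
// shares value as an upper bound (the behaviour questioned in this thread).
// A non-positive value is treated as "no shares set" in this sketch.
int cpus_from_shares(int shares, int host_cpus) {
  if (shares <= 0) {
    return host_cpus;
  }
  // Round up so that any non-zero shares value yields at least one CPU.
  int from_shares = (shares + kPerCpuShares - 1) / kPerCpuShares;
  return std::min(from_shares, host_cpus);
}
```

With cpu.shares set to 2 on a 50-core node, this sketch yields a single
CPU, which is the underutilization case described in JDK-8279484.
Ignoring shares altogether would yield 50 on the same node, which is the
starvation concern raised above.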


From iwalulya at openjdk.java.net  Tue Feb  8 12:12:06 2022
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Tue, 8 Feb 2022 12:12:06 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
Message-ID: <edoJtPIsbq9mkPATxGIVcGNFvvN2tjmbv7Xnz6vUPIk=.800b3880-2c44-412e-8e15-1a7384a0ce4c@github.com>

On Tue, 8 Feb 2022 10:09:48 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Please review this change to NonblockingQueue to improve invariants in the
>> append operation by making a change in try_pop.
>> 
>> When taking the last entry in the queue, try_pop needs to do some cleanup of
>> the queue fields, setting them to NULL.  The order of those cleanups doesn't
>> matter for correctness.  However, setting first _head then _tail permits
>> append to assert that _head is NULL when it finds _tail was NULL.  The current
>> order (set _tail first, then _head) doesn't permit such an assertion.
>> 
>> Testing:
>> mach5 tier1-3
>> 
>> I also did lots of testing with this change included while investigating
>> JDK-8273383.
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge branch 'master' into append-invariant
>  - minor comment fixes
>  - append invariant

Not part of this PR, but we need to add a comment about `push/append` being susceptible to ABA behavior as discovered in JDK-8273383.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7250

From duke at openjdk.java.net  Tue Feb  8 12:32:08 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Tue, 8 Feb 2022 12:32:08 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v4]
In-Reply-To: <qKekSNhm3jZ-DYIXCF0AmNtv2uiLAJhrVaCCbmkyYqU=.27f1344d-dae2-4afd-bf8a-8ff025af0816@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <qKekSNhm3jZ-DYIXCF0AmNtv2uiLAJhrVaCCbmkyYqU=.27f1344d-dae2-4afd-bf8a-8ff025af0816@github.com>
Message-ID: <-sdR46GHtobc9KyphiiBn6TLfrmR4dDDpXwpaUsmuJg=.c6baf0ee-a06d-4fcb-a75a-349dc5646e18@github.com>

On Tue, 8 Feb 2022 07:18:52 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue; I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - Indents
>  - Drop the test group definition
>  - Update copyrights

src/hotspot/share/runtime/stackOverflow.hpp line 121:

> 119:   //  |  shadow zone
> 120:   //  |
> 121:   //  --                                          ---   <--  shadow_zone_growth_watermark()

Hi, should the `watermark` be somewhere below (regarding address) the last frame instead?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From kbarrett at openjdk.java.net  Tue Feb  8 12:58:06 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 12:58:06 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <edoJtPIsbq9mkPATxGIVcGNFvvN2tjmbv7Xnz6vUPIk=.800b3880-2c44-412e-8e15-1a7384a0ce4c@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
 <edoJtPIsbq9mkPATxGIVcGNFvvN2tjmbv7Xnz6vUPIk=.800b3880-2c44-412e-8e15-1a7384a0ce4c@github.com>
Message-ID: <0TXYYreM8RybLJ_4l__SPdOJLEeNAb9EJ_VSlDzTXRo=.5381dd81-a019-4806-8693-0572a9ee8b98@github.com>

On Tue, 8 Feb 2022 12:08:51 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

> Not part of this PR, but we need to add a comment about `push/append` being susceptible to ABA behavior as discovered in JDK-8273383.

See https://bugs.openjdk.java.net/browse/JDK-8280832

-------------

PR: https://git.openjdk.java.net/jdk/pull/7250

From tschatzl at openjdk.java.net  Tue Feb  8 13:16:04 2022
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 8 Feb 2022 13:16:04 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
Message-ID: <syociFHnPiccdyDvtZWHXRaDxXlUmvEajMFrk-VAqvU=.9a5d3839-b9b6-424f-b1f8-087fd628a614@github.com>

On Tue, 8 Feb 2022 10:09:48 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Please review this change to NonblockingQueue to improve invariants in the
>> append operation by making a change in try_pop.
>> 
>> When taking the last entry in the queue, try_pop needs to do some cleanup of
>> the queue fields, setting them to NULL.  The order of those cleanups doesn't
>> matter for correctness.  However, setting first _head then _tail permits
>> append to assert that _head is NULL when it finds _tail was NULL.  The current
>> order (set _tail first, then _head) doesn't permit such an assertion.
>> 
>> Testing:
>> mach5 tier1-3
>> 
>> I also did lots of testing with this change included while investigating
>> JDK-8273383.
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge branch 'master' into append-invariant
>  - minor comment fixes
>  - append invariant

Lgtm.

src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 200:

> 198:     // cmpxchg indicates a concurrent operation updated _head first.  That
> 199:     // could be either a push/append or a try_pop in [Clause 1b].
> 200:     Atomic::cmpxchg(&_head, result, (T*)NULL);

These `NULL`s could be replaced by `nullptr`.
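
For readers following the thread, the ordering argument from the quoted description can be illustrated with a toy. This is a minimal sketch only: `ToyQueue` and `Node` are hypothetical names, it uses `std::atomic` rather than HotSpot's `Atomic` class, and it is meant to be exercised single-threaded. It makes no claim to be a correct lock-free queue or to match the real `NonblockingQueue` code. It only shows why clearing `_head` before `_tail` in `try_pop` lets `append` assert that `_head` is null whenever it finds `_tail` was null.

```cpp
#include <atomic>
#include <cassert>

struct Node {
  Node* next = nullptr;
};

// Hypothetical, heavily simplified queue used only to show the cleanup order.
class ToyQueue {
  std::atomic<Node*> _head{nullptr};
  std::atomic<Node*> _tail{nullptr};

public:
  void append(Node* n) {
    n->next = nullptr;
    Node* old_tail = _tail.exchange(n);
    if (old_tail == nullptr) {
      // try_pop clears _head before _tail, so a null _tail implies that
      // _head has already been cleared.
      assert(_head.load() == nullptr);
      _head.store(n);
    } else {
      old_tail->next = n;
    }
  }

  Node* try_pop() {
    Node* result = _head.load();
    if (result == nullptr) {
      return nullptr;  // queue is empty
    }
    if (result->next != nullptr) {
      _head.store(result->next);  // more entries remain
    } else {
      // Taking the last entry: clear _head first, then _tail.
      Node* expected = result;
      _head.compare_exchange_strong(expected, nullptr);
      expected = result;
      _tail.compare_exchange_strong(expected, nullptr);
    }
    result->next = nullptr;
    return result;
  }

  bool empty() const { return _head.load() == nullptr; }
};
```

With the opposite cleanup order (`_tail` first, then `_head`), the assertion in `append` could not be stated, because `_tail` could be observed as null while `_head` still held the popped node.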

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7250

From hseigel at openjdk.java.net  Tue Feb  8 13:40:29 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 8 Feb 2022 13:40:29 GMT
Subject: RFR: 8281400: Remove unused wcslen() function [v2]
In-Reply-To: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
Message-ID: <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>

> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  globalDefinitions_xlc.hpp change

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7374/files
  - new: https://git.openjdk.java.net/jdk/pull/7374/files/82015268..4c3375ce

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7374&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7374&range=00-01

  Stats: 5 lines in 1 file changed: 0 ins; 4 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7374.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7374/head:pull/7374

PR: https://git.openjdk.java.net/jdk/pull/7374

From duke at openjdk.java.net  Tue Feb  8 15:40:44 2022
From: duke at openjdk.java.net (Bhavana-Kilambi)
Date: Tue, 8 Feb 2022 15:40:44 GMT
Subject: RFR: 8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1 &
 N2
Message-ID: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>

As Arm Neoverse V1 and N2 will benefit from the same optimizations as Neoverse N1, they should have the OnSpinWaitInst/OnSpinWaitInstCount defaults set to "isb"/1 and the UseSIMDForMemoryOps default set to true.
This patch sets these flags accordingly for both V1 and N2 architectures.
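
The selection described above could be sketched roughly as follows. This is an illustration, not the patch itself: `FlagDefaults` and `defaults_for` are hypothetical names, the part numbers are the Arm MIDR part numbers for these cores, and the fallback values for other CPUs are placeholders assumed for the example.

```cpp
#include <cstdint>
#include <string>

// Arm MIDR part numbers for the cores named above.
constexpr std::uint32_t kNeoverseN1 = 0xd0c;
constexpr std::uint32_t kNeoverseV1 = 0xd40;
constexpr std::uint32_t kNeoverseN2 = 0xd49;

// Hypothetical container for the three defaults discussed in this thread.
struct FlagDefaults {
  std::string on_spin_wait_inst;
  unsigned on_spin_wait_inst_count;
  bool use_simd_for_memory_ops;
};

// Returns the defaults this sketch would pick for a given part number.
FlagDefaults defaults_for(std::uint32_t part) {
  if (part == kNeoverseN1 || part == kNeoverseV1 || part == kNeoverseN2) {
    return {"isb", 1, true};  // same treatment for N1, V1 and N2
  }
  return {"none", 1, false};  // placeholder fallback, assumed for illustration
}
```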

-------------

Commit messages:
 - 8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1 & N2

Changes: https://git.openjdk.java.net/jdk/pull/7383/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7383&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280007
  Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7383.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7383/head:pull/7383

PR: https://git.openjdk.java.net/jdk/pull/7383

From lucy at openjdk.java.net  Tue Feb  8 15:55:09 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Tue, 8 Feb 2022 15:55:09 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <5Sl9zEnZBmbo31zfsGuZxLhF_fZtEmWMxxCdJVK3vo8=.ab417d6c-66a3-4712-b807-4992ed17805b@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Changes look good. 
As you mentioned, jtreg tests are green.

-------------

Marked as reviewed by lucy (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7312

From hseigel at openjdk.java.net  Tue Feb  8 16:04:13 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 8 Feb 2022 16:04:13 GMT
Subject: Integrated: 8281400: Remove unused wcslen() function
In-Reply-To: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
Message-ID: <kWqLw_cZ2MrsXJhr0qBGIrx7zgL7XrPXnLDC0jFSo3U=.c27f6103-58fb-4073-95a7-ca919bd2f176@github.com>

On Mon, 7 Feb 2022 21:30:32 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
> 
> Thanks, Harold

This pull request has now been integrated.

Changeset: 380378c5
Author:    Harold Seigel <hseigel at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/380378c551b4243ef72d868571f725b390e12124
Stats:     11 lines in 2 files changed: 0 ins; 9 del; 2 mod

8281400: Remove unused wcslen() function

Reviewed-by: dcubed, coleenp, lfoltan

-------------

PR: https://git.openjdk.java.net/jdk/pull/7374

From coleenp at openjdk.java.net  Tue Feb  8 16:04:13 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 8 Feb 2022 16:04:13 GMT
Subject: RFR: 8281400: Remove unused wcslen() function [v2]
In-Reply-To: <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
 <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
Message-ID: <QgvDEuvXlQ9cxPkw1RDw1QSMJHk7nC3y3cXQykGx--Q=.420d13c0-771c-48eb-b88f-e366bad42637@github.com>

On Tue, 8 Feb 2022 13:40:29 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
>> 
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
> 
>   globalDefinitions_xlc.hpp change

Looks good + trivial.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7374

From lfoltan at openjdk.java.net  Tue Feb  8 16:04:13 2022
From: lfoltan at openjdk.java.net (Lois Foltan)
Date: Tue, 8 Feb 2022 16:04:13 GMT
Subject: RFR: 8281400: Remove unused wcslen() function [v2]
In-Reply-To: <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
 <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
Message-ID: <80UI0jRH6YMI0QXv-1HR2H6FUk7WPbk1VephvEGmXa4=.0da23c18-9600-4214-a6a9-7c1f35ff1bfa@github.com>

On Tue, 8 Feb 2022 13:40:29 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
>> 
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
> 
>   globalDefinitions_xlc.hpp change

Looks good.
Lois

-------------

Marked as reviewed by lfoltan (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7374

From hseigel at openjdk.java.net  Tue Feb  8 16:04:13 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 8 Feb 2022 16:04:13 GMT
Subject: RFR: 8281400: Remove unused wcslen() function [v2]
In-Reply-To: <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
References: <qNarJOx9SGd6P1BHY3WpQWpYw4oaF6P0vyBgsGCmi6A=.4976048d-7e75-4c8e-9a37-d06d27f59bce@github.com>
 <W35K8lx-7l9xrpoHX_vxvKS2iMWeC0bCtx_ketnjifg=.4de15a2a-08b3-48df-8cd3-6158ddda47dd@github.com>
Message-ID: <9IjrQDJk7oz5LhM9uSTRo3TLRJ9Bs5s_gnW2zG5UXb4=.378d4b4f-2669-4ecd-a3d1-38879769be2d@github.com>

On Tue, 8 Feb 2022 13:40:29 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Please review this small change to remove the unused wcslen() function.  This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows.
>> 
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
> 
>   globalDefinitions_xlc.hpp change

Thanks Dan, Coleen, and Lois for the reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7374

From duke at openjdk.java.net  Tue Feb  8 16:28:02 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Tue, 8 Feb 2022 16:28:02 GMT
Subject: RFR: 8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1
 & N2
In-Reply-To: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
References: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
Message-ID: <GbaotoJ4J23pjiGNfuGJSCloAE0BNAA6l4of2fLejEk=.1da9f293-2b67-45e7-9315-38901d7c9312@github.com>

On Tue, 8 Feb 2022 15:33:20 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

> As Arm Neoverse V1 and N2 will benefit from the same optimizations as Neoverse N1, they should have the OnSpinWaitInst/OnSpinWaitInstCount defaults set to "isb"/1 and the UseSIMDForMemoryOps default set to true.
> This patch sets these flags accordingly for both V1 and N2 architectures.

Lgtm

-------------

Marked as reviewed by eastig at github.com (no known OpenJDK username).

PR: https://git.openjdk.java.net/jdk/pull/7383

From phh at openjdk.java.net  Tue Feb  8 16:53:09 2022
From: phh at openjdk.java.net (Paul Hohensee)
Date: Tue, 8 Feb 2022 16:53:09 GMT
Subject: RFR: 8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1
 & N2
In-Reply-To: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
References: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
Message-ID: <ZyV-FaGsQeKk9jJzk1xOhjov0mscMujHxdEnb_miIkI=.4a48793e-cbdc-48a9-a60e-5e9a758cdb4c@github.com>

On Tue, 8 Feb 2022 15:33:20 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

> As Arm Neoverse V1 and N2 will benefit from the same optimizations as Neoverse N1, they should have the OnSpinWaitInst/OnSpinWaitInstCount defaults set to "isb"/1 and the UseSIMDForMemoryOps default set to true.
> This patch sets these flags accordingly for both V1 and N2 architectures.

Lgtm.

-------------

Marked as reviewed by phh (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7383

From rrich at openjdk.java.net  Tue Feb  8 16:57:06 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Tue, 8 Feb 2022 16:57:06 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <F-neShcWz85TkSD-I8sw1JwI2ynR0P0GmaPz2pq0dWI=.4596f1b6-4cba-48b0-995c-5b63ca1ef019@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Hi Martin,

changes look reasonable. `thread_linux_s390.cpp` needs copyright header update.

Cheers, Richard.

-------------

Marked as reviewed by rrich (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7312

From mdoerr at openjdk.java.net  Tue Feb  8 17:09:51 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 8 Feb 2022 17:09:51 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v4]
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <qXRjizCBHdvFjsy9MEO0PGT_RhmOhwdX3VXpTaU2h4E=.825a3d66-5604-4154-b463-dd528cb42c8b@github.com>

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:

  Update Copyright years.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7312/files
  - new: https://git.openjdk.java.net/jdk/pull/7312/files/6d9446a8..31f5aa6a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7312&range=02-03

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7312.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7312/head:pull/7312

PR: https://git.openjdk.java.net/jdk/pull/7312

From mdoerr at openjdk.java.net  Tue Feb  8 17:09:53 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 8 Feb 2022 17:09:53 GMT
Subject: RFR: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames [v3]
In-Reply-To: <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
 <AV6Mn_Al92Y5fHtOrlk73_tfPpL9RLjYmNgRFvchPHM=.e7102cd8-cc22-4989-b2e3-b6c51e952302@github.com>
Message-ID: <L_oUPM0k56rQYG_0r-65pjicnpSdp45QA31UMnPj5Qw=.8fa299b9-4796-4547-b7b4-84ac70239d71@github.com>

On Fri, 4 Feb 2022 15:45:48 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix sender_sp.

Copyright updated. Thank you for the reviews!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From shade at openjdk.java.net  Tue Feb  8 17:24:41 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Feb 2022 17:24:41 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>

> This is an old issue; I submitted the first RFE about this back in 2015. It shows up every time I benchmark interpreter-only code. Most recently, it showed up in my work to get the `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why the `native_call` argument persists, even though it is not used in the x86 case anymore. There is also a new test group that I found useful when debugging on Windows; that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter code could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`
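
The watermark idea in the description above can be modelled with a small toy. This is a minimal sketch under stated assumptions, not the code in the PR: `ToyStack`, `on_method_entry` and the page count are hypothetical, addresses are plain integers, the stack grows toward lower addresses, and "banging" is reduced to a counter. It only shows how a growth watermark lets a method entry skip the banging when an earlier, deeper entry already covered the shadow zone.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of one thread's stack. Not HotSpot code.
struct ToyStack {
  static constexpr std::size_t kShadowPages = 20;  // assumed shadow zone size, in pages

  std::uintptr_t growth_watermark;  // lowest SP whose shadow zone was already banged
  std::uintptr_t safe_limit;        // at or below this SP the watermark is not advanced
  std::size_t pages_touched = 0;    // total number of simulated page touches

  ToyStack(std::uintptr_t stack_base, std::uintptr_t limit)
      : growth_watermark(stack_base), safe_limit(limit) {}

  // Called on every simulated method entry with the current stack pointer.
  // Returns true when the shadow zone had to be banged.
  bool on_method_entry(std::uintptr_t sp) {
    if (sp > growth_watermark) {
      return false;  // an earlier, deeper entry already banged below this SP
    }
    pages_touched += kShadowPages;  // stand-in for touching each shadow page
    if (sp > safe_limit) {
      growth_watermark = sp;  // remember the deepest SP that is fully covered
    }
    return true;
  }
};
```

In this model, a thread that repeatedly enters methods at or above a depth it has already reached pays for the shadow zone once, instead of on every entry.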

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Show watermark in better place on the chart

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7247/files
  - new: https://git.openjdk.java.net/jdk/pull/7247/files/ffd560ab..13073992

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7247&range=03-04

  Stats: 12 lines in 1 file changed: 10 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7247.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7247/head:pull/7247

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Tue Feb  8 17:24:44 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Feb 2022 17:24:44 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v4]
In-Reply-To: <-sdR46GHtobc9KyphiiBn6TLfrmR4dDDpXwpaUsmuJg=.c6baf0ee-a06d-4fcb-a75a-349dc5646e18@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <qKekSNhm3jZ-DYIXCF0AmNtv2uiLAJhrVaCCbmkyYqU=.27f1344d-dae2-4afd-bf8a-8ff025af0816@github.com>
 <-sdR46GHtobc9KyphiiBn6TLfrmR4dDDpXwpaUsmuJg=.c6baf0ee-a06d-4fcb-a75a-349dc5646e18@github.com>
Message-ID: <0hYkIKMjil2vQcKwZqXCMmkYTcCrElpFswabP3fuDzI=.bc5bf151-1b7f-4943-82fa-8a48dd390c5e@github.com>

On Tue, 8 Feb 2022 12:28:34 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with three additional commits since the last revision:
>> 
>>  - Indents
>>  - Drop the test group definition
>>  - Update copyrights
>
> src/hotspot/share/runtime/stackOverflow.hpp line 121:
> 
>> 119:   //  |  shadow zone
>> 120:   //  |
>> 121:   //  --                                          ---   <--  shadow_zone_growth_watermark()
> 
> Hi, should the `watermark` be somewhere below (regarding address) the last frame instead?

Good point, fixed in new commit.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From mdoerr at openjdk.java.net  Tue Feb  8 17:52:09 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Tue, 8 Feb 2022 17:52:09 GMT
Subject: Integrated: 8281061: [s390] JFR runs into assertions while validating
 interpreter frames
In-Reply-To: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
References: <q-6e5jyelfMy8P-6zeg4VKGxqWWtZx40Y6yzJ0nJSjc=.7d8afb4a-428b-40c0-8a6b-72d963a39ca6@github.com>
Message-ID: <In_zMnDuei2qpgfsUUZkZFP3RVXuc_uxSchKZL4RqDg=.adf14970-d63e-4efe-83a0-8ce4366ed0c0@github.com>

On Tue, 1 Feb 2022 17:22:57 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> s390 implementation requires small changes to avoid running into assertions in debug builds. See JBS for details.

This pull request has now been integrated.

Changeset: 7f19c700
Author:    Martin Doerr <mdoerr at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/7f19c700707573000a37910dd6d2f2bb6e8439ad
Stats:     18 lines in 2 files changed: 2 ins; 3 del; 13 mod

8281061: [s390] JFR runs into assertions while validating interpreter frames

Reviewed-by: lucy, rrich

-------------

PR: https://git.openjdk.java.net/jdk/pull/7312

From xliu at openjdk.java.net  Tue Feb  8 17:56:10 2022
From: xliu at openjdk.java.net (Xin Liu)
Date: Tue, 8 Feb 2022 17:56:10 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <bb_HhDkcCGFWlL9cuKEV9xjw5TicbNlugUgV0l1daO0=.8706c618-ac27-4aa4-bd74-ab6988cfa81a@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` shows huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

LGTM! I am not a reviewer, so we still need reviewers to approve this.

-------------

Marked as reviewed by xliu (Committer).

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Tue Feb  8 18:25:44 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 8 Feb 2022 18:25:44 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and CodeEntryAlignment
Message-ID: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>

I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.

Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.

The default values for options are different per platform, so tests are x86_64 specific.

No default value is changed, this only unblocks experiments.

Additional testing:
 - [x] New tests on Linux x86_64 fastdebug
 - [x] New tests on Linux x86_64 release
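
The ordering between the two options can be sketched as a small check. This is an illustration in Java only, not the HotSpot code: the class and method names are hypothetical, the power-of-two requirement is an assumption of the sketch, and `loopHeadAligned` models the copy problem in the simplest possible way (a loop head at an aligned offset is only aligned in memory if the code base is aligned at least as strictly).

```java
// Hypothetical sketch; none of these names exist in HotSpot.
final class AlignmentConstraintSketch {
    static boolean isPowerOfTwo(long value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    // Accept the pair only if the loop alignment does not exceed the
    // code entry alignment. Power-of-two values are assumed here.
    static boolean isAcceptable(long optoLoopAlignment, long codeEntryAlignment) {
        return isPowerOfTwo(optoLoopAlignment)
            && isPowerOfTwo(codeEntryAlignment)
            && optoLoopAlignment <= codeEntryAlignment;
    }

    // True if a loop head at 'loopOffset' from 'codeBase' lands on a
    // 'loopAlignment' boundary in memory.
    static boolean loopHeadAligned(long codeBase, long loopOffset, long loopAlignment) {
        return ((codeBase + loopOffset) & (loopAlignment - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable(16, 32));             // loop <= entry
        System.out.println(isAcceptable(64, 32));             // loop > entry
        // Base aligned to 32 only: a 64-aligned offset is not enough.
        System.out.println(loopHeadAligned(0x1020L, 64, 64));
        // Base aligned to 64: the same offset stays aligned.
        System.out.println(loopHeadAligned(0x1040L, 64, 64));
    }
}
```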

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7388/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7388&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281467
  Stats: 178 lines in 4 files changed: 176 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7388.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7388/head:pull/7388

PR: https://git.openjdk.java.net/jdk/pull/7388

From hseigel at openjdk.java.net  Tue Feb  8 18:42:08 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 8 Feb 2022 18:42:08 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <fPNQA8wA48h1_aQm_llWCYTF5gXOCih34TG0CV8RXPQ=.c074b674-eee9-4b23-841e-31009f2c266f@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

src/hotspot/share/runtime/globals.hpp line 1539:

> 1537:           range(1, 128)                                                     \
> 1538:           constraint(OptoLoopAlignmentConstraintFunc, AfterErgo)            \
> 1539:                                                                             \

Should OptoLoopAlignment be an int, instead of an intx, since its range is small?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From kbarrett at openjdk.java.net  Tue Feb  8 20:22:10 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 20:22:10 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <edoJtPIsbq9mkPATxGIVcGNFvvN2tjmbv7Xnz6vUPIk=.800b3880-2c44-412e-8e15-1a7384a0ce4c@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
 <edoJtPIsbq9mkPATxGIVcGNFvvN2tjmbv7Xnz6vUPIk=.800b3880-2c44-412e-8e15-1a7384a0ce4c@github.com>
Message-ID: <dmg6p6c4GyI0931AxY0z90pXN4Ig_If06Ln62CsjnAw=.b4ef880d-03b6-4a67-b83d-d93d69806505@github.com>

On Tue, 8 Feb 2022 12:08:51 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - Merge branch 'master' into append-invariant
>>  - minor comment fixes
>>  - append invariant
>
> Not part of this PR, but we need to add a comment about `push/append` being susceptible to ABA behavior as discovered in JDK-8273383.

Thanks @walulyai and @tschatzl for reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7250

From kbarrett at openjdk.java.net  Tue Feb  8 20:22:11 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 20:22:11 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v2]
In-Reply-To: <syociFHnPiccdyDvtZWHXRaDxXlUmvEajMFrk-VAqvU=.9a5d3839-b9b6-424f-b1f8-087fd628a614@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
 <MLbrniA9X2qBQ7L4_nKL3okk6WtHg63vKQpw_LlCwRE=.c9b79a12-cd59-462e-bb45-e8c7570e14bd@github.com>
 <syociFHnPiccdyDvtZWHXRaDxXlUmvEajMFrk-VAqvU=.9a5d3839-b9b6-424f-b1f8-087fd628a614@github.com>
Message-ID: <kYaBOTfMAwTTrBDyd5YVIw5fXO9LbDl4pQRV_XyEx2k=.3b5a875d-e88d-427d-aee2-1f7144c562d5@github.com>

On Tue, 8 Feb 2022 13:12:40 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

>> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - Merge branch 'master' into append-invariant
>>  - minor comment fixes
>>  - append invariant
>
> src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 200:
> 
>> 198:     // cmpxchg indicates a concurrent operation updated _head first.  That
>> 199:     // could be either a push/append or a try_pop in [Clause 1b].
>> 200:     Atomic::cmpxchg(&_head, result, (T*)NULL);
> 
> These `NULL`s could be replaced by `nullptr`.

I'm planning to do that, since this code has been going through a lot of recent churn anyway.  But I didn't want to mix that cleanup with functional changes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7250

From kbarrett at openjdk.java.net  Tue Feb  8 20:32:45 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 20:32:45 GMT
Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append [v3]
In-Reply-To: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
Message-ID: <E7vPVebTyoP7M0rjIkBm3q_3tK-F8sERIEIpv0319uE=.e76ddd75-780b-411c-a6d9-7307bda8a826@github.com>

> Please review this change to NonblockingQueue to improve invariants in the
> append operation by making a change in try_pop.
> 
> When taking the last entry in the queue, try_pop needs to do some cleanup of
> the queue fields, setting them to NULL.  The order of those cleanups doesn't
> matter for correctness.  However, setting first _head then _tail permits
> append to assert that _head is NULL when it finds _tail was NULL.  The current
> order (set _tail first, then _head) doesn't permit such an assertion.
> 
> Testing:
> mach5 tier1-3
> 
> I also did lots of testing with this change included while investigating
> JDK-8273383.

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:

 - Merge branch 'master' into append-invariant
 - Merge branch 'master' into append-invariant
 - minor comment fixes
 - append invariant

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7250/files
  - new: https://git.openjdk.java.net/jdk/pull/7250/files/9648d183..89b8f300

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7250&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7250&range=01-02

  Stats: 379 lines in 11 files changed: 325 ins; 30 del; 24 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7250.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7250/head:pull/7250

PR: https://git.openjdk.java.net/jdk/pull/7250

From kbarrett at openjdk.java.net  Tue Feb  8 20:32:47 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 20:32:47 GMT
Subject: Integrated: 8280828: Improve invariants in NonblockingQueue::append
In-Reply-To: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
References: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com>
Message-ID: <rGRcpO-Y9RVcEsXFb-NZYl7O52jktpFjv-jTDQwTR2k=.3ed164db-9588-4858-a7e6-736d96d8a6bc@github.com>

On Thu, 27 Jan 2022 20:34:29 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to NonblockingQueue to improve invariants in the
> append operation by making a change in try_pop.
> 
> When taking the last entry in the queue, try_pop needs to do some cleanup of
> the queue fields, setting them to NULL.  The order of those cleanups doesn't
> matter for correctness.  However, setting first _head then _tail permits
> append to assert that _head is NULL when it finds _tail was NULL.  The current
> order (set _tail first, then _head) doesn't permit such an assertion.
> 
> Testing:
> mach5 tier1-3
> 
> I also did lots of testing with this change included while investigating
> JDK-8273383.

This pull request has now been integrated.

Changeset: d658d945
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/d658d945cf57bab8e61302841dcb56b36e48eff3
Stats:     45 lines in 1 file changed: 19 ins; 6 del; 20 mod

8280828: Improve invariants in NonblockingQueue::append

Reviewed-by: iwalulya, tschatzl

-------------

PR: https://git.openjdk.java.net/jdk/pull/7250

From kbarrett at openjdk.java.net  Tue Feb  8 23:05:27 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 23:05:27 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue
Message-ID: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>

Please review this update of the usage and implementation comments for
NonblockingQueue to discuss the ABA issue in push/append operations.

-------------

Commit messages:
 - document ABA for push/append

Changes: https://git.openjdk.java.net/jdk/pull/7393/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280832
  Stats: 7 lines in 2 files changed: 5 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7393.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7393/head:pull/7393

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Tue Feb  8 23:33:26 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 8 Feb 2022 23:33:26 GMT
Subject: RFR: 8280830: Change NonblockingQueue::try_pop variable named "result"
Message-ID: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>

Please review this trivial change to rename a variable in
NonblockingQueue::try_pop.  The variable named "result" is being renamed to
"old_head", as the old name was found to be confusing by some people, making
the code harder to read.

Testing:
local build.

-------------

Commit messages:
 - change variable name

Changes: https://git.openjdk.java.net/jdk/pull/7394/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7394&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280830
  Stats: 24 lines in 1 file changed: 0 ins; 0 del; 24 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7394.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7394/head:pull/7394

PR: https://git.openjdk.java.net/jdk/pull/7394

From dholmes at openjdk.java.net  Wed Feb  9 00:37:10 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 9 Feb 2022 00:37:10 GMT
Subject: RFR: 8280830: Change NonblockingQueue::try_pop variable named
 "result"
In-Reply-To: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>
References: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>
Message-ID: <UEnKdG_wfgmaazjIqusB_isDM2iGBy1r397giORwgzs=.8bf62f17-2207-4af3-87f0-6f6a45e6dfe6@github.com>

On Tue, 8 Feb 2022 23:27:41 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this trivial change to rename a variable in
> NonblockingQueue::try_pop.  The variable named "result" is being renamed to
> "old_head", as the old name was found to be confusing by some people, making
> the code harder to read.
> 
> Testing:
> local build.

Looks good and trivial.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7394

From dholmes at openjdk.java.net  Wed Feb  9 00:43:03 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 9 Feb 2022 00:43:03 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <ASoWE73-tR1YogRLU3COO_OeqOklC8kWeX2ZR5570As=.051ad89d-83c5-47e5-b820-76ea1d455dec@github.com>

On Tue, 8 Feb 2022 22:58:26 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 122:

> 120:     // try_pop could take old_tail before our update, it gets recycled and
> 121:     // re-added to the end, and then we successfully cmpxchg, rendering the
> 122:     // list in _tail circular.

Doesn't this contradict the "We won any races with try_pop ... so we're done"!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From Divino.Cesar at microsoft.com  Wed Feb  9 01:45:57 2022
From: Divino.Cesar at microsoft.com (Cesar Soares Lucas)
Date: Wed, 9 Feb 2022 01:45:57 +0000
Subject: RFC : Approach to handle Allocation Merges in C2 Scalar Replacement
Message-ID: <BY5PR21MB1473BE17FE1EDBD3EF223DF99A2E9@BY5PR21MB1473.namprd21.prod.outlook.com>

Hi there again!

Can you please give me feedback on the following approach to at least partially
address [1], the scalar replacement allocation merge issue? 

The problem that I am trying to solve arises when allocations are merged after a
control flow split. The code below shows _one example_ of such a situation. 

public int ex1(boolean cond, int x, int y) {
    Point p = new Point(x, y);
    if (cond)
        p = new Point(y, x);
    // Allocations for p are merged here.
    return p.calc();
}

Assuming the method calls on "p" are inlined then the allocations will not
escape the method. The C2 IR for this method will look like this: 

public int ex1(boolean cond, int first, int second) {
    p0 = Allocate(...);
    ...
    p0.x = first;
    p0.y = second;

    if (cond) {
        p1 = Allocate(...);
        ...
        p1.x = second;
        p1.y = first;
    }

    p = phi(p0, p1)

    return p.x - p.y;
}

However, one of the constraints implemented here [2], specifically the third
one, will prevent the objects from being scalar replaced.

The approach that I'm considering for solving the problem is to replace the Phi
node `p = phi(p0, p1)` with new Phi nodes for each of the fields of the objects
in the original Phi. The IR for `ex1` would look something like this after the
transformation: 

public int ex1(boolean cond, int first, int second) {
    p0 = Allocate(...);
    ...
    p0.x = first;
    p0.y = second;

    if (cond) {
        p1 = Allocate(...);
        ...
        p1.x = second;
        p1.y = first;
    }

    pX = phi(first, second)
    pY = phi(second, first)

    return pX - pY;
}
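
At the Java source level, the intended effect is roughly the following. This is only a hand-written illustration: C2 works on its IR, not on source, and the `Point` class with `calc()` returning `x - y` is an assumption made to match the `p.x - p.y` shown in the IR sketch. The conditional expressions stand in for the per-field Phi nodes.

```java
// Hypothetical illustration; class and method names are not from HotSpot.
final class AllocationMergeSketch {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int calc() { return x - y; }   // assumed body, matching "p.x - p.y"
    }

    // Original shape: two allocations are merged at the join point.
    static int ex1(boolean cond, int x, int y) {
        Point p = new Point(x, y);
        if (cond) {
            p = new Point(y, x);
        }
        return p.calc();
    }

    // Shape after splitting the object Phi into one Phi per field.
    // No Point object is needed any more.
    static int ex1Split(boolean cond, int first, int second) {
        int pX = cond ? second : first;   // pX = phi(first, second)
        int pY = cond ? first : second;   // pY = phi(second, first)
        return pX - pY;
    }

    public static void main(String[] args) {
        System.out.println(ex1(false, 5, 3) + " " + ex1Split(false, 5, 3));
        System.out.println(ex1(true, 5, 3) + " " + ex1Split(true, 5, 3));
    }
}
```

Both methods should agree for every input, which is the property the transformation has to preserve.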

I understand that this transformation might not be applicable for all cases and
that it's not as simple as illustrated above. Also, it seems to me that much of
what I'd have to implement is already implemented in other steps of the Scalar
Replacement pipeline (which is a good thing). To work around these
implementation details I plan to use as much of the existing code as possible.
The algorithm for the transformation would be like this: 

split_phis(phi)
    # If the output of phi escapes, or something uses its identity, etc.,
    # then we can't remove it. The conditions here might possibly be the
    # same as the ones implemented in `PhaseMacroExpand::can_eliminate_allocation`.
    if cant_remove_phi_output(phi)
        return

    # Collect a set of tuples (F, U), where U is a node that uses field F
    # of the object resulting from `phi`.
    fields_used = collect_fields_used_after_phi(phi)

    foreach field in fields_used
        producers = {}

        # Create a list with the last Store to "field" in the
        # scope of each of the Phi input objects.
        foreach o in phi.inputs
            # The function called below might re-use a lot of the code/logic in `PhaseMacroExpand::scalar_replacement`
            producers += last_store_to_o_field(o, field)

        # Create a new phi node whose inputs are the Stores to 'field'
        field_phi = create_new_phi(producers)

        update_consumers(field, field_phi)
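
The core loop of the pseudocode can be tried out on a toy model. The sketch below is Java and purely illustrative: each Phi input is modelled as a map from field name to the name of the value written by its last store, which is a stand-in for the real IR, and every name in it is hypothetical. Returning an empty result when an input has no known store for a field is an assumption of the sketch, not part of the pseudocode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; not HotSpot code.
final class SplitPhiSketch {
    // For each used field, collect the last stored value from every Phi
    // input, in input order. The resulting list plays the role of the
    // inputs of the new per-field Phi.
    static Map<String, List<String>> splitPhi(List<Map<String, String>> phiInputs,
                                              List<String> fieldsUsed) {
        Map<String, List<String>> fieldPhis = new LinkedHashMap<>();
        for (String field : fieldsUsed) {
            List<String> producers = new ArrayList<>();
            for (Map<String, String> input : phiInputs) {
                String lastStore = input.get(field);
                if (lastStore == null) {
                    // No known store on this input: give up (sketch assumption).
                    return new LinkedHashMap<>();
                }
                producers.add(lastStore);
            }
            fieldPhis.put(field, producers);
        }
        return fieldPhis;
    }

    // The ex1 case: p0 = {x: first, y: second}, p1 = {x: second, y: first}.
    static Map<String, List<String>> ex1() {
        Map<String, String> p0 = Map.of("x", "first", "y", "second");
        Map<String, String> p1 = Map.of("x", "second", "y", "first");
        return splitPhi(List.of(p0, p1), List.of("x", "y"));
    }

    public static void main(String[] args) {
        System.out.println(ex1());
    }
}
```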

The implementation that I envisioned would be as a "pre-process" [3] step just
after EA but before the constraint checks in `adjust_scalar_replaceable_state`
[2]. If we agree that the overall Scalar Replacement implementation goes through
the following major phases: 

    1. Identify the Escape Status of objects.
    2. Adjust object Escape and/or Scalar Replacement status based on a set of constraints.
    3. Make call to Split_unique_types [4].
    4. Iterate over object and array allocations.
        4.1 Check if allocation can be eliminated.
        4.2 Perform scalar replacement. Replace uses of object in Safepoints.
        4.3 Process users of CheckCastPP other than Safepoint: AddP, ArrayCopy and CastP2X.

The transformation that I am proposing would change the overall flow to look
like this: 

    1. Identify the Escape Status of objects.
    2. ----> New: "Split phi functions" <----
    3. Adjust object Escape and/or Scalar Replacement status based on a set of constraints.
    4. Make call to Split_unique_types [4].
    5. Iterate over object and array allocations.
        5.1 ----> Moved to split_phis: "Check if allocation can be eliminated" <----
        5.2 Perform scalar replacement. Replace uses of object in Safepoints.
        5.3 Process users of CheckCastPP other than Safepoint: AddP, ArrayCopy and CastP2X.

Please let me know what you think and thank you for taking the time to review
this! 


Regards, 
Cesar 

Notes: 

    [1] I am not sure yet how this approach will play with the case of a merge
        with NULL.

    [2] https://github.com/openjdk/jdk/blob/2f71a6b39ed6bb869b4eb3e81bc1d87f4b3328ff/src/hotspot/share/opto/escape.cpp#L1809

    [3] Another option would be to "patch" the current implementation to be able
        to handle the merges. I am not certain that the "patch" approach would be
        better, however, the "pre-process" approach is certainly much easier to test
        and more readable.

    [4] I cannot say I understand 100% the effects of executing
        split_unique_types(). Would the transformation that I am proposing need to
        be after the call to split_unique_types?

From kbarrett at openjdk.java.net  Wed Feb  9 04:14:11 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 9 Feb 2022 04:14:11 GMT
Subject: RFR: 8280830: Change NonblockingQueue::try_pop variable named
 "result"
In-Reply-To: <UEnKdG_wfgmaazjIqusB_isDM2iGBy1r397giORwgzs=.8bf62f17-2207-4af3-87f0-6f6a45e6dfe6@github.com>
References: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>
 <UEnKdG_wfgmaazjIqusB_isDM2iGBy1r397giORwgzs=.8bf62f17-2207-4af3-87f0-6f6a45e6dfe6@github.com>
Message-ID: <2XNE2sqbmYSMG2SFhrYR-omuAIpbnJZp8-yRCoyNdEA=.f60f5cfd-3481-4d20-944f-f91246ed5d2f@github.com>

On Wed, 9 Feb 2022 00:33:55 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Please review this trivial change to rename a variable in
>> NonblockingQueue::try_pop.  The variable named "result" is being renamed to
>> "old_head", as the old name was found to be confusing by some people, making
>> the code harder to read.
>> 
>> Testing:
>> local build.
>
> Looks good and trivial.
> 
> Thanks,
> David

Thanks @dholmes-ora for reviewing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7394

From kbarrett at openjdk.java.net  Wed Feb  9 04:14:11 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 9 Feb 2022 04:14:11 GMT
Subject: Integrated: 8280830: Change NonblockingQueue::try_pop variable named
 "result"
In-Reply-To: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>
References: <-k-4lPQNjY4Xx-eT0djBmXpv3uwWViUQ82pPO2qTbzA=.1e6c2826-5975-4777-a43c-0a43c8b7bf9b@github.com>
Message-ID: <IvJH2N4FcYZngFafWywZaYbBkAgNAm_Dkxci3hVE3Ek=.8b5ca266-3b24-4ddf-86df-1f9e28e883be@github.com>

On Tue, 8 Feb 2022 23:27:41 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this trivial change to rename a variable in
> NonblockingQueue::try_pop.  The variable named "result" is being renamed to
> "old_head", as the old name was found to be confusing by some people, making
> the code harder to read.
> 
> Testing:
> local build.

This pull request has now been integrated.

Changeset: 13f739d3
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/13f739d330e393f840d134f5327a025957e1f795
Stats:     24 lines in 1 file changed: 0 ins; 0 del; 24 mod

8280830: Change NonblockingQueue::try_pop variable named "result"

Reviewed-by: dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/7394

From kbarrett at openjdk.java.net  Wed Feb  9 04:20:10 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 9 Feb 2022 04:20:10 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue
In-Reply-To: <ASoWE73-tR1YogRLU3COO_OeqOklC8kWeX2ZR5570As=.051ad89d-83c5-47e5-b820-76ea1d455dec@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <ASoWE73-tR1YogRLU3COO_OeqOklC8kWeX2ZR5570As=.051ad89d-83c5-47e5-b820-76ea1d455dec@github.com>
Message-ID: <iKFBKtAcCffA535XuX1w6NWSdxVLSIdicmQGa5NIkEI=.781ee020-3b31-426a-a4d5-f726734d5713@github.com>

On Wed, 9 Feb 2022 00:39:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Please review this update of the usage and implementation comments for
>> NonblockingQueue to discuss the ABA issue in push/append operations.
>
> src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 122:
> 
>> 120:     // try_pop could take old_tail before our update, it gets recycled and
>> 121:     // re-added to the end, and then we successfully cmpxchg, rendering the
>> 122:     // list in _tail circular.
> 
> Doesn't this contradict the "We won any races with try_pop ... so we're done"!

The client of this class is expected to prevent ABA from occurring.  Some of the mechanisms that might be used for doing so include separate phases for push/pop and preventing recycling while some thread might be in the midst of one of the problem operations.  The only current user of this class is G1DirtyCardQueueSet, where a combination of GlobalCounter critical sections and safepoint boundaries are used to ensure ABA can't happen.
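
For readers unfamiliar with the term, the general shape of an ABA problem can be shown in a few lines. The sketch below is Java, not the NonblockingQueue code, and all names in it are hypothetical. It replays the interleaving on a single thread so the outcome is deterministic: a compare-and-set based on a stale read still succeeds because the reference has returned to its old value.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of ABA; unrelated to the HotSpot sources.
final class AbaSketch {
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
    }

    // Returns true if a CAS based on a stale read still succeeds.
    static boolean staleCasSucceeds() {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        AtomicReference<Node> tail = new AtomicReference<>(a);

        // Thread 1 (simulated) reads the tail and is then delayed.
        Node observed = tail.get();

        // Thread 2 (simulated): A is taken, B becomes the tail, then the
        // recycled A is added again, so the tail is A once more.
        tail.set(b);
        tail.set(a);

        // Thread 1 resumes. compareAndSet compares references only, so it
        // cannot tell that the structure changed underneath it.
        return tail.compareAndSet(observed, c);
    }

    public static void main(String[] args) {
        System.out.println("stale CAS succeeded: " + staleCasSucceeds());
    }
}
```

Preventing recycling of A while thread 1 is between its read and its CAS, as described above, is what rules this interleaving out.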

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From shade at openjdk.java.net  Wed Feb  9 06:55:09 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 9 Feb 2022 06:55:09 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <fPNQA8wA48h1_aQm_llWCYTF5gXOCih34TG0CV8RXPQ=.c074b674-eee9-4b23-841e-31009f2c266f@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
 <fPNQA8wA48h1_aQm_llWCYTF5gXOCih34TG0CV8RXPQ=.c074b674-eee9-4b23-841e-31009f2c266f@github.com>
Message-ID: <5tnuK3pwhbOWk8dJlEkELJoxEFhmDyZFwpG5DfkozQ4=.b3cad787-bb0d-4716-91ed-079669da8eb0@github.com>

On Tue, 8 Feb 2022 18:39:07 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
>> 
>> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
>> 
>> The default values for options are different per platform, so tests are x86_64 specific.
>> 
>> No default value is changed, this only unblocks experiments.
>> 
>> Additional testing:
>>  - [x] New tests on Linux x86_64 fastdebug
>>  - [x] New tests on Linux x86_64 release
>
> src/hotspot/share/runtime/globals.hpp line 1539:
> 
>> 1537:           range(1, 128)                                                     \
>> 1538:           constraint(OptoLoopAlignmentConstraintFunc, AfterErgo)            \
>> 1539:                                                                             \
> 
> Should OptoLoopAlignment be an int, instead of an intx, since its range is small?

Dunno, maybe? I see that a lot of other "small" options are `intx`, and a change like that would proliferate to all architectures that set `OptoLoopAlignment` as their `product_pd`. It also raises the question of whether `CodeEntryAlignment` should also be `int`. I'd rather keep this patch small, to be honest.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From iwalulya at openjdk.java.net  Wed Feb  9 10:14:06 2022
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Wed, 9 Feb 2022 10:14:06 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <uxyJYryQEmcgF0H7VT8nVba5JpOk-SmOXQO5QJFUg68=.a5a46d1d-6116-4da5-9d6c-e067755a7508@github.com>

On Tue, 8 Feb 2022 22:58:26 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

Minor suggestion

src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 120:

> 118:     // old_tail for extension.  We won any races with try_pop by changing
> 119:     // away from end-marker.  So we're done.  Note that ABA is possible;
> 120:     // try_pop could take old_tail before our update, it gets recycled and

"a concurrent try_pop could take...."

-------------

Marked as reviewed by iwalulya (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7393

From duke at openjdk.java.net  Wed Feb  9 13:22:06 2022
From: duke at openjdk.java.net (Bhavana-Kilambi)
Date: Wed, 9 Feb 2022 13:22:06 GMT
Subject: Integrated: 8280007: Enable Neoverse N1 optimizations for Arm Neoverse
 V1 & N2
In-Reply-To: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
References: <5-WQEPc2lrSR_d0pVtsoFDT45Je1TJtJAdxAiBbEc9U=.6adf8246-dd10-4518-bd91-67e6fdd6eed9@github.com>
Message-ID: <hEGsjiwaJ0DhiW50xILowbfuWq1Q1qLpZ6DDrdRPh6o=.31fd398c-4084-4557-93f1-faac49e11b63@github.com>

On Tue, 8 Feb 2022 15:33:20 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

> As Arm Neoverse V1 and N2 will benefit from the same optimizations as Neoverse N1 does, they should have OnSpinWaitInst/OnSpinWaitInstCount defaults set to "isb"/1 and UseSIMDForMemoryOps default set to true.
> This patch sets these flags accordingly for both V1 and N2 architectures.

This pull request has now been integrated.

Changeset: f823bed0
Author:    Bhavana Kilambi <bhavana.kilambi at arm.com>
Committer: Paul Hohensee <phh at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/f823bed043dc38d838baaf8c2024ef24b8a50e9b
Stats:     5 lines in 1 file changed: 2 ins; 0 del; 3 mod

8280007: Enable Neoverse N1 optimizations for Arm Neoverse V1 & N2

Reviewed-by: phh

-------------

PR: https://git.openjdk.java.net/jdk/pull/7383

From redestad at openjdk.java.net  Wed Feb  9 14:06:48 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 9 Feb 2022 14:06:48 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with countPositives
Message-ID: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>

I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range that are all positive. This allows dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
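
A minimal scalar sketch of the intended semantics, not the code in the patch: the class name `CountPositivesSketch` is made up, and "positive" is read here as "non-negative" (the sign bit is clear), which is an assumption about the exact contract. It also shows how `hasNegatives` can be expressed through `countPositives`.

```java
import java.nio.charset.StandardCharsets;

public class CountPositivesSketch {
    /** Number of leading bytes in ba[off, off + len) whose sign bit is clear. */
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off;   // stop at the first negative byte
            }
        }
        return len;               // the whole range is non-negative
    }

    /** The old query, derived from the new one. */
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        // 'é' encodes as 0xE9 in ISO-8859-1, which is a negative byte in Java.
        byte[] mixed = "abc\u00e9def".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(countPositives(mixed, 0, mixed.length)); // prints 3
        System.out.println(hasNegatives(mixed, 0, mixed.length));   // prints true
    }
}
```

A caller that knows the length of the ASCII prefix can copy that prefix in bulk and only fall back to per-byte handling for the remainder, which is where the gain for inputs with an ASCII prefix would come from.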

Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904

- Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.

- An alternative to holding up until all platforms are on board is to structure `StringCoding.hasNegatives` and `countPositives` so that the non-intrinsified method calls into the intrinsified one. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.

- There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even sees a minor improvement, which makes me consider those corner-case regressions with little real-world implication (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

-------------

Commit messages:
 - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
 - Simplify changes to encodeUTF8
 - Fix little-endian error caught by testing
 - Reduce jumps in the ascii path
 - Remove unused tail_mask
 - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
 - Add more comments, simplify tail branching in AVX512 variant
 - Resolve issues in the precise implementation
 - Add shortMixed micros, cleanups
 - Adjust the countPositives intrinsic to count the bytes exactly.
 - ... and 11 more: https://git.openjdk.java.net/jdk/compare/cab59051...2a855eb6

Changes: https://git.openjdk.java.net/jdk/pull/7231/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281146
  Stats: 806 lines in 24 files changed: 586 ins; 84 del; 136 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From duke at openjdk.java.net  Wed Feb  9 15:44:59 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 15:44:59 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v3]
In-Reply-To: <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <xkQETPbjN1Kqs9riVKLRKJaUSY3LYzMDxn7KenhWYOI=.d6127a0c-c04a-41ea-9fc8-79a00d17bcde@github.com>
 <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
Message-ID: <eF6dXKqTwf_VecDlTzXBNZO6-Uqas7RrETdPKpDWqh8=.8a11f172-71c9-4b1b-a0e2-ecb26930263c@github.com>

On Wed, 19 Jan 2022 01:10:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Merge branch 'JDK-8278423' of https://github.com/eme64/jdk into JDK-8278423
>>  - added flag to VMDeprecatedOptions Test
>
> src/hotspot/os/aix/attachListener_aix.cpp line 31:
> 
>> 29: #include "runtime/os.inline.hpp"
>> 30: #include "services/attachListener.hpp"
>> 31: 
> 
> These changes are somewhat independent of the deprecation issue and could be split out into a separate RFE. The serviceability folk may have an opinion.

@dholmes-ora Ok, I reverted this and will do it in a separate RFE.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 15:44:58 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 15:44:58 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v5]
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <js036Kt1Cj0W_eVz3ewN6rQ1AbiS5QtvM1hP91bmDaw=.ca421575-8a8e-44b9-9939-b4f19352ae11@github.com>

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly.
> Added flag to VMDeprecatedOptions test.
> 
> Checked that tests are not affected.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  Revert "removed file with declarations that are never defined or used: /src/hotspot/share/services/dtraceAttacher.hpp"
  
  This reverts commit 885b985bb3618fc621cac1a32159b5449b5026fb.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7110/files
  - new: https://git.openjdk.java.net/jdk/pull/7110/files/0f161b01..7a93ecae

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=03-04

  Stats: 55 lines in 5 files changed: 55 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 15:54:47 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 15:54:47 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v6]
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <1aigQ1zndvru8IqC1RgnZ8mGZecF4kaxUi2039maPPQ=.59745cd8-5be3-4b2a-8976-42ac25404987@github.com>

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly.
> Added flag to VMDeprecatedOptions test.
> 
> Checked that tests are not affected.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  replaced with 3 flags in test

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7110/files
  - new: https://git.openjdk.java.net/jdk/pull/7110/files/7a93ecae..be3e0b81

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=04-05

  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 15:54:50 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 15:54:50 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4]
In-Reply-To: <XL9prdsczS4ML71zBA2d_Vdl7wfMJSKMtblqH_8KbS8=.2b8161dd-17a7-4bb8-a208-e7f96d13d9ac@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <xkQETPbjN1Kqs9riVKLRKJaUSY3LYzMDxn7KenhWYOI=.d6127a0c-c04a-41ea-9fc8-79a00d17bcde@github.com>
 <XL9prdsczS4ML71zBA2d_Vdl7wfMJSKMtblqH_8KbS8=.2b8161dd-17a7-4bb8-a208-e7f96d13d9ac@github.com>
Message-ID: <RElkShbwcPBkjPhOQ2aBAKY1UIS2kbY7Y3yxQxQT4Zs=.3e8f3e2e-62a0-4a56-80fc-40c5c6265492@github.com>

On Tue, 18 Jan 2022 18:42:27 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Can you replace the use of -XX:+ExtendedDTraceProbes in test/hotspot/jtreg/serviceability/7170638/SDTProbesGNULinuxTest.java with the three new flags?

Thank you @hseigel, I patched it according to your suggestion.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 16:10:10 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 16:10:10 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4]
In-Reply-To: <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <xkQETPbjN1Kqs9riVKLRKJaUSY3LYzMDxn7KenhWYOI=.d6127a0c-c04a-41ea-9fc8-79a00d17bcde@github.com>
 <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
Message-ID: <NetIxoft4dgW_YzLmkYQ8imhiuaaYjJGteqZxxgZe3M=.54363394-aea4-4144-95be-f519566081f0@github.com>

On Wed, 19 Jan 2022 01:12:16 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   moved deprecated flag to deprecated section in manpages
>
> src/hotspot/share/runtime/arguments.cpp line 2884:
> 
>> 2882: #if defined(DTRACE_ENABLED)
>> 2883:       warning("Option ExtendedDTraceProbes was deprecated in version 19 and will likely be removed in a future release.");
>> 2884:       warning("Use a combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead.");
> 
> s/a/the/
> 
> Applies to all three uses.

Agreed, changing it

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 16:20:49 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 16:20:49 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v7]
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly.
> Added flag to VMDeprecatedOptions test.
> 
> Checked that tests are not affected.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  updated warning messages and added 3 flags to man-pages

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7110/files
  - new: https://git.openjdk.java.net/jdk/pull/7110/files/be3e0b81..b05ecfa2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=05-06

  Stats: 18 lines in 2 files changed: 16 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Wed Feb  9 16:20:51 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Wed, 9 Feb 2022 16:20:51 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4]
In-Reply-To: <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <xkQETPbjN1Kqs9riVKLRKJaUSY3LYzMDxn7KenhWYOI=.d6127a0c-c04a-41ea-9fc8-79a00d17bcde@github.com>
 <FGBshXkQAc7BAcKLdNG2P5gV-lSCVZOVQDa4VOY5hGc=.2522680c-f757-4a10-9841-9a749cd4ecb4@github.com>
Message-ID: <jWf89VYo_J441ae4pwu7duC4ha6mOz6Oto0rWQc3V0w=.5c0ea2e8-5de4-4a23-b7c9-7574d2f5de12@github.com>

On Wed, 19 Jan 2022 01:17:21 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   moved deprecated flag to deprecated section in manpages
>
> src/java.base/share/man/java.1 line 4001:
> 
>> 3999: .TP
>> 4000: .B \f[CB]\-XX:+ExtendedDTraceProbes\f[R]
>> 4001: Deprecated. Use combination of these flags instead: -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes, -XX:+DTraceMonitorProbes
> 
> Delete "Deprecated" as we are in the deprecated options section.
> 
> The wording also needs updating as per the warning text ... though that might read a little odd here so I suggest a tweak:
> 
> Use the combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead of this deprecated flag.
> 
> 
> I would also move that new text to the end, so we still describe the flag first (otherwise it again reads a little odd.)
> 
> We will also need to add those flags to the "ADVANCED SERVICEABILITY OPTIONS FOR JAVA" section.

@dholmes-ora thanks for the suggestions, I implemented them.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From mdoerr at openjdk.java.net  Wed Feb  9 18:08:12 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 9 Feb 2022 18:08:12 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <J_hgIL38wiAUT98OuN0c1nvd9XchCuF0ut2l66ApkVU=.92f22c02-7464-48c7-8d50-f89ce936a857@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get the `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

LGTM. And a step in the right direction IMHO. We should check the code on other platforms, too (separately is ok).
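
For readers following along, the safe limit / growth watermark idea from the quoted description can be sketched as a small simulation. This is not the interpreter code from the patch: the class `ShadowZoneSketch`, its fields, and the page and zone sizes are assumptions chosen for illustration, and "banging" is modelled by a counter instead of real memory touches. The stack grows towards lower addresses.

```java
public class ShadowZoneSketch {
    static final long PAGE_SIZE = 4096;   // assumed page size
    static final int SHADOW_PAGES = 20;   // assumed shadow zone size, in pages

    private final long safeLimit;         // at or below this sp the full zone no longer fits
    private long growthWatermark;         // lowest sp whose shadow zone is already banged
    private long pagesBanged = 0;

    public ShadowZoneSketch(long stackEnd) {
        this.safeLimit = stackEnd + SHADOW_PAGES * PAGE_SIZE;
        this.growthWatermark = Long.MAX_VALUE;   // nothing banged yet
    }

    /** Called on every (simulated) method entry with the current stack pointer. */
    public void onMethodEntry(long sp) {
        if (sp >= growthWatermark) {
            return;   // an earlier entry at this depth or deeper already banged the zone
        }
        pagesBanged += SHADOW_PAGES;   // stands in for touching every page of the zone
        if (sp > safeLimit) {
            growthWatermark = sp;      // remember the depth only while the zone still fits
        }
        // At or below the safe limit a real VM would hit the guard pages here; the
        // watermark is left alone so that every such entry keeps banging.
    }

    public long pagesBanged() {
        return pagesBanged;
    }
}
```

With this scheme, repeated calls at the same or a shallower stack depth cost one comparison each, and the pages are only touched again when the stack grows past the deepest point seen so far.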

-------------

Marked as reviewed by mdoerr (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7247

From hseigel at openjdk.java.net  Wed Feb  9 18:51:10 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Wed, 9 Feb 2022 18:51:10 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v7]
In-Reply-To: <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>
Message-ID: <Uii83D_BuQ4Bsv0RUZUUvT_W6QoNrtSbEM0o2FH4Yrs=.3e2acb4e-f587-4dd1-9fb7-1fc36c6595a6@github.com>

On Wed, 9 Feb 2022 16:20:49 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updated warning messages and added 3 flags to man-pages

Other than the need to update the copyright dates to 2022, these changes look good.
Thanks, Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7110

From hseigel at openjdk.java.net  Wed Feb  9 18:58:05 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Wed, 9 Feb 2022 18:58:05 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <5tnuK3pwhbOWk8dJlEkELJoxEFhmDyZFwpG5DfkozQ4=.b3cad787-bb0d-4716-91ed-079669da8eb0@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
 <fPNQA8wA48h1_aQm_llWCYTF5gXOCih34TG0CV8RXPQ=.c074b674-eee9-4b23-841e-31009f2c266f@github.com>
 <5tnuK3pwhbOWk8dJlEkELJoxEFhmDyZFwpG5DfkozQ4=.b3cad787-bb0d-4716-91ed-079669da8eb0@github.com>
Message-ID: <4k5B_eeCIPWe4rTYueR7n0lixNRMFzItoV9U7lCfIbM=.ada0c192-7ca6-4d4b-bdcb-a912e7867aa5@github.com>

On Wed, 9 Feb 2022 06:51:47 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> src/hotspot/share/runtime/globals.hpp line 1539:
>> 
>>> 1537:           range(1, 128)                                                     \
>>> 1538:           constraint(OptoLoopAlignmentConstraintFunc, AfterErgo)            \
>>> 1539:                                                                             \
>> 
>> Should OptoLoopAlignment be an int, instead of an intx, since its range is small?
>
> Dunno, maybe? I see that a lot of other "small" options are `intx`, and a change like that would proliferate to all architectures that set `OptoLoopAlignment` as their `product_pd`. It also raises the question of whether `CodeEntryAlignment` should also be `int`. I'd rather keep this patch small, to be honest.

Your comment makes sense.  Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From iwalulya at openjdk.java.net  Wed Feb  9 19:10:05 2022
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Wed, 9 Feb 2022 19:10:05 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <QVyevXI27-3oYiZgpfAq6YJZb-_dBVcj8HfBUKblxoo=.2a37d669-4a67-4d4a-9315-49b7c483d23e@github.com>

On Tue, 18 Jan 2022 12:03:46 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

lgtm!

-------------

Marked as reviewed by iwalulya (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7124

From sviswanathan at openjdk.java.net  Wed Feb  9 23:27:07 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Wed, 9 Feb 2022 23:27:07 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts
In-Reply-To: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
Message-ID: <EyIVF9jkyQkp_CGkS61PGbLfOmWpMJMKjkZfzBPms_U=.b8b1e26e-1fc9-44f0-9009-3b8c3f453cfc@github.com>

On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
> 
> Thank you very much.

src/hotspot/cpu/x86/assembler_x86.cpp line 4782:

> 4780:   vector_len == AVX_256bit? VM_Version::supports_avx2() :
> 4781:   vector_len == AVX_512bit? VM_Version::supports_evex() : 0, " ");
> 4782:   InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);

legacy_mode should be false here instead of _legacy_mode_bw.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From kbarrett at openjdk.java.net  Thu Feb 10 03:14:05 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 03:14:05 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <vLDmjUS8OqKtD_OGgWsoW3p9Is3op4KcF7vc6Lt_f64=.d544841d-faa7-4617-bf74-2a81d5119099@github.com>

On Tue, 18 Jan 2022 12:03:46 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

I'm not keen on the suggested new name. I want to read ParallelExpandHeap_lock
as a lock for parallel expansion of the heap, which doesn't really have the
right flavor of subsystem ownership. I think (with this change) its uses are
limited to ParallelGC oldgen expansion, suggesting a name like
PSOldGenExpand_lock (or PSOldGen::Expand_lock, which could be private were it
not for the assert in MutableSpace and the "usual" practice of defining locks
in mutexLocker.[ch]pp).

Other than that naming issue, nice cleanup.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From kbarrett at openjdk.java.net  Thu Feb 10 05:54:43 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 05:54:43 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v2]
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <5kzE-8-k1Npd6wlCu_kc2xyP1NyC7iJQA4ymcl5O1Ac=.0dd5114e-d4a2-4c39-970d-93995239323b@github.com>

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:

  walulyai review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7393/files
  - new: https://git.openjdk.java.net/jdk/pull/7393/files/8c6593dc..fe7cc130

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7393.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7393/head:pull/7393

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Thu Feb 10 05:54:45 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 05:54:45 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v2]
In-Reply-To: <uxyJYryQEmcgF0H7VT8nVba5JpOk-SmOXQO5QJFUg68=.a5a46d1d-6116-4da5-9d6c-e067755a7508@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <uxyJYryQEmcgF0H7VT8nVba5JpOk-SmOXQO5QJFUg68=.a5a46d1d-6116-4da5-9d6c-e067755a7508@github.com>
Message-ID: <-r7rCiSfhFsXpQqg-m1qPuMoPeFkcM92eBdAkWz7C40=.4efb33b4-1503-4edc-8635-8c0fb23db64b@github.com>

On Wed, 9 Feb 2022 10:10:17 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   walulyai review
>
> src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 120:
> 
>> 118:     // old_tail for extension.  We won any races with try_pop by changing
>> 119:     // away from end-marker.  So we're done.  Note that ABA is possible;
>> 120:     // try_pop could take old_tail before our update, it gets recycled and
> 
> "a concurrent try_pop could take...."

Sure, I can make that explicit.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From dholmes at openjdk.java.net  Thu Feb 10 06:22:15 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 06:22:15 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v7]
In-Reply-To: <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>
Message-ID: <vxtwyVlif2rYsFlp6_WqozyayJWnDH2xREbBwdpf3Gw=.8e15015b-d894-4f50-ae9d-f6fe2175d761@github.com>

On Wed, 9 Feb 2022 16:20:49 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updated warning messages and added 3 flags to man-pages

Hi Emanuel,

A few minor nits below.

Thanks,
David

src/hotspot/share/runtime/globals.hpp line 1868:

> 1866:   product(bool, ExtendedDTraceProbes,    false,                             \
> 1867:           "(Deprecated) Enable performance-impacting dtrace probes. "       \
> 1868:           "Use a combination of -XX:+DTraceMethodProbes, "                  \

You missed changing 'a' to 'the' here.

src/java.base/share/man/java.1 line 2977:

> 2975: .RE
> 2976: .TP
> 2977: .B \f[CB]\-XX:+DTraceAllocProbes\f[R]

The three newly documented flags should all be marked "Linux and macOS". It is somewhat of a poor design that the flags are available on all platforms but only have an effect on systems with DTrace or SystemTap support - which (for our main platforms) is Linux and macOS.

src/java.base/share/man/java.1 line 4017:

> 4015: .B \f[CB]\-XX:+ExtendedDTraceProbes\f[R]
> 4016: \f[B]Linux and macOS:\f[R] Enables additional \f[CB]dtrace\f[R] tool probes
> 4017: that affect the performance.

Existing grammatical nit: please delete 'the'.

src/java.base/share/man/java.1 line 4020:

> 4018: By default, this option is disabled and \f[CB]dtrace\f[R] performs only
> 4019: standard probes.
> 4020: Use the combination of these flags instead: -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes, -XX:+DTraceMonitorProbes

The flags should be in a code font (use `-XX:...` in the markdown source).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From dholmes at openjdk.java.net  Thu Feb 10 06:29:09 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 06:29:09 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v2]
In-Reply-To: <iKFBKtAcCffA535XuX1w6NWSdxVLSIdicmQGa5NIkEI=.781ee020-3b31-426a-a4d5-f726734d5713@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <ASoWE73-tR1YogRLU3COO_OeqOklC8kWeX2ZR5570As=.051ad89d-83c5-47e5-b820-76ea1d455dec@github.com>
 <iKFBKtAcCffA535XuX1w6NWSdxVLSIdicmQGa5NIkEI=.781ee020-3b31-426a-a4d5-f726734d5713@github.com>
Message-ID: <Gs8iBFYlUYAQfKETo1QC6mc9mS1CynmI5FKtshGN9Hk=.dcef6c27-20e5-484e-b68b-2f8887ffb386@github.com>

On Wed, 9 Feb 2022 04:16:33 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 122:
>> 
>>> 120:     // try_pop could take old_tail before our update, it gets recycled and
>>> 121:     // re-added to the end, and then we successfully cmpxchg, rendering the
>>> 122:     // list in _tail circular.
>> 
>> Doesn't this contradict the "We won any races with try_pop ... so we're done"?
>
> The client of this class is expected to prevent ABA from occurring.  Some of the mechanisms that might be used for doing so include separate phases for push/pop and preventing recycling while some thread might be in the midst of one of the problem operations.  The only current user of this class is G1DirtyCardQueueSet, where a combination of GlobalCounter critical sections and safepoint boundaries are used to ensure ABA can't happen.

Understood, but it still, to me, reads oddly to claim "we're done" and then have an ABA disclaimer.
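
For readers following the thread, the ABA interleaving described in the quoted comment can be replayed deterministically. This is only a sketch under stated assumptions: `ToyQueue`, `push_begin`, `push_finish` and `aba_makes_cycle` are hypothetical names, the pop is modelled as nothing more than handing the node back for reuse, and none of this is the actual NonblockingQueue code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical toy model, not the HotSpot implementation.
struct Node {
  std::atomic<Node*> next{nullptr};
};

struct ToyQueue {
  std::atomic<Node*> tail{nullptr};
  Node end;  // stands in for the end marker

  // First half of a push: mark the node as last and swing tail to it.
  Node* push_begin(Node* n) {
    n->next.store(&end);
    return tail.exchange(n);
  }

  // Second half of a push: link the previous tail to the new node,
  // but only if the previous tail still carries the end marker.
  bool push_finish(Node* old_tail, Node* n) {
    Node* expected = &end;
    return old_tail->next.compare_exchange_strong(expected, n);
  }
};

// Replays the interleaving on one thread; returns true if a cycle forms.
bool aba_makes_cycle() {
  ToyQueue q;
  Node a, b;

  q.push_begin(&a);                    // queue was empty; a is the only element

  // "Thread 1" starts pushing b and is paused between the two halves.
  Node* old_tail = q.push_begin(&b);   // tail is now b
  assert(old_tail == &a);

  // "Thread 2": a is popped (modelled as plain reuse), recycled, pushed again.
  Node* t2_old = q.push_begin(&a);     // a.next is the end marker again
  bool t2_linked = q.push_finish(t2_old, &a);    // b.next: end -> a

  // "Thread 1" resumes; its compare-exchange cannot tell that a was recycled.
  bool t1_linked = q.push_finish(old_tail, &b);  // a.next: end -> b

  return t2_linked && t1_linked &&
         a.next.load() == &b && b.next.load() == &a;
}
```

Both compare-exchanges succeed because `a.next` holds the end marker at both points in time, which is why, as Kim notes above, the client has to rule out recycling while a push may be in flight.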

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From dholmes at openjdk.java.net  Thu Feb 10 06:54:07 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 06:54:07 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v2]
In-Reply-To: <5kzE-8-k1Npd6wlCu_kc2xyP1NyC7iJQA4ymcl5O1Ac=.0dd5114e-d4a2-4c39-970d-93995239323b@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <5kzE-8-k1Npd6wlCu_kc2xyP1NyC7iJQA4ymcl5O1Ac=.0dd5114e-d4a2-4c39-970d-93995239323b@github.com>
Message-ID: <M1wapNKiLPadXnUS-NEGj_U83dE5aUPsccNettcpf2Y=.7998829c-dca7-4a69-873b-8a2815a49844@github.com>

On Thu, 10 Feb 2022 05:54:43 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Please review this update of the usage and implementation comments for
>> NonblockingQueue to discuss the ABA issue in push/append operations.
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
> 
>   walulyai review

src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 122:

> 120:     // concurrent try_pop could take old_tail before our update, it gets
> 121:     // recycled and re-added to the end, and then we successfully cmpxchg,
> 122:     // rendering the list in _tail circular.

Suggestions:

1. start the new comment on a new line
2.  "Note that ABA would be possible if a concurrent try_pop takes old_tail before our update, ...

Cheers,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From shade at openjdk.java.net  Thu Feb 10 08:43:07 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 10 Feb 2022 08:43:07 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <q3-4cdI150Y5l-PBuS6gTADIbi42hnPiJR7WdyRxozQ=.f3d01537-9d19-4d63-b38e-c5b7ea5de46a@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue; I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get the `java.lang.invoke` infra to work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches by accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in the native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why the `native_call` argument persists, even though it is not used in the x86 case anymore. There is also a new test group that I found useful when debugging on Windows; that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

All right, thanks for reviews! Last call for comments. I am planning to integrate it later today.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Thu Feb 10 08:46:50 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Thu, 10 Feb 2022 08:46:50 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v8]
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <6_ddanyI-FFaerYCGBHYYGlJQZpUvypaIIoPOq6S3wM=.b77c72f2-e29a-4d31-826c-f42c737978d1@github.com>

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly, added the 3 flags to man pages.
> Added flag to VMDeprecatedOptions test.
> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
> 
> Checked that tests are not affected.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  fixes to documentation requested by reviewers

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7110/files
  - new: https://git.openjdk.java.net/jdk/pull/7110/files/b05ecfa2..af11b456

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=06-07

  Stats: 6 lines in 2 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110

PR: https://git.openjdk.java.net/jdk/pull/7110

From duke at openjdk.java.net  Thu Feb 10 08:46:55 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Thu, 10 Feb 2022 08:46:55 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v7]
In-Reply-To: <vxtwyVlif2rYsFlp6_WqozyayJWnDH2xREbBwdpf3Gw=.8e15015b-d894-4f50-ae9d-f6fe2175d761@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <bYJZF0VnxWgLfom7mJvhXaLlQpcmaH4IlQ7qetFdcoM=.80f1e58e-a5f9-43d9-af10-8a3752fe5f66@github.com>
 <vxtwyVlif2rYsFlp6_WqozyayJWnDH2xREbBwdpf3Gw=.8e15015b-d894-4f50-ae9d-f6fe2175d761@github.com>
Message-ID: <2wP_TEQJtQvj6ALRgl4BG_QnXkLVqBntoZleXWDIDaU=.ccd0ae17-7ed8-4ec5-955b-476941ab37b9@github.com>

On Thu, 10 Feb 2022 05:54:17 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   updated warning messages and added 3 flags to man-pages
>
> src/hotspot/share/runtime/globals.hpp line 1868:
> 
>> 1866:   product(bool, ExtendedDTraceProbes,    false,                             \
>> 1867:           "(Deprecated) Enable performance-impacting dtrace probes. "       \
>> 1868:           "Use a combination of -XX:+DTraceMethodProbes, "                  \
> 
> You missed changing 'a' to 'the' here.

done

> src/java.base/share/man/java.1 line 2977:
> 
>> 2975: .RE
>> 2976: .TP
>> 2977: .B \f[CB]\-XX:+DTraceAllocProbes\f[R]
> 
> The three newly documented flags should all be marked "Linux and macOS". It is somewhat of a poor design that the flags are available on all platforms but only have an effect on systems with DTrace or SystemTap support - which (for our main platforms) is Linux and macOS.

done

> src/java.base/share/man/java.1 line 4017:
> 
>> 4015: .B \f[CB]\-XX:+ExtendedDTraceProbes\f[R]
>> 4016: \f[B]Linux and macOS:\f[R] Enables additional \f[CB]dtrace\f[R] tool probes
>> 4017: that affect the performance.
> 
> Existing grammatical nit: please delete 'the'.

done

> src/java.base/share/man/java.1 line 4020:
> 
>> 4018: By default, this option is disabled and \f[CB]dtrace\f[R] performs only
>> 4019: standard probes.
>> 4020: Use the combination of these flags instead: -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes, -XX:+DTraceMonitorProbes
> 
> The flags should be in a code font (use `-XX:...` in the markdown source).

done

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From aph at openjdk.java.net  Thu Feb 10 09:57:16 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 10 Feb 2022 09:57:16 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <q3-4cdI150Y5l-PBuS6gTADIbi42hnPiJR7WdyRxozQ=.f3d01537-9d19-4d63-b38e-c5b7ea5de46a@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
 <q3-4cdI150Y5l-PBuS6gTADIbi42hnPiJR7WdyRxozQ=.f3d01537-9d19-4d63-b38e-c5b7ea5de46a@github.com>
Message-ID: <6jyO3IUay3bgBazoCWsk_1BGQPaWvFHmqFFlYi6lH8k=.cb8fc84f-c9a4-4b35-be0c-7e2be7ede815@github.com>

On Thu, 10 Feb 2022 08:40:11 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> All right, thanks for reviews! Last call for comments. I am planning to integrate it later today.

x86-32 has some weird stack handling, particularly when using the invocation interface. I guess we assume our regression tests will catch breakage there.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Thu Feb 10 10:12:13 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 10 Feb 2022 10:12:13 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <6jyO3IUay3bgBazoCWsk_1BGQPaWvFHmqFFlYi6lH8k=.cb8fc84f-c9a4-4b35-be0c-7e2be7ede815@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
 <q3-4cdI150Y5l-PBuS6gTADIbi42hnPiJR7WdyRxozQ=.f3d01537-9d19-4d63-b38e-c5b7ea5de46a@github.com>
 <6jyO3IUay3bgBazoCWsk_1BGQPaWvFHmqFFlYi6lH8k=.cb8fc84f-c9a4-4b35-be0c-7e2be7ede815@github.com>
Message-ID: <-WugBPZ3_skHnoOjDca2NP6leCXqX8UuMKsSK6kfIic=.e4f07c8e-6ac0-4bc7-a24b-72afadc29b9b@github.com>

On Thu, 10 Feb 2022 09:54:01 GMT, Andrew Haley <aph at openjdk.org> wrote:

> > All right, thanks for reviews! Last call for comments. I am planning to integrate it later today.
> 
> x86-32 has some weird stack handling, particularly when using the invocation interface. I guess we assume our regression tests will catch breakage there.

As you can see in "Additional testing", I ran `tier{1,2,3}` on x86_32 without problems. It is hard to tell how this patch would break x86_32 though: it would still bang the same way when close to the guard zone.
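
To make the "bang only when needed" idea from the PR description concrete, here is a rough model of the safe limit / growth watermark check. It is a sketch, not the patch itself: `StackBangModel`, its fields and `count_bangs_demo` are hypothetical names, addresses are plain integers, and a counter stands in for actually touching the shadow zone pages.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model; the stack grows towards lower addresses.
struct StackBangModel {
  std::uintptr_t safe_limit;        // at or below this SP, always bang
  std::uintptr_t growth_watermark;  // lowest SP whose shadow zone was banged
  unsigned bangs;                   // counts simulated shadow-zone bangs

  StackBangModel(std::uintptr_t limit, std::uintptr_t base)
    : safe_limit(limit), growth_watermark(base), bangs(0) {}

  // Called on every (simulated) method entry with the current SP.
  void on_method_entry(std::uintptr_t sp) {
    if (sp >= growth_watermark) {
      return;  // an earlier entry at this depth or deeper already banged
    }
    ++bangs;   // stands in for touching every page of the shadow zone
    if (sp > safe_limit) {
      growth_watermark = sp;  // remember how deep we have banged
    }
    // At or below safe_limit the watermark is left alone, so every entry
    // keeps banging and an overflow near the guard zone is still detected.
  }
};

unsigned count_bangs_demo() {
  StackBangModel m(/*limit=*/1000, /*base=*/10000);
  m.on_method_entry(9000);  // deeper than the watermark: bang
  m.on_method_entry(9500);  // shallower: skipped
  m.on_method_entry(9000);  // same depth: skipped
  m.on_method_entry(8000);  // deeper again: bang
  assert(m.growth_watermark == 8000);
  m.on_method_entry(900);   // below the safe limit: bang, watermark unchanged
  m.on_method_entry(900);   // still bangs
  return m.bangs;
}
```

In this model six entries cost four bangs, and the last two entries show the point made above: close to the guard zone the behaviour is the same as before the change.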

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From kbarrett at openjdk.java.net  Thu Feb 10 10:12:44 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 10:12:44 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v3]
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <emUt5QKngT4ryie9UvHuQ9qhZ_zAEgph_KMVxQTSU_g=.95a0c693-269a-4789-a139-db4a7afc1503@github.com>

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:

  dholmes review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7393/files
  - new: https://git.openjdk.java.net/jdk/pull/7393/files/fe7cc130..7aacc083

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=01-02

  Stats: 7 lines in 1 file changed: 3 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7393.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7393/head:pull/7393

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Thu Feb 10 10:12:47 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 10:12:47 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v2]
In-Reply-To: <M1wapNKiLPadXnUS-NEGj_U83dE5aUPsccNettcpf2Y=.7998829c-dca7-4a69-873b-8a2815a49844@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <5kzE-8-k1Npd6wlCu_kc2xyP1NyC7iJQA4ymcl5O1Ac=.0dd5114e-d4a2-4c39-970d-93995239323b@github.com>
 <M1wapNKiLPadXnUS-NEGj_U83dE5aUPsccNettcpf2Y=.7998829c-dca7-4a69-873b-8a2815a49844@github.com>
Message-ID: <o0TPp7TFPlIXcAyEX2LbVKVvhW5NqdQWot-VlZqMM6Y=.40dc0c8b-72f2-4cc3-9c77-fd87c77b8097@github.com>

On Thu, 10 Feb 2022 06:50:42 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   walulyai review
>
> src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 122:
> 
>> 120:     // concurrent try_pop could take old_tail before our update, it gets
>> 121:     // recycled and re-added to the end, and then we successfully cmpxchg,
>> 122:     // rendering the list in _tail circular.
> 
> Suggestions:
> 
> 1. start the new comment on a new line
> 2.  "Note that ABA would be possible if a concurrent try_pop takes old_tail before our update, ...
> 
> Cheers,
> David

OK, I separated the note and reworded it a bit.  Better?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From dholmes at openjdk.java.net  Thu Feb 10 10:50:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 10:50:11 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v8]
In-Reply-To: <6_ddanyI-FFaerYCGBHYYGlJQZpUvypaIIoPOq6S3wM=.b77c72f2-e29a-4d31-826c-f42c737978d1@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <6_ddanyI-FFaerYCGBHYYGlJQZpUvypaIIoPOq6S3wM=.b77c72f2-e29a-4d31-826c-f42c737978d1@github.com>
Message-ID: <V12800SrHm9uYETZz_lrEWHPskipXy09m35veVSwaho=.1931b3b6-0ea6-457b-836d-b174405d3abe@github.com>

On Thu, 10 Feb 2022 08:46:50 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fixes to documentation requested by reviewers

Thanks for the updates.

David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7110

From dholmes at openjdk.java.net  Thu Feb 10 10:55:05 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 10:55:05 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v3]
In-Reply-To: <emUt5QKngT4ryie9UvHuQ9qhZ_zAEgph_KMVxQTSU_g=.95a0c693-269a-4789-a139-db4a7afc1503@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <emUt5QKngT4ryie9UvHuQ9qhZ_zAEgph_KMVxQTSU_g=.95a0c693-269a-4789-a139-db4a7afc1503@github.com>
Message-ID: <48Qi6eikXOhu71mgUEc3PE-CYTHJJTpZxOVzbl6vPtQ=.66340aa2-f28c-4569-92c7-993ca4d78b1b@github.com>

On Thu, 10 Feb 2022 10:12:44 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Please review this update of the usage and implementation comments for
>> NonblockingQueue to discuss the ABA issue in push/append operations.
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
> 
>   dholmes review

Thanks - reads well.

David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Thu Feb 10 11:31:41 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 11:31:41 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v4]
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <nit9AO0Uzht2Gv8pJkNgXB7gfSKXHChhEZdgaFR0WPc=.e4642254-96ac-41b6-9bb0-08eb53f4cc7b@github.com>

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:

 - Merge branch 'master' into nbq-aba
 - dholmes review
 - walulyai review
 - document ABA for push/append

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7393/files
  - new: https://git.openjdk.java.net/jdk/pull/7393/files/7aacc083..4f03fd58

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7393&range=02-03

  Stats: 3101 lines in 57 files changed: 2579 ins; 211 del; 311 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7393.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7393/head:pull/7393

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Thu Feb 10 11:31:43 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 11:31:43 GMT
Subject: RFR: 8280832: Update usage docs for NonblockingQueue [v4]
In-Reply-To: <uxyJYryQEmcgF0H7VT8nVba5JpOk-SmOXQO5QJFUg68=.a5a46d1d-6116-4da5-9d6c-e067755a7508@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
 <uxyJYryQEmcgF0H7VT8nVba5JpOk-SmOXQO5QJFUg68=.a5a46d1d-6116-4da5-9d6c-e067755a7508@github.com>
Message-ID: <6RaGYU4SOe70MtRx7l7SoqoaqMOv0KTrp__xCI97GX4=.34730bb9-58c4-4d6d-8f5f-574e6fe19463@github.com>

On Wed, 9 Feb 2022 10:10:35 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>> 
>>  - Merge branch 'master' into nbq-aba
>>  - dholmes review
>>  - walulyai review
>>  - document ABA for push/append
>
> Minor suggestion

Thanks @walulyai and @dholmes-ora for reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From kbarrett at openjdk.java.net  Thu Feb 10 11:31:44 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 10 Feb 2022 11:31:44 GMT
Subject: Integrated: 8280832: Update usage docs for NonblockingQueue
In-Reply-To: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
References: <Fi9BsWonIe78EDfwMxMCE5I4WofF4u0IDrKoqWhjVpI=.03a7aa7e-de43-4dac-9bee-ca791e39dc52@github.com>
Message-ID: <p6UsmWd09uMoSj_C-q-_oJdvGEysCFhbJTBfwxnNQ9M=.3a47ae6a-811b-47e3-9d80-30643dcd05d7@github.com>

On Tue, 8 Feb 2022 22:58:26 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this update of the usage and implementation comments for
> NonblockingQueue to discuss the ABA issue in push/append operations.

This pull request has now been integrated.

Changeset: 3ce1c5b6
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/3ce1c5b6ce02749ef8f9d35409b7bcbf27f47203
Stats:     9 lines in 2 files changed: 8 ins; 0 del; 1 mod

8280832: Update usage docs for NonblockingQueue

Reviewed-by: iwalulya, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/7393

From jbhateja at openjdk.java.net  Thu Feb 10 12:24:14 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 10 Feb 2022 12:24:14 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts
In-Reply-To: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
Message-ID: <1EkBcO28e83W0erDN6flFX6eR88aovKxVIGJqOiF40I=.5db87001-570d-4679-9b3a-7937b72233ed@github.com>

On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
> 
> Thank you very much.

src/hotspot/cpu/x86/x86.ad line 7288:

> 7286:         break;
> 7287:       default: assert(false, "%s", type2name(to_elem_bt));
> 7288:     }

Please move this into a macro assembly routine.

src/hotspot/cpu/x86/x86.ad line 7310:

> 7308:       default: assert(false, "%s", type2name(to_elem_bt));
> 7309:     }
> 7310:   %}

Same as above.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From ayang at openjdk.java.net  Thu Feb 10 13:32:42 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 10 Feb 2022 13:32:42 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v2]
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <NxkYx2Ke66KLxpKMn25JqPA2xlK_ZxGlSS731zOcwX0=.144b0703-c18a-4e22-a9f0-3579f47a56e3@github.com>

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:

  review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7124/files
  - new: https://git.openjdk.java.net/jdk/pull/7124/files/16874ed0..8e98e826

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=00-01

  Stats: 40 lines in 6 files changed: 5 ins; 25 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7124.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7124/head:pull/7124

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Thu Feb 10 13:32:44 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 10 Feb 2022 13:32:44 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <D0WP-rOWqElwoHZ0Lr1OnGNPtJG8zBYbSr0a7zTpGac=.e3e56fd6-66ea-4486-bbfc-e74528db5aa2@github.com>

On Tue, 18 Jan 2022 12:03:46 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

I have moved the mutex inside `PSOldGen` to reduce its scope.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Thu Feb 10 14:58:46 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 10 Feb 2022 14:58:46 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v3]
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <xTw2OFGUcy9AWewyPi8qC76fDg4U1-LmBgDNp1VjdvU=.e9522917-6318-42c5-b5b4-4c0447c9133e@github.com>

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:

  fix release build

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7124/files
  - new: https://git.openjdk.java.net/jdk/pull/7124/files/8e98e826..fa5dcce9

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=01-02

  Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7124.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7124/head:pull/7124

PR: https://git.openjdk.java.net/jdk/pull/7124

From volker.simonis at gmail.com  Thu Feb 10 15:08:06 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 10 Feb 2022 16:08:06 +0100
Subject: Internal compiler error for slowdebug build with gcc 7.5.0 on Ubuntu
 18.04
Message-ID: <CA+3eh130pAA3r2GmYZdhinWQYL+0TSTbP4B-e_ZHuPrGnBDDjg@mail.gmail.com>

Hi,

When compiling the latest HS sources in slowdebug mode with gcc 7.5.0
(the default compiler on Ubuntu 18.04) I get the following internal
compiler error for the file compileBroker.cpp:

/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp: In
static member function 'static void
CompileBroker::invoke_compiler_on_method(CompileTask*)':
/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp:2393:1:
internal compiler error: Max. number of generated reload insns per
insn is achieved (90)

 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.

I know that gcc 7.5.0 isn't officially supported but was just curious
if somebody has seen this before? Googling around shows that this
issue seems to have been fixed several times in gcc 4.9 and
specifically for ppc/rs6000.

I've installed and tried gcc 8.4.0 but the error remains the same:

GNU C++14 (Ubuntu 8.4.0-1ubuntu1~18.04) version 8.4.0 (x86_64-linux-gnu)
    compiled by GNU C version 8.4.0, GMP version 6.1.2, MPFR version
4.0.1, MPC version 1.1.0, isl version isl-0.19-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU assembler version 2.30 (x86_64-linux-gnu) using BFD version (GNU
Binutils for Ubuntu) 2.30
Compiler executable checksum: 67fba09f596cc8a67df33f8529603bfb
during RTL pass: reload
/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp: In
static member function 'static void
CompileBroker::invoke_compiler_on_method(CompileTask*)':
/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp:2393:1:
internal compiler error: Max. number of generated reload insns per
insn is achieved (90)

 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-8/README.Bugs> for instructions.

According to the "Supported Build Platforms" Wiki [1] it seems that at
least SAP is using gcc 8. Have you run into this issue as well? Any
ideas how to fix it without upgrading to gcc 10?

Thank you and best regards,
Volker

PS: the release build works perfectly fine with gcc 7.5.0

[1] https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms

From duke at openjdk.java.net  Thu Feb 10 15:14:44 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Thu, 10 Feb 2022 15:14:44 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
Message-ID: <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>

> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
> 
> Thank you very much.
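
For readers unfamiliar with the term, the difference between a signed and an unsigned (zero-extended) upcast can be shown with a scalar model. This is only an illustrative sketch: the class and method names are hypothetical, and it uses neither the Vector API nor the intrinsics from the patch.

```java
// Hypothetical scalar model of a lane-wise byte -> int upcast.
class ZeroExtendSketch {
    // Signed upcast: the sign bit of each byte is replicated into the upper bits.
    static int[] signExtend(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Unsigned upcast: the upper bits of each lane are filled with zeros.
    static int[] zeroExtend(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Byte.toUnsignedInt(src[i]);
        }
        return dst;
    }
}
```

For the byte `0xFF`, `signExtend` yields -1 while `zeroExtend` yields 255; for non-negative bytes the two agree.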

Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:

 - minor rename
 - address reviews

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7358/files
  - new: https://git.openjdk.java.net/jdk/pull/7358/files/22a70fe1..8028be52

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=00-01

  Stats: 81 lines in 4 files changed: 32 ins; 44 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7358.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7358/head:pull/7358

PR: https://git.openjdk.java.net/jdk/pull/7358

From duke at openjdk.java.net  Thu Feb 10 15:14:46 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Thu, 10 Feb 2022 15:14:46 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <EyIVF9jkyQkp_CGkS61PGbLfOmWpMJMKjkZfzBPms_U=.b8b1e26e-1fc9-44f0-9009-3b8c3f453cfc@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <EyIVF9jkyQkp_CGkS61PGbLfOmWpMJMKjkZfzBPms_U=.b8b1e26e-1fc9-44f0-9009-3b8c3f453cfc@github.com>
Message-ID: <lC0CLtI3PU_gmVK9uuDeekxAi5ajNIEk7zEJyJKqPB8=.a2c708e2-3433-466b-adb4-46e3ea21143c@github.com>

On Wed, 9 Feb 2022 22:52:47 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 4782:
> 
>> 4780:   vector_len == AVX_256bit? VM_Version::supports_avx2() :
>> 4781:   vector_len == AVX_512bit? VM_Version::supports_evex() : 0, " ");
>> 4782:   InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
> 
> legacy_mode should be false here instead of _legacy_mode_bw.

Fixed, thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From duke at openjdk.java.net  Thu Feb 10 15:14:49 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Thu, 10 Feb 2022 15:14:49 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <1EkBcO28e83W0erDN6flFX6eR88aovKxVIGJqOiF40I=.5db87001-570d-4679-9b3a-7937b72233ed@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <1EkBcO28e83W0erDN6flFX6eR88aovKxVIGJqOiF40I=.5db87001-570d-4679-9b3a-7937b72233ed@github.com>
Message-ID: <1U-v8HDdffTAyMecRVwaQhZUi3mmITIGDpuXsbHni5o=.b0bc2c3f-ac7f-4c0d-831c-7586673d5aea@github.com>

On Thu, 10 Feb 2022 05:05:05 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> src/hotspot/cpu/x86/x86.ad line 7288:
> 
>> 7286:         break;
>> 7287:       default: assert(false, "%s", type2name(to_elem_bt));
>> 7288:     }
> 
> Please move this into a macro assembly routine.

Fixed, thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From lkorinth at openjdk.java.net  Thu Feb 10 15:47:23 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Thu, 10 Feb 2022 15:47:23 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
Message-ID: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>

Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.

-------------

Commit messages:
 - 8281585: Remove unused imports under test/lib and jtreg/gc

Changes: https://git.openjdk.java.net/jdk/pull/7426/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7426&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281585
  Stats: 92 lines in 60 files changed: 0 ins; 92 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7426.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7426/head:pull/7426

PR: https://git.openjdk.java.net/jdk/pull/7426

From aph at openjdk.java.net  Thu Feb 10 16:21:12 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 10 Feb 2022 16:21:12 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <GXUcAo55K4vReK737EJE8VWuNh5fUP0O01nxczF5fV8=.0bb1ca06-d8d7-4440-92bb-eaad4e22a169@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
 <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
 <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>
 <GXUcAo55K4vReK737EJE8VWuNh5fUP0O01nxczF5fV8=.0bb1ca06-d8d7-4440-92bb-eaad4e22a169@github.com>
Message-ID: <9wCVZ8gCStf_tUT8_WQjhLzXqqQlQMsijeiBaAXDVVk=.aace6af6-bf1b-40c9-ba19-6fd0ab9b1b0a@github.com>

On Tue, 8 Feb 2022 09:40:49 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Doing this caused 7 failures across a full jtreg run, namely:
>> 
>> serviceability/sa/ClhsdbFindPC.java#xcomp-core
>> vmTestbase/jit/misctests/fpustack/GraphApplet.java
>> vmTestbase/nsk/jdi/MonitorWaitRequest/MonitorWaitRequest001/TestDescription.java
>> vmTestbase/nsk/jdi/MonitorWaitedRequest/MonitorWaitedRequest001/TestDescription.java
>> vmTestbase/nsk/jdwp/ThreadReference/ForceEarlyReturn/forceEarlyReturn002/forceEarlyReturn002.java
>> vmTestbase/nsk/jdwp/ThreadReference/OwnedMonitorsStackDepthInfo/ownedMonitorsStackDepthInfo002/ownedMonitorsStackDepthInfo002.java
>> vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine/TestDescription.java
>> 
>> ....I'll investigate.
>
>> Doing this caused 7 failures across a full jtreg run, namely:
> 
> I'm glad we caught that.

Status? Is branch protection really incompatible with PreserveFramePointer?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Thu Feb 10 16:39:52 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 10 Feb 2022 16:39:52 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <-nQf8_Gh666U_KH2wCMBEApxI3GFXre1cghHN41KoVg=.c0bc85fd-16ed-49f5-a595-73893facf6df@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <1oSiO-f26IoFOcPDhOOeWrr8x2cH_Wyv4aAjI9gX9-0=.21f677c9-61a4-469e-891c-f35bc469b7e2@github.com>
 <uwm53WfSm6lIeOTDR7dRew2L39VCIAQNaWFgi9g4vwc=.e66d74a1-c261-457d-8d22-7499400b64ee@github.com>
 <CPrtaN-iAmOQIS0MCMo9Ss3z_RDqKkeifZ2Ij6FRcCo=.62f9ade2-8472-4b3f-b448-ef73655eeb0b@github.com>
 <QbRR80JhYnzAnB9o9HMSJFU-lVpINvDXfh4AisP9VEM=.513ff0b8-d42e-4ef1-8c8b-88db1b72b772@github.com>
 <-V7ptCS4QdcpFHOomMnTPPYvFtKSQ0nswzFNXQDoWLg=.2d72897f-ef45-4867-892f-64df085eca85@github.com>
 <-nQf8_Gh666U_KH2wCMBEApxI3GFXre1cghHN41KoVg=.c0bc85fd-16ed-49f5-a595-73893facf6df@github.com>
Message-ID: <tSbASAhYFyOZexuxf45FDIFZ2nsz2b7qtqGHc-x_IfI=.e1ea8c5e-02e1-4759-afaf-74d4f184f7a9@github.com>

On Mon, 7 Feb 2022 15:12:04 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> How about extending the existing enter() function: 
>> 
>> // Enter a new stack frame for the current method.
>> // nested:     Indicates a frame has already been entered (and not left) for the current method. 
>> void MacroAssembler::enter(bool nested=false) {
>>    if (nested) strip()
>>    protect()
>>    stp()
>>    mov()
>> }
>> 
>> This would add an additional bool check for every call of enter() - that's at code generation time, so probably not an issue.
>
> So, `nested` is true iff we are, say, pushing an extra frame for a runtime call in the middle of generated code, but for some mysterious reason the logic is inline instead of being implemented in the obvious way as a stub.
> 
> Please do this as:
> 
> ` MacroAssembler::enter(bool strip_return_address=false)`
> 
> and I'll be happy. Please make sure that all calls are commented, as in
> 
> `__ enter(/*strip_return_address*/true);`
> 
> and I'll be happy.

Just about to resolve this ... then spotted the "make sure that all calls are commented". Will fix up.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Thu Feb 10 16:39:51 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 10 Feb 2022 16:39:51 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v20]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <fVz9lmTGhMrS6aKv-bS3isMfwujZLJ123Ce8hGjiQ_A=.ec3ec98f-e7a3-44c6-8df5-c86ee187b261@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference beyond noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.
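
The sign-on-store, authenticate-on-load idea described above can be modelled with a toy sketch. Everything here is an assumption made for illustration: the key, the mixing function, the 48/16 bit split, and the class and method names are hypothetical and do not correspond to the AArch64 PAC algorithm or to any HotSpot code. An exception stands in for the fault the hardware would eventually raise.

```java
// Toy model only: a keyed code is packed into the unused upper bits of a return address.
class PacToyModel {
    // Hypothetical key; real hardware keeps its keys in system registers.
    private static final long KEY = 0x9E3779B97F4A7C15L;
    // Assumption: the low 48 bits hold the address, the top 16 bits hold the code.
    private static final long ADDR_MASK = (1L << 48) - 1;

    // Derive a 16-bit code from the address and a modifier (the stack pointer for PAC-RET).
    static long code(long addr, long modifier) {
        long h = (addr ^ Long.rotateLeft(modifier, 17) ^ KEY) * 0xFF51AFD7ED558CCDL;
        h ^= h >>> 33;
        return h & 0xFFFF;
    }

    // "Sign" the return address before it is stored on the stack.
    static long sign(long addr, long modifier) {
        long a = addr & ADDR_MASK;
        return a | (code(a, modifier) << 48);
    }

    // "Authenticate" the value loaded back from the stack; reject it if the code does not match.
    static long authenticate(long signedAddr, long modifier) {
        long a = signedAddr & ADDR_MASK;
        if ((signedAddr >>> 48) != code(a, modifier)) {
            throw new IllegalStateException("authentication failed");
        }
        return a;
    }
}
```

In this model an unmodified value always authenticates back to the original address, and any change to the code bits is always rejected. A change to the address bits is rejected unless the two 16-bit codes happen to collide.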

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Merge enter_subframe into enter

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/614a3262..f779513b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=19
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=18-19

  Stats: 20 lines in 5 files changed: 5 ins; 9 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Thu Feb 10 16:39:53 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 10 Feb 2022 16:39:53 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <9wCVZ8gCStf_tUT8_WQjhLzXqqQlQMsijeiBaAXDVVk=.aace6af6-bf1b-40c9-ba19-6fd0ab9b1b0a@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
 <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
 <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>
 <GXUcAo55K4vReK737EJE8VWuNh5fUP0O01nxczF5fV8=.0bb1ca06-d8d7-4440-92bb-eaad4e22a169@github.com>
 <9wCVZ8gCStf_tUT8_WQjhLzXqqQlQMsijeiBaAXDVVk=.aace6af6-bf1b-40c9-ba19-6fd0ab9b1b0a@github.com>
Message-ID: <Uty-dsfgY9W0NYBBRCf8a_RKA3RpTVSEPTeowxIBeoI=.1e872980-760a-452f-87a7-1e725e92ab4b@github.com>

On Thu, 10 Feb 2022 16:18:18 GMT, Andrew Haley <aph at openjdk.org> wrote:

>>> Doing this caused 7 failures across a full jtreg run, namely:
>> 
>> I'm glad we caught that.
>
> Status? Is branch protection really incompatible with PreserveFramePointer?

Eventually found a missing signing in the exception handling. I'm running the full suite now, so should hopefully get something posted tomorrow.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From coleenp at openjdk.java.net  Thu Feb 10 17:14:06 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 10 Feb 2022 17:14:06 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v3]
In-Reply-To: <xTw2OFGUcy9AWewyPi8qC76fDg4U1-LmBgDNp1VjdvU=.e9522917-6318-42c5-b5b4-4c0447c9133e@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <xTw2OFGUcy9AWewyPi8qC76fDg4U1-LmBgDNp1VjdvU=.e9522917-6318-42c5-b5b4-4c0447c9133e@github.com>
Message-ID: <Dc_EKOfmG7MrZ3TvYsilH8dMG9XW_7Q2COPuBouxoEw=.9d434ad0-a9d9-4378-af90-0d978a1f4049@github.com>

On Thu, 10 Feb 2022 14:58:46 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix release build

src/hotspot/share/gc/parallel/psOldGen.cpp line 48:

> 46: #else
> 47:   _Expand_lock(Mutex::safepoint, "PSOldGenExpand_lock", true)
> 48: #endif

As per our coding convention, this should be _expand_lock.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Thu Feb 10 17:23:42 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 10 Feb 2022 17:23:42 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v4]
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:

  lower case

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7124/files
  - new: https://git.openjdk.java.net/jdk/pull/7124/files/fa5dcce9..d5a2a9ca

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=02-03

  Stats: 11 lines in 2 files changed: 0 ins; 0 del; 11 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7124.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7124/head:pull/7124

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Thu Feb 10 17:23:46 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 10 Feb 2022 17:23:46 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v3]
In-Reply-To: <Dc_EKOfmG7MrZ3TvYsilH8dMG9XW_7Q2COPuBouxoEw=.9d434ad0-a9d9-4378-af90-0d978a1f4049@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <xTw2OFGUcy9AWewyPi8qC76fDg4U1-LmBgDNp1VjdvU=.e9522917-6318-42c5-b5b4-4c0447c9133e@github.com>
 <Dc_EKOfmG7MrZ3TvYsilH8dMG9XW_7Q2COPuBouxoEw=.9d434ad0-a9d9-4378-af90-0d978a1f4049@github.com>
Message-ID: <lCm6dfbAQpy0fAqhzLfTWlgz1ZVPkBlIhKbiNsutSLA=.4c542011-cfa8-4e92-838c-4a29e724f680@github.com>

On Thu, 10 Feb 2022 17:10:44 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix release build
>
> src/hotspot/share/gc/parallel/psOldGen.cpp line 48:
> 
>> 46: #else
>> 47:   _Expand_lock(Mutex::safepoint, "PSOldGenExpand_lock", true)
>> 48: #endif
> 
> As per our coding convention, this should be _expand_lock.

Renamed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From mdoerr at openjdk.java.net  Thu Feb 10 17:46:08 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 10 Feb 2022 17:46:08 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <kQIeMXXaN7ec1LcVHEVNHqh0v7rAhWwuUq4HpLzS1js=.8eb9906a-e068-467b-b151-218ae34db944@github.com>

On Wed, 26 Jan 2022 12:51:31 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

Hi Claes, it can get implemented similarly on PPC64: https://github.com/openjdk/jdk/pull/7430
You can integrate it if you like, but preferably after it has been reviewed.
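
For reference, the semantics described in the quoted text can be written as a plain scalar loop. This is a minimal sketch based only on that description, not the JDK source or any platform intrinsic; the class name and the `hasNegatives` helper derived from it are hypothetical.

```java
// Hypothetical scalar model of the countPositives semantics described above.
class CountPositivesSketch {
    // Returns the number of leading bytes in ba[off, off + len) that are non-negative.
    static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off;
            }
        }
        return len;
    }

    // The old query can be expressed in terms of the new one.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }
}
```

A caller that finds a count smaller than `len` can treat the first `count` bytes as ASCII and fall back to a slower path only for the remainder, which is the benefit for inputs with an ASCII prefix.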

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From psandoz at openjdk.java.net  Thu Feb 10 18:31:05 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 10 Feb 2022 18:31:05 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
Message-ID: <Nk_U5o2qYVe1pOilo9jJjGOVr61df5VHOdRpTUC4p70=.18f03ba2-6ba4-4117-abff-5f743e9209d1@github.com>

On Thu, 10 Feb 2022 15:14:44 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - minor rename
>  - address reviews

Running some tests.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From psandoz at openjdk.java.net  Thu Feb 10 18:59:05 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 10 Feb 2022 18:59:05 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
Message-ID: <Mw8wnUYFgKMIS124_zvUI4wtV_s7YHydLEwiwkOC0Fw=.72a1e123-3272-40d7-8d95-746fa3da061b@github.com>

On Thu, 10 Feb 2022 15:14:44 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - minor rename
>  - address reviews

Observing the following failures on CPUs with "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:

-XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation


TestVectorCastAVX512.java:

Failed IR Rules (1)
------------------
- Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastI2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
    - counts: Graph contains wrong number of nodes:
        Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
        Expected 1 but found 0 nodes.


TestVectorCastAVX1.java:

- Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
    - counts: Graph contains wrong number of nodes:
        Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
        Expected 1 but found 0 nodes.

- Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
    - counts: Graph contains wrong number of nodes:
        Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
        Expected 1 but found 0 nodes.
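
For readers unfamiliar with these IR rules: each "counts" entry pairs a regex with the number of matching nodes expected in the compiled graph. The sketch below only models that counting step. The dump line format used with it is a hypothetical stand-in, not actual C2 output, and count_nodes is an illustrative helper, not part of the IR framework. The node name is assumed to contain no regex metacharacters.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count how often a rule of the same shape as the one reported above matches:
// <node index>, two whitespace characters, the node name, two whitespace
// characters, then "===".
std::size_t count_nodes(const std::string& dump, const std::string& node_name) {
    assert(!node_name.empty());
    const std::regex rule("(\\d+(\\s){2}(" + node_name + ".*)+(\\s){2}===.*)");
    const auto first = std::sregex_iterator(dump.begin(), dump.end(), rule);
    const auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}
```

A dump that contains only a VectorCastI2X node, and no VectorUCastI2X node, yields a count of 0, which is the "Expected 1 but found 0 nodes" situation reported above.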

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From mgronlun at openjdk.java.net  Thu Feb 10 19:18:13 2022
From: mgronlun at openjdk.java.net (Markus Grönlund)
Date: Thu, 10 Feb 2022 19:18:13 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <d9NT33YZFZeFH0JrWvYV-4wNcTQZgIEae0REudAdUBU=.8ac42dd7-b7c5-423f-98b7-af904f0add83@github.com>

On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core file,
> because there is no space left on the device.
> Could you please review the fix?

Hi Takuya, thanks for your contribution.

src/hotspot/share/jfr/jni/jfrJavaSupport.hpp line 103:

> 101: 
> 102:   // critical
> 103:   static void abort(jstring errorMsg, TRAPS, bool dump_core=true);

Not sure this is necessary. The existing core dump logic already handles the case where a core file cannot be generated due to disk full.

test/hotspot/jtreg/runtime/jfr/TestJFRDiskFull.java line 127:

> 125:         raf.close();
> 126:     }
> 127: }

I appreciate the effort, but we can't have a test that intentionally provokes a disk full situation. Instead, the updated error message will have to be manually verified.

-------------

Changes requested by mgronlun (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7227

From jamil.j.nimeh at oracle.com  Thu Feb 10 21:03:14 2022
From: jamil.j.nimeh at oracle.com (Jamil Nimeh)
Date: Thu, 10 Feb 2022 13:03:14 -0800
Subject: Questions re: loading 512-bit constant data through ExternalAddress
 calls
Message-ID: <3b8b37ae-321b-6bcc-1df6-14f308581e72@oracle.com>

Hello all,

Sorry in advance for the long email, but I thought it might be good to 
give a little background on what I'm trying to do:

I have an intrinsic that I'm working on for the ChaCha20 block
function. I have versions of it to support different processor
capabilities, specifically SSE2+AVX, AVX2 and AVX512. The first two are
working great. The AVX512 is giving me some headaches with a couple
specific instructions.

I prototyped all of these in C using inline assembly before I got down
to playing in hotspot and for the AVX512 implementation, there are a few
places where one of the arguments for the EVEX.512 variant of vpaddd
would be literal data at a memory location. I achieved this in assembly
like this:


// state is backed by uint32[16] and keystream is uint8[256]
void cc2Ax512(uint32_t *state, uint8_t *keystream) {
    asm (
        ".data;"
        "ctrAddMaskAvx512:;"
            ".long 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0;"

        ".text;"
            // load data into zmm0/1/2/3 - that all works fine
            "vpaddd %%zmm3, %%zmm3, [ctrAddMaskAvx512];"

            // complete the rest of the function

        : // No output registers
        : "m"(state), "m"(keystream)
        : "rbx", "rdx", "ecx"
    );
}

I didn't put the whole routine in there for brevity, but the data loads 
from that ctrAddMaskAvx512 address and adds properly to the values 
already in zmm3 at the time of the vpaddd.

When it came time to try the equivalent approach in hotspot, I looked
around for anything else that might be doing this and I found some
examples of constant values being created in functions and passed into
what look like ExternalAddress calls (or are they constructors? I
haven't gone looking in-depth yet on that one).

I used the ghash_shufflemask_addr() as a template for what I thought I 
was supposed to do:

  * in stubGenerator_x86_64.cpp I created a function
    chacha20_ctradd_avx512() that is basically 8 emit_data64() calls
    with the 512 bits of data I wanted to reference. The ghash function
    I was using as a reference only writes 128-bits, but otherwise my
    function is put together the same way.
  * further down in stubGenerator_x86_64.cpp I also assign this function
    to a StubRoutines::x86 field,
    "StubRoutines::x86::_chacha20_counter_addmask_avx512 =
    chacha20_ctradd_avx512();"
  * in stubRoutines_x86.cpp/hpp I define
    "_chacha20_counter_addmask_avx512" and create a method
    chacha20_counter_addmask_avx512() that simply returns
    _chacha20_counter_addmask_avx512. All of this follows the
    ghash_shufflemask_addr approach.
  * Finally when I wish to use it, say for a vpaddd call, it would look
    something like this:
      o __ vpaddd(zmm_dVec, zmm_dVec,
        ExternalAddress(StubRoutines::x86::chacha20_counter_addmask_avx512()),
        Assembler::AVX_512bit, rax);
      o By comparison, the ghash approach I was using as a template is
        used with movdqu, so it's just a 128-bit move, but the source is
        an ExternalAddress similar to what I'm doing so I thought the
        technique would work more or less for EVEX variants that can
        have memory source addresses.
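
As a sanity check on the constant data itself, the sketch below shows how the sixteen 32-bit lanes from the .long directive in the C prototype would pack into eight 64-bit words on little-endian x86_64, low lane first. It is an illustration only: pack_lanes and lane_of are hypothetical helpers, not hotspot functions, and the sketch says nothing about how emit_data64() or ExternalAddress behave.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// The sixteen counter-increment lanes from the .long directive in the prototype.
const std::array<std::uint32_t, 16> kCtrAddMask = {
    0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0};

// Pack consecutive 32-bit lanes into 64-bit words, low lane in the low half.
// On little-endian x86_64 this reproduces the byte layout of the sixteen lanes.
std::array<std::uint64_t, 8> pack_lanes(const std::array<std::uint32_t, 16>& lanes) {
    std::array<std::uint64_t, 8> words{};
    for (std::size_t i = 0; i < words.size(); ++i) {
        words[i] = static_cast<std::uint64_t>(lanes[2 * i]) |
                   (static_cast<std::uint64_t>(lanes[2 * i + 1]) << 32);
    }
    return words;
}

// Read a 32-bit lane back out of the packed words (round-trip check).
std::uint32_t lane_of(const std::array<std::uint64_t, 8>& words, std::size_t lane) {
    assert(lane < 16);
    return static_cast<std::uint32_t>(words[lane / 2] >> (32 * (lane % 2)));
}
```

Under these assumptions the eight 64-bit values are 0, 0, 1, 0, 2, 0, 3, 0, so a word order that differs from this would place the increments in the wrong lanes.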

This all compiles and for some weird reason I saw it actually work
correctly one time. But most of the time the output after the add call
is completely unrecognizable, as if it's adding data from some oddball
address. If I comment out that vpaddd statement, the data in the
register is exactly what I would expect it to be before the add takes
place. So I'm fairly confident that the statements before that
particular add are correct.

Here's where it gets weird. I have a similar method for my AVX2 version
of the intrinsic. In that case, it's only doing 4 emit_data64 calls,
and it passes it the same way into a vpaddd, but of course the
vector_len is Assembler::AVX_256bit. It works perfectly every time. I
don't have a good sense of why it's always working there but not with my
512-bit counterpart.

I could definitely use some hotspot insights. This approach in general
was my best guess at loading/using 512-bit literals as source arguments
but I'm definitely open to alternatives. I have also tried the built-in
generate_vector_custom_i32() function since that would allow me to do
away with my own custom functions, so long as it doesn't hurt from a
performance standpoint. It seems to fail in the same way that my own
functions do.

I am fairly new to assembly and these intrinsics so if you have
suggestions/comments bear in mind that I don't eat/sleep/breathe hotspot
like I would imagine some of the folks on this alias do. :) But at
least from a functional perspective, I know once I can get these literal
512-bit values working the rest of the intrinsic function should fall
into place because my C/assembly prototype works like a champ for all
vector length variants.

Definitely open to your insights/comments,

Thanks,

--Jamil

From eastig at amazon.co.uk  Thu Feb 10 23:02:08 2022
From: eastig at amazon.co.uk (Astigeevich, Evgeny)
Date: Thu, 10 Feb 2022 23:02:08 +0000
Subject: RFC: AArch64: Set Segmented CodeCache default size to 127M
Message-ID: <64AB1C1E-4151-4979-BF15-CC71D00E98DB@amazon.com>

Hello,

We'd like to discuss a proposal for setting TieredCompilation Segmented CodeCache default size to 127M on AArch64 (https://bugs.openjdk.java.net/browse/JDK-8280150).

The current default size of TieredCompilation CodeCache is 240M: 116M "non-profiled" segment + 116M "profiled" segment + 8M "non-nmethods" segment. AArch64 ISA has direct calls and jumps range limited to 128M. The C1/C2 compilers generate far jumps, calls and trampolines to overcome the limitation of direct jumps/calls. They use MacroAssembler::far_branches which compares ReservedCodeCacheSize with the direct jumps/calls range. With 240M CodeCache JIT has to use far jumps/trampolines. Such far jumps/trampolines result in performance and code size overhead.

Our observations [1] suggest most applications running on AArch64 platforms have hot code not exceeding 128M.

AArch64 has a default ReservedCodeCacheSize of 48M. For tiered compilation the value is multiplied by 5 getting it to 240M. We experimented with CodeCache configuration: 48M "non-profiled" segment + 48M "profiled" segment + 8M "non-nmethods" segment. We ran SpecJbb2015, DaCapo at f480064 (https://github.com/dacapobench/dacapobench/tree/dev-chopin), Renaissance 0.14, and internal services.

We did not see any statistically significant regressions. SpecJbb improved max-jOPS by +1.68% and critical-jOPS by +1.34%. For DaCapo, eclipse improved by 3.57%, tomcat by 1.45% and tradesoap by 3.03%. Only two Renaissance benchmarks had statistically significant results: dotty (+9.0%)  and finagle-http (+3.9%). Others had changes which were comparable with the coefficient of variation. All benchmarks had significant decreases in max use of the non-profiled and profiled segments (see data below).

To mitigate risks of 104M not being enough we'd like to change the default size of TieredCompilation CodeCache to 127M (which is just below the size where the JIT would generate far jumps and trampolines): 60M "non-profiled" segment + 60M "profiled" segment + 7M "non-nmethods" segment. We did partial runs with 127M CodeCache. Their results were similar to the 104M configuration.
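
The sizing argument can be sketched as follows. This is a simplified model, not the hotspot code: needs_far_branches is a hypothetical stand-in for the comparison of ReservedCodeCacheSize with the direct branch range, and the strict "greater than" is an assumption. The segment sizes are the ones quoted in this proposal.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t M = 1024 * 1024;
// Direct call/jump range on AArch64 as stated in the proposal.
constexpr std::size_t kBranchRange = 128 * M;

struct Segments {
    std::size_t non_profiled;
    std::size_t profiled;
    std::size_t non_nmethods;
};

constexpr Segments kCurrentDefault = {116 * M, 116 * M, 8 * M};  // 240M
constexpr Segments kExperiment     = {48 * M, 48 * M, 8 * M};    // 104M
constexpr Segments kProposed       = {60 * M, 60 * M, 7 * M};    // 127M

// Total reserved size of a segmented code cache.
constexpr std::size_t total(const Segments& s) {
    return s.non_profiled + s.profiled + s.non_nmethods;
}

// Assumed model: far branches are needed once the reserved code cache
// no longer fits within the direct branch range.
bool needs_far_branches(std::size_t reserved_code_cache_size) {
    assert(reserved_code_cache_size > 0);
    return reserved_code_cache_size > kBranchRange;
}
```

In this model only the current 240M default requires far branches. The 104M and 127M configurations both stay within the 128M range.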

Average maximum used memory (KB) per segment (we checked that the numbers of compiled methods were similar in both cases):

NPS=non-profiled segment
PS=profiled segment
NNS=non-nmethods segment

SpecJbb
+----------+---------+--------+---------+--------+--------+----------+---------+----------+
| 116M NPS | 116M PS | 8M NNS | 48M NPS | 48M PS | 8M NNS | diff NPS | diff PS | diff NNS |
+----------+---------+--------+---------+--------+--------+----------+---------+----------+
|    12491 |   13968 |   4274 |   10649 |  12276 |   4234 | -14.7%   | -12.1%  | -0.9%    |
+----------+---------+--------+---------+--------+--------+----------+---------+----------+
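
The "diff" columns in these tables appear to be the relative change from the 116M/116M/8M configuration to the 48M/48M/8M configuration. A minimal sketch of that calculation, assuming this reading (percent_change is an illustrative helper, not taken from any benchmark tooling):

```cpp
#include <cassert>
#include <cmath>

// Relative change, in percent, from a baseline measurement to a new one.
double percent_change(double baseline, double value) {
    assert(baseline > 0.0);
    return (value - baseline) / baseline * 100.0;
}
```

For the SpecJbb row, percent_change(12491, 10649) is about -14.7, percent_change(13968, 12276) is about -12.1, and percent_change(4274, 4234) is about -0.9, which agrees with the table.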

DaCapo
+------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+
| benchmark  | 116M NPS | 116M PS | 8M NNS | 48M NPS | 48M PS | 8M NNS | diff NPS | diff PS | diff NNS |
+------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+
| avrora     |    2301  |   6324  |  4167  |   1887  |  5049  |  4080  | -18.00%  | -20.20% | -2.10%   |
| batik      |    6108  |   5301  |  4128  |   4686  |  4289  |  4114  | -23.30%  | -19.10% | -0.30%   |
| biojava    |    2018  |   5907  |  4047  |   1703  |  5364  |  4026  | -15.60%  | -9.20%  | -0.50%   |
| eclipse    |   30862  |  26824  |  4275  |  27314  | 24330  |  4180  | -11.50%  | -9.30%  | -2.20%   |
| jme        |    1567  |   5987  |  3502  |   1315  |  5205  |  3491  | -16.10%  | -13.10% | -0.30%   |
| lusearch   |    5424  |   9145  |  4201  |   4699  |  7147  |  4100  | -13.40%  | -21.90% | -2.40%   |
| pmd        |   12011  |  14438  |  4232  |  10701  | 12456  |  4140  | -10.90%  | -13.70% | -2.20%   |
| sunflow    |    1707  |   4341  |  4082  |   1220  |  3174  |  4040  | -28.60%  | -26.90% | -1.00%   |
| tomcat     |   15228  |  23595  |  4292  |  13519  | 20686  |  4187  | -11.20%  | -12.30% | -2.50%   |
| graphchi   |    1243  |   5238  |  4009  |   1063  |  4375  |  3998  | -14.50%  | -16.50% | -0.30%   |
| xalan      |    5270  |   8363  |  4191  |   4784  |  6643  |  4100  | -9.20%   | -20.60% | -2.20%   |
| fop        |   11597  |  20814  |  4336  |  10361  | 18485  |  4256  | -10.70%  | -11.20% | -1.80%   |
| luindex    |    4013  |   5531  |  3697  |   3083  |  4384  |  3507  | -23.20%  | -20.70% | -5.20%   |
| zxing      |    4577  |   7267  |  4255  |   4044  |  5820  |  4164  | -11.60%  | -19.90% | -2.10%   |
| tradebeans |   10313  |  26983  |  4603  |   9210  | 24954  |  4522  | -10.70%  | -7.50%  | -1.80%   |
| tradesoap  |   16939  |  35276  |  4649  |  15245  | 30888  |  4549  | -10.00%  | -12.40% | -2.10%   |
+------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+

Renaissance
+------------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+
|    benchmark     | 116M NPS | 116M PS | 8M NNS | 48M NPS | 48M PS | 8M NNS | diff NPS | diff PS | diff NNS |
+------------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+
| akka-uct         |     4053 |    9615 |   3661 |    3001 |   8381 |   3559 | -26.00%  | -12.80% | -2.80%   |
| als              |    20732 |   39367 |   4554 |   18914 |  32400 |   4464 | -8.80%   | -17.70% | -2.00%   |
| chi-square       |     7922 |   23568 |   3828 |    7160 |  20603 |   3759 | -9.60%   | -12.60% | -1.80%   |
| dec-tree         |    23938 |   55512 |   4026 |   21857 |  36866 |   3946 | -8.70%   | -33.60% | -2.00%   |
| dotty            |    42405 |   40963 |   3712 |   37997 |  32770 |   3621 | -10.40%  | -20.00% | -2.50%   |
| finagle-chirper  |    21150 |   19833 |   3795 |   18652 |  17479 |   3693 | -11.80%  | -11.90% | -2.70%   |
| finagle-http     |    11950 |   19553 |   3778 |   10675 |  17234 |   3709 | -10.70%  | -11.90% | -1.80%   |
| fj-kmeans        |      960 |    4756 |   3504 |     882 |   4437 |   3484 | -8.10%   | -6.70%  | -0.60%   |
| future-genetic   |     1760 |    5470 |   3526 |    1466 |   4449 |   3497 | -16.70%  | -18.70% | -0.80%   |
| gauss-mix        |    11910 |   21406 |   4459 |   10675 |  18741 |   4382 | -10.40%  | -12.40% | -1.70%   |
| log-regression   |    25230 |   42802 |   4108 |   22791 |  34542 |   3989 | -9.70%   | -19.30% | -2.90%   |
| mnemonics        |     1094 |    3914 |   3501 |    1010 |   3669 |   3480 | -7.70%   | -6.30%  | -0.60%   |
| movie-lens       |    20571 |   23472 |   4495 |   18500 |  20728 |   4424 | -10.10%  | -11.70% | -1.60%   |
| naive-bayes      |    24305 |   45967 |   4030 |   22124 |  35135 |   3929 | -9.00%   | -23.60% | -2.50%   |
| page-rank        |     9386 |   24226 |   3817 |    8554 |  22081 |   3769 | -8.90%   | -8.90%  | -1.30%   |
| par-mnemonics    |     1217 |    4318 |   3501 |    1128 |   4098 |   3477 | -7.30%   | -5.10%  | -0.70%   |
| philosophers     |     2647 |    5765 |   3571 |    2146 |   4293 |   3506 | -18.90%  | -25.50% | -1.80%   |
| reactors         |     2663 |    5266 |   3632 |    2278 |   4321 |   3513 | -14.50%  | -17.90% | -3.30%   |
| rx-scrabble      |     2511 |    6721 |   3535 |    2131 |   5037 |   3506 | -15.10%  | -25.10% | -0.80%   |
| scala-doku       |     2106 |    6408 |   3522 |    1775 |   4744 |   3500 | -15.70%  | -26.00% | -0.60%   |
| scala-kmeans     |     1104 |    4634 |   3497 |    1002 |   4345 |   3481 | -9.20%   | -6.20%  | -0.50%   |
| scala-stm-bench7 |     3492 |    6611 |   3601 |    3158 |   5302 |   3509 | -9.60%   | -19.80% | -2.60%   |
| scrabble         |     1816 |    6046 |   3546 |    1460 |   4902 |   3496 | -19.60%  | -18.90% | -1.40%   |
+------------------+----------+---------+--------+---------+--------+--------+----------+---------+----------+

[1] CodeCache usage data from:
- Latest versions of SpecJbb, DaCapo and Renaissance benchmarks.
- An internal service with 15000+ compiled Java methods running without compilation issues with 64M CodeCache (TieredCompilation off) and with 127M segmented CodeCache.
- A recommendation to use 64M CodeCache (TieredCompilation off) to improve performance (https://github.com/aws/aws-graviton-getting-started/blob/main/java.md).
- IDEs like IntelliJ and CLion can use more than 130M, but they don't rely on the default values.




From dholmes at openjdk.java.net  Thu Feb 10 23:24:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 23:24:11 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
In-Reply-To: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
Message-ID: <gNCEsWHLoCaCXEcBoDBdbYwkOzaU0kfg6ETTiY3B6eo=.03139b10-04c8-4cfb-ba58-122f95ec9b78@github.com>

On Thu, 10 Feb 2022 15:39:53 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.

Looks fine. The proof of these changes is in compiling the files - how did you test the non-gc-test changes?

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7426

From dholmes at openjdk.java.net  Thu Feb 10 23:33:09 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 10 Feb 2022 23:33:09 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
In-Reply-To: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
Message-ID: <bqegymgOsLLZC3MT1x8Jbj0nvCJUoAYz53h8_v1JD40=.ba7132c8-7bc2-4edf-b2cf-217a98a8b90e@github.com>

On Thu, 10 Feb 2022 15:39:53 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.

Forgot to mention copyright years need updating before integrating! Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7426

From yyang at openjdk.java.net  Fri Feb 11 03:35:10 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Fri, 11 Feb 2022 03:35:10 GMT
Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes
 [v2]
In-Reply-To: <s6FyxHfrGVFyYkOPhFJhDDUkQoeoQCTvBZrD4kNjpLg=.8ab61d97-f4de-453a-a6d9-b50978975ae1@github.com>
References: <EjGKmvDvj3kTRu5Pb-Tb826DkSUl5fInmV7aWZ_XS7I=.1626f32a-567b-4847-ab3f-9c9986be9ef0@github.com>
 <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com>
 <s6FyxHfrGVFyYkOPhFJhDDUkQoeoQCTvBZrD4kNjpLg=.8ab61d97-f4de-453a-a6d9-b50978975ae1@github.com>
Message-ID: <XvmcTjewg772xycbaP4--4YQl2cv07sBQgBiBPkIDsk=.fa096f89-bd80-40c4-8b0a-9968676d52d9@github.com>

On Tue, 18 Jan 2022 02:59:11 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>> 
>>   8275775 Add VM.classes to print details of all classes
>
> src/hotspot/share/oops/instanceKlass.cpp line 2069:
> 
>> 2067:   ResourceMark rm;
>> 2068:   _st->print("%-18s", "KlassAddr");
>> 2069:   _st->print("  ");
> 
> Can't you just print the two spaces in the previous line:
> 
> _st->print("%-18s  ", "KlassAddr");
> 
> and save all the additional print calls. This applies throughout where you have "  ".

@dholmes-ora David, Can you please take a look at the latest version? I've addressed all problems you pointed out.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7105

From kvn at openjdk.java.net  Fri Feb 11 04:17:10 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 11 Feb 2022 04:17:10 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v8]
In-Reply-To: <6_ddanyI-FFaerYCGBHYYGlJQZpUvypaIIoPOq6S3wM=.b77c72f2-e29a-4d31-826c-f42c737978d1@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <6_ddanyI-FFaerYCGBHYYGlJQZpUvypaIIoPOq6S3wM=.b77c72f2-e29a-4d31-826c-f42c737978d1@github.com>
Message-ID: <Ukn6-pc_SCy86-welsyGs-ufyfFpJRARiJ8A86B8dIM=.8df6af5e-fab9-4d0f-b8b2-3992e7d5d4e4@github.com>

On Thu, 10 Feb 2022 08:46:50 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fixes to documentation requested by reviewers

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7110

From kvn at openjdk.java.net  Fri Feb 11 05:07:09 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 11 Feb 2022 05:07:09 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <4k5B_eeCIPWe4rTYueR7n0lixNRMFzItoV9U7lCfIbM=.ada0c192-7ca6-4d4b-bdcb-a912e7867aa5@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
 <fPNQA8wA48h1_aQm_llWCYTF5gXOCih34TG0CV8RXPQ=.c074b674-eee9-4b23-841e-31009f2c266f@github.com>
 <5tnuK3pwhbOWk8dJlEkELJoxEFhmDyZFwpG5DfkozQ4=.b3cad787-bb0d-4716-91ed-079669da8eb0@github.com>
 <4k5B_eeCIPWe4rTYueR7n0lixNRMFzItoV9U7lCfIbM=.ada0c192-7ca6-4d4b-bdcb-a912e7867aa5@github.com>
Message-ID: <8jJY1_Q3nc_G2FTOG8A76nIXXWk1nE65wNN-8czawEI=.378b4e8b-cc86-4a0d-8506-8211f4d1bcc6@github.com>

On Wed, 9 Feb 2022 18:55:06 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Dunno, maybe? I see that a lot of other "small" options are `intx`, and a change like that would proliferate to all architectures that set `OptoLoopAlignment` as their `product_pd`. It also raises the question of whether `CodeEntryAlignment` should also be `int`. I'd rather keep this patch small, to be honest.
>
> Your comment makes sense.  Thanks.

Yes, the flag type cleanup should be a separate issue.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From kvn at openjdk.java.net  Fri Feb 11 05:07:09 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 11 Feb 2022 05:07:09 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <8eS0mvN89K2bpyqyXyhKajH5mWjuzuCRQP7zlDK3E5g=.0e48304f-c9f6-45c7-bfe1-44812dd4a57f@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

I agree with changes.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7388

From kbarrett at openjdk.java.net  Fri Feb 11 05:43:45 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Fri, 11 Feb 2022 05:43:45 GMT
Subject: RFR: 8281626: NonblockingQueue should use nullptr
Message-ID: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>

Please review this change to use nullptr instead of NULL throughout the NonblockingQueue class.

Testing:
mach5 tier1

-------------

Commit messages:
 - use nullptr throughout

Changes: https://git.openjdk.java.net/jdk/pull/7438/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7438&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281626
  Stats: 39 lines in 2 files changed: 0 ins; 0 del; 39 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7438.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7438/head:pull/7438

PR: https://git.openjdk.java.net/jdk/pull/7438

From shade at openjdk.java.net  Fri Feb 11 06:42:09 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 11 Feb 2022 06:42:09 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <8eS0mvN89K2bpyqyXyhKajH5mWjuzuCRQP7zlDK3E5g=.0e48304f-c9f6-45c7-bfe1-44812dd4a57f@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
 <8eS0mvN89K2bpyqyXyhKajH5mWjuzuCRQP7zlDK3E5g=.0e48304f-c9f6-45c7-bfe1-44812dd4a57f@github.com>
Message-ID: <OrXNyH4o_1tIGqMWwJDNSsZrydE4GoNOON9VmRuzYWg=.86a5b264-ff40-4bda-ae03-8a952d4d55c6@github.com>

On Fri, 11 Feb 2022 05:03:54 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

> I agree with changes.

Thank you, I'll wait for another reviewer and then integrate.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From shade at openjdk.java.net  Fri Feb 11 06:46:07 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 11 Feb 2022 06:46:07 GMT
Subject: RFR: 8281626: NonblockingQueue should use nullptr
In-Reply-To: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
References: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
Message-ID: <gep_Euu696bgDB2tU5vqRAhPCgaC8QU0NUTgC1l-fd4=.54a5111e-2b9a-441f-b1cb-318ecee73fc8@github.com>

On Fri, 11 Feb 2022 05:37:27 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to use nullptr instead of NULL throughout the NonblockingQueue class.
> 
> Testing:
> mach5 tier1

Looks fine!

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7438

From dholmes at openjdk.java.net  Fri Feb 11 07:01:10 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Feb 2022 07:01:10 GMT
Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes
 [v6]
In-Reply-To: <jlZbcfBmSScpYs16RHqMU-lH2wqZvX8FySG6slcGnn8=.539efa97-68e9-438a-80d8-c2d30bba6279@github.com>
References: <EjGKmvDvj3kTRu5Pb-Tb826DkSUl5fInmV7aWZ_XS7I=.1626f32a-567b-4847-ab3f-9c9986be9ef0@github.com>
 <jlZbcfBmSScpYs16RHqMU-lH2wqZvX8FySG6slcGnn8=.539efa97-68e9-438a-80d8-c2d30bba6279@github.com>
Message-ID: <Vent0pwplawz50noKUXLF5PVZ7xYMQQel8byoqHhpkg=.c0067abc-c751-4922-a3bc-3e4a9138b8f7@github.com>

On Thu, 27 Jan 2022 09:17:09 GMT, Yi Yang <yyang at openjdk.org> wrote:

>> Add VM.classes to print details of all classes, output looks like:
>> 
>> 1. jcmd VM.classes
>> 
>> KlassAddr Size State Flags LoaderName ClassName
>> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400
>> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000
>> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00
>> ...
>> 
>> 2. jcmd VM.classes verbose
>> 
>> KlassAddr Size State Flags LoaderName ClassName
>> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400
>> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400}
>>  - instance size: 2
>>  - klass size: 62
>>  - access: final synchronized
>>  - state: inited
>>  - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400'
>>  - super: 'java/lang/Object'
>>  - sub:
>>  - arrays: NULL
>>  - methods: Array<T>(0x00007f620841f210)
>>  - method ordering: Array<T>(0x0000000800a7e5a8)
>>  - default_methods: Array<T>(0x0000000000000000)
>>  - local interfaces: Array<T>(0x00000008005af748)
>>  - trans. interfaces: Array<T>(0x00000008005af748)
>>  - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380
>>  - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder
>>  - source file: 'LambdaForm$MH'
>>  - class annotations: Array<T>(0x0000000000000000)
>>  - class type annotations: Array<T>(0x0000000000000000)
>>  - field annotations: Array<T>(0x0000000000000000)
>>  - field type annotations: Array<T>(0x0000000000000000)
>>  - inner classes: Array<T>(0x00000008005af6d8)
>>  - nest members: Array<T>(0x00000008005af6d8)
>>  - permitted subclasses: Array<T>(0x00000008005af6d8)
>>  - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400'
>>  - vtable length 5 (start addr: 0x0000000800c0b5b8)
>>  - itable length 2 (start addr: 0x0000000800c0b5e0)
>>  - ---- static fields (1 words):
>>  - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112
>>  - ---- non-static fields (0 words):
>>  - non-static oop maps:
>> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000
>> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000}
>>  - instance size: 2
>>  - klass size: 62
>>  - access: final synchronized
>>  - state: inited
>>  - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000'
>>  - super: 'java/lang/Object'
>>  - sub:
>>  - arrays: NULL
>>  - methods: Array<T>(0x00007f620841ea68)
>>  - method ordering: Array<T>(0x0000000800a7e5a8)
>>  - default_methods: Array<T>(0x0000000000000000)
>>  - local interfaces: Array<T>(0x00000008005af748)
>>  - trans. interfaces: Array<T>(0x00000008005af748)
>>  - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0
>>  - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder
>>  - source file: 'LambdaForm$DMH'
>>  - class annotations: Array<T>(0x0000000000000000)
>>  - class type annotations: Array<T>(0x0000000000000000)
>>  - field annotations: Array<T>(0x0000000000000000)
>>  - field type annotations: Array<T>(0x0000000000000000)
>>  - inner classes: Array<T>(0x00000008005af6d8)
>>  - nest members: Array<T>(0x00000008005af6d8)
>>  - permitted subclasses: Array<T>(0x00000008005af6d8)
>>  - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000'
>>  - vtable length 5 (start addr: 0x0000000800c0b1b8)
>>  - itable length 2 (start addr: 0x0000000800c0b1e0)
>>  - ---- static fields (1 words):
>>  - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112
>>  - ---- non-static fields (0 words):
>> ...
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix

Hi Yi,

I had been expecting to see further updates as not all issues seem resolved. I have a few further typos and nits below. But I'd like to see someone from serviceability actually approve this.

Thanks,
David

src/hotspot/share/services/diagnosticCommand.cpp line 962:

> 960:                                      DCmdWithParser(output, heap),
> 961:   _verbose("-verbose",
> 962:            "Dump the detail content of Java class. "

s/detail/detailed/
s/of/of a/

src/hotspot/share/services/diagnosticCommand.cpp line 964:

> 962:            "Dump the detail content of Java class. "
> 963:            "Some classes are annotated with flags: "
> 964:            "F = has finializer method, "

typo finializer - but should be finalize

Is this actually only present for "non-trivial finalize" method?

src/hotspot/share/services/diagnosticCommand.cpp line 966:

> 964:            "F = has finializer method, "
> 965:            "f = has final method, "
> 966:            "V = has vanilla constructor, "

What is a vanilla constructor? There is no such term in JLS.

src/hotspot/share/services/diagnosticCommand.cpp line 968:

> 966:            "V = has vanilla constructor, "
> 967:            "W = methods rewritten, "
> 968:            "C = marked with contended annotation, "

@contended

-------------

PR: https://git.openjdk.java.net/jdk/pull/7105

From dholmes at openjdk.java.net  Fri Feb 11 07:01:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Feb 2022 07:01:11 GMT
Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes
 [v6]
In-Reply-To: <r6pEU07Q_5yHU6KXrm2dg4IE_9WBq2eUHvT44AiVT1Y=.04132b01-85c8-419a-bf65-0bacb24244ae@github.com>
References: <EjGKmvDvj3kTRu5Pb-Tb826DkSUl5fInmV7aWZ_XS7I=.1626f32a-567b-4847-ab3f-9c9986be9ef0@github.com>
 <jlZbcfBmSScpYs16RHqMU-lH2wqZvX8FySG6slcGnn8=.539efa97-68e9-438a-80d8-c2d30bba6279@github.com>
 <r6pEU07Q_5yHU6KXrm2dg4IE_9WBq2eUHvT44AiVT1Y=.04132b01-85c8-419a-bf65-0bacb24244ae@github.com>
Message-ID: <pe-7AhQ0tk_YOADyhexj8ixw5CUb9kqbNDoHtRH95-k=.9a9a362b-02d7-444c-a3c5-d86cf7abfe46@github.com>

On Thu, 27 Jan 2022 16:00:54 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix
>
> src/hotspot/share/oops/instanceKlass.cpp line 2081:
> 
>> 2079:   _st->print(INTPTR_FORMAT "  ", p2i(k));
>> 2080:   // klass size
>> 2081:   _st->print("%-4d  ", k->size());
> 
> Should be `%4d` so that the numbers are aligned correctly.

This issue still seems to be outstanding.
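
To illustrate the alignment point, here is a minimal sketch in plain C++ that uses `snprintf` rather than HotSpot's `outputStream`. The helper names `left_justified` and `right_justified` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// "%-4d" pads on the right, so digits start at the left edge of the column.
std::string left_justified(int size) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "%-4d", size);
  assert(n >= 0 && n < static_cast<int>(sizeof(buf)));
  return std::string(buf);
}

// "%4d" pads on the left, so the last digits line up in a column of numbers.
std::string right_justified(int size) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "%4d", size);
  assert(n >= 0 && n < static_cast<int>(sizeof(buf)));
  return std::string(buf);
}
```

With a klass size of 62, the first helper yields `62` followed by two spaces and the second yields two spaces followed by `62`, so only the second keeps the units digits aligned when sizes have different widths.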

-------------

PR: https://git.openjdk.java.net/jdk/pull/7105

From dholmes at openjdk.java.net  Fri Feb 11 07:10:09 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Feb 2022 07:10:09 GMT
Subject: RFR: 8281626: NonblockingQueue should use nullptr
In-Reply-To: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
References: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
Message-ID: <4ErT0Fu5rDCnLIq-LfhqnvVMV6uALY8wLUisGI7gN-E=.e92008cf-4b9b-443b-b994-92a3f4dfb03c@github.com>

On Fri, 11 Feb 2022 05:37:27 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to use nullptr instead of NULL throughout the NonblockingQueue class.
> 
> Testing:
> mach5 tier1

Looks good and trivial.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7438

From duke at openjdk.java.net  Fri Feb 11 08:02:11 2022
From: duke at openjdk.java.net (duke)
Date: Fri, 11 Feb 2022 08:02:11 GMT
Subject: Withdrawn: 8276618: Pad cacheline for Thread::_rcu_counter
In-Reply-To: <6kHhrYgTQ2_ST7TG7H0Syf6_QR8OW4qTc1KGIRJMhWE=.e29aee68-ca4e-46b0-a930-fc38e5176ca9@github.com>
References: <6kHhrYgTQ2_ST7TG7H0Syf6_QR8OW4qTc1KGIRJMhWE=.e29aee68-ca4e-46b0-a930-fc38e5176ca9@github.com>
Message-ID: <NwpV8ADJfa0LbUBwqdYccP2SQRuV89SUUM1QyY3lmhY=.e1c33065-0031-47ca-82c6-93041d1d047e@github.com>

On Thu, 4 Nov 2021 05:09:48 GMT, Hamlin Li <mli at openjdk.org> wrote:

> Currently, Thread::_rcu_counter is not padded to a cache line; it should be beneficial to do so.
> 
> The initial specjbb test shows about a 10.5% improvement in critical and a 0.7% improvement in max in specjbb2015.
> 
> 
> 
> ========= test result (1st round) ==========
> rcu		base
> 45096		38980
> 41741		41468
> 42349		41053
> 44485		42030
> 47103		39915
> 43864		36004
> 
> ==== average ====
> 44106.33333		39908.33333
> 
> ==== improvement ====
> 10.5%
> 
> ========= test result (2nd round) ==========
> Second round of run includes 3 types: 
> 1. pad gc data & pad rcu
> 2. pad rcu only
> 3. base
> 
> Although the improvement is smaller than in the previous round (10%), there is still about a 3~4% improvement.
> 
> gc data & rcu	rcu	base
> 41284	41860	37099
> 42296	42166	44692
> 42810	43423	41801
> 43492	45603	40274
> 43808	40641	39627
> 43029	40242	39793
> 42543	41662	41544
> 43420	42702	37991
> 44212	43354	40319
> 42692	43442	45264
> 44773	44577	44213
> 40835	41870	42008
> 44282	44167	42527
> 
> ==== average ====
> 43036.61538	42746.84615	41319.38462
> 
> ==== improvement ====
> gc data + rcu / base: 4.156%
> rcu / base: 3.45%
> 
> 
> 
> 
> ========= configuration and environment ==========
> specjbb arguments:
>   GROUP_COUNT=4
>   TI_JVM_COUNT=1
> 
>   SPEC_OPTS_C="-Dspecjbb.group.count=$GROUP_COUNT -Dspecjbb.txi.pergroup.count=$TI_JVM_COUNT"
>   SPEC_OPTS_TI=""
>   SPEC_OPTS_BE=""
> 
>   JAVA_OPTS_C="-server -Xms2g -Xmx2g -XX:+UseParallelGC"
>   JAVA_OPTS_TI="-server -Xms2g -Xmx2g -XX:+UseParallelGC"
>   JAVA_OPTS_BE="-server -XX:+UseG1GC -Xms32g -Xmx32g"
> 
>   MODE_ARGS_C="-ikv"
>   MODE_ARGS_TI="-ikv"
>   MODE_ARGS_BE="-ikv"
> 
>   NUM_OF_RUNS=1
> 
> HW:
>   Architecture:        x86_64
>   CPU op-mode(s):      32-bit, 64-bit
>   Byte Order:          Little Endian
>   CPU(s):              224
>   On-line CPU(s) list: 0-223
>   Thread(s) per core:  2
>   Core(s) per socket:  28
>   Socket(s):           4
>   NUMA node(s):        4
>   Vendor ID:           GenuineIntel
>   CPU family:          6
>   Model:               85
>   Model name:          Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz
>   Stepping:            4
>   CPU MHz:             1001.925
>   CPU max MHz:         2101.0000
>   CPU min MHz:         1000.0000
>   BogoMIPS:            4200.00
>   Virtualization:      VT-x
>   L1d cache:           32K
>   L1i cache:           32K
>   L2 cache:            1024K
>   L3 cache:            39424K
>   NUMA node0 CPU(s):   0-27,112-139
>   NUMA node1 CPU(s):   28-55,140-167
>   NUMA node2 CPU(s):   56-83,168-195
>   NUMA node3 CPU(s):   84-111,196-223
> 
>               total        used        free      shared  buff/cache   available
> Mem:           3.0T        3.8G        2.9T         18M         25G        2.9T
> Swap:           99G          0B         99G

This pull request has been closed without being integrated.
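
For readers unfamiliar with the idea behind the withdrawn change, the following is a generic sketch of cache-line padding in standard C++. It is not the HotSpot patch: the struct names are hypothetical, and the 64-byte line size is an assumption that matches typical x86_64 parts but is not guaranteed.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache line size in bytes (an assumption, not queried from hardware).
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical layout without padding: the frequently written counter can
// share a cache line with a neighbouring field, which invites false sharing.
struct UnpaddedThreadData {
  std::atomic<std::uint64_t> rcu_counter;
  std::uint64_t other_hot_field;
};

// Hypothetical padded layout: each field starts on its own cache line.
struct PaddedThreadData {
  alignas(kCacheLineSize) std::atomic<std::uint64_t> rcu_counter;
  alignas(kCacheLineSize) std::uint64_t other_hot_field;
};

// The padded struct occupies two full lines and is itself line-aligned.
static_assert(alignof(PaddedThreadData) == kCacheLineSize, "line-aligned");
static_assert(sizeof(PaddedThreadData) == 2 * kCacheLineSize, "two lines");
```

The trade-off is memory: the padded layout spends two cache lines per instance where the unpadded one needs far less.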

-------------

PR: https://git.openjdk.java.net/jdk/pull/6246

From vladimir.x.ivanov at oracle.com  Thu Feb 10 19:29:45 2022
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 10 Feb 2022 22:29:45 +0300
Subject: RFC : Approach to handle Allocation Merges in C2 Scalar
 Replacement
In-Reply-To: <BY5PR21MB1473BE17FE1EDBD3EF223DF99A2E9@BY5PR21MB1473.namprd21.prod.outlook.com>
References: <BY5PR21MB1473BE17FE1EDBD3EF223DF99A2E9@BY5PR21MB1473.namprd21.prod.outlook.com>
Message-ID: <919984fa-fae7-944a-ca5d-eebd3b516f52@oracle.com>

(BCCing hotspot-dev and moving the discussion to hotspot-compiler-dev.)

Hi Cesar,

Thanks for looking into enhancing EA.

Overall, the proposal looks reasonable.

I suggest looking more closely at split_unique_types().
It introduces a dedicated class of alias categories for the fields of the
allocation being eliminated and clones the memory graph. I don't see why it
shouldn't work for multiple allocations.

Moreover, split_unique_types() will break if you start optimizing 
multiple allocations at once. The notion of unique alias should be 
adjusted and cover the union of unique aliases for all interacting 
allocations.

Seems like you need to enhance SR to work on non-intersecting clusters 
of allocations.

One thing to take care of: scalar replacement relies on 
TypeOopPtr::instance_id().

   // If not InstanceTop or InstanceBot, indicates that this is
   // a particular instance of this type which is distinct.
   // This is the node index of the allocation node creating this instance.
   int           _instance_id;

It'll break when multiple allocations are in play.

Best regards,
Vladimir Ivanov

On 09.02.2022 04:45, Cesar Soares Lucas wrote:
> Hi there again!
> 
> Can you please give me feedback on the following approach to at least partially
> address [1], the scalar replacement allocation merge issue?
> 
> The problem that I am trying to solve arises when allocations are merged after a
> control flow split. The code below shows _one example_ of such a situation.
> 
> public int ex1(boolean cond, int x, int y) {
>     Point p = new Point(x, y);
>     if (cond)
>         p = new Point(y, x);
>     // Allocations for p are merged here.
>     return p.calc();
> }
> 
> Assuming the method calls on "p" are inlined then the allocations will not
> escape the method. The C2 IR for this method will look like this:
> 
> public int ex1(boolean cond, int first, int second) {
>     p0 = Allocate(...);
>     ...
>     p0.x = first;
>     p0.y = second;
> 
>     if (cond) {
>         p1 = Allocate(...);
>         ...
>         p1.x = second;
>         p1.y = first;
>     }
> 
>     p = phi(p0, p1)
> 
>     return p.x - p.y;
> }
> 
> However, one of the constraints implemented here [2], specifically the third
> one, will prevent the objects from being scalar replaced.
> 
> The approach that I'm considering for solving the problem is to replace the Phi
> node `p = phi(p0, p1)` with new Phi nodes for each of the fields of the objects
> in the original Phi. The IR for `ex1` would look something like this after the
> transformation:
> 
> public int ex1(boolean cond, int first, int second) {
>     p0 = Allocate(...);
>     ...
>     p0.x = first;
>     p0.y = second;
> 
>     if (cond) {
>         p1 = Allocate(...);
>         ...
>         p1.x = second;
>         p1.y = first;
>     }
> 
>     pX = phi(first, second)
>     pY = phi(second, first)
> 
>     return pX - pY;
> }
> 
> I understand that this transformation might not be applicable for all cases and
> that it's not as simple as illustrated above. Also, it seems to me that much of
> what I'd have to implement is already implemented in other steps of the Scalar
> Replacement pipeline (which is a good thing). To work around these
> implementation details I plan to use as much of the existing code as possible.
> The algorithm for the transformation would be like this:
> 
> split_phis(phi)
>     # If output of phi escapes, or something uses its identity, etc
>     # then we can't remove it. The conditions here might possibly be the
>     # same as the ones implemented in `PhaseMacroExpand::can_eliminate_allocation`
>     if cant_remove_phi_output(phi)
>         return;
> 
>     # Collect a set of tuples(F,U) containing nodes U that use field F
>     # member of the object resulting from `phi`.
>     fields_used = collect_fields_used_after_phi(phi)
> 
>     foreach field in fields_used
>         producers = {}
> 
>         # Create a list with the last Store for each field "field" on the
>         # scope of each of the Phi input objects.
>         foreach o in phi.inputs
>             # The function called below might re-use a lot of the code/logic in `PhaseMacroExpand::scalar_replacement`
>             producers += last_store_to_o_field(o, field)
> 
>         # Create a new phi node whose inputs are the Store's to 'field'
>         field_phi = create_new_phi(producers)
> 
>         update_consumers(field, field_phi)
> 
> The implementation that I envisioned would be as a "pre-process" [3] step just
> after EA but before the constraint checks in `adjust_scalar_replaceable_state`
> [2]. If we agree that the overall Scalar Replacement implementation goes through
> the following major phases:
> 
>     1. Identify the Escape Status of objects.
>     2. Adjust object Escape and/or Scalar Replacement status based on a set of constraints.
>     3. Make call to Split_unique_types [4].
>     4. Iterate over object and array allocations.
>         4.1 Check if allocation can be eliminated.
>         4.2 Perform scalar replacement. Replace uses of object in Safepoints.
>         4.3 Process users of CheckCastPP other than Safepoint: AddP, ArrayCopy and CastP2X.
> 
> The transformation that I am proposing would change the overall flow to look
> like this:
> 
>     1. Identify the Escape Status of objects.
>     2. ----> New: "Split phi functions" <----
>     2. Adjust object Escape and/or Scalar Replacement status based on a set of constraints.
>     3. Make call to Split_unique_types [4].
>     4. Iterate over object and array allocations.
>         4.1 ----> Moved to split_phi: "Check if allocation can be eliminated" <----
>         4.2 Perform scalar replacement. Replace uses of object in Safepoints.
>         4.3 Process users of CheckCastPP other than Safepoint: AddP, ArrayCopy and CastP2X.
> 
> Please let me know what you think and thank you for taking the time to review
> this!
> 
> 
> Regards,
> Cesar
> 
> Notes:
> 
>     [1] I am not sure yet how this approach will play with the case of a merge
>         with NULL.
> 
>     [2] https://github.com/openjdk/jdk/blob/2f71a6b39ed6bb869b4eb3e81bc1d87f4b3328ff/src/hotspot/share/opto/escape.cpp#L1809
> 
>     [3] Another option would be to "patch" the current implementation to be able
>         to handle the merges. I am not certain that the "patch" approach would be
>         better, however, the "pre-process" approach is certainly much easier to test
>         and more readable.
> 
>     [4] I cannot say I understand 100% the effects of executing
>         split_unique_types(). Would the transformation that I am proposing need to
>         be after the call to split_unique_types?
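
The effect of the proposed per-field Phi split on `ex1` can be sketched at source level. This is only a toy model in C++, not C2 IR and not the proposed implementation: `Point`, `ex1_merged` and `ex1_split` are illustrative names, `calc()` is assumed to be `x - y` as in the IR shown in the quoted message, and both objects are built unconditionally here for simplicity.

```cpp
// Toy stand-in for the Point class used in the quoted example.
struct Point {
  int x;
  int y;
};

// Shape before the transformation: the two objects are merged through one
// pointer (p = phi(p0, p1)) and the fields are loaded through that pointer.
int ex1_merged(bool cond, int first, int second) {
  Point p0{first, second};
  Point p1{second, first};
  const Point* p = cond ? &p1 : &p0;  // p = phi(p0, p1)
  return p->x - p->y;
}

// Shape after the transformation: one selection per field, so neither object
// is needed any more and both could be scalar replaced.
int ex1_split(bool cond, int first, int second) {
  int pX = cond ? second : first;  // pX = phi(first, second)
  int pY = cond ? first : second;  // pY = phi(second, first)
  return pX - pY;
}
```

Both functions return the same value for every input, which is the property the transformation has to preserve; cases such as a merge with NULL, raised in note [1], are not modelled.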

From shade at openjdk.java.net  Fri Feb 11 08:50:11 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 11 Feb 2022 08:50:11 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <ow1YUPb4X7dkwOapXrOHUoBZS0Ss8Oo9Vvhp5KYaSdM=.366add18-5330-443d-9289-f75793a32ef3@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_64 fastdebug, `tier4`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

I am sure nothing bad is going to happen if I integrate this on Friday!
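
As a rough illustration of the watermark idea described in the quoted PR text, here is a simplified model in C++. It is not the HotSpot code: the class and member names are hypothetical, addresses are plain integers, "touching" a page only increments a counter, and the safe-limit check mentioned in the description is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified model of watermark-based stack banging (illustration only).
class StackBangModel {
 public:
  StackBangModel(std::uintptr_t stack_base, std::size_t page_size,
                 std::size_t shadow_pages)
      : growth_watermark_(stack_base),
        page_size_(page_size),
        shadow_pages_(shadow_pages),
        pages_touched_(0) {
    assert(page_size > 0 && shadow_pages > 0);
  }

  // Simulates method entry at stack pointer `sp` (the stack grows downward).
  // Returns true if the shadow zone below `sp` had to be banged.
  bool on_method_entry(std::uintptr_t sp) {
    // Strictly above the watermark: a deeper entry already banged the pages
    // this frame could reach, so nothing needs to be touched.
    if (sp > growth_watermark_) {
      return false;
    }
    assert(sp >= shadow_pages_ * page_size_);
    for (std::size_t i = 1; i <= shadow_pages_; ++i) {
      touch(sp - i * page_size_);
    }
    growth_watermark_ = sp;  // remember the deepest banged entry point
    return true;
  }

  std::size_t pages_touched() const { return pages_touched_; }

 private:
  // Stand-in for writing to a stack page; only counts the accesses.
  void touch(std::uintptr_t /*address*/) { ++pages_touched_; }

  std::uintptr_t growth_watermark_;
  std::size_t page_size_;
  std::size_t shadow_pages_;
  std::size_t pages_touched_;
};
```

In this model, a first entry at a given depth touches every shadow page, while later entries at shallower depths touch none, which is the saving the description attributes to banging only when needed.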

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Fri Feb 11 08:50:11 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 11 Feb 2022 08:50:11 GMT
Subject: Integrated: 8072070: Improve interpreter stack banging
In-Reply-To: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
Message-ID: <tT73XB9xPuBrjFosmzaAyFunnBxTbIp5CjFkR0KNYXg=.e6351c37-6f4f-44a5-abda-22b0ccda16aa@github.com>

On Thu, 27 Jan 2022 18:42:15 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
> 
> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
> 
> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
> 
> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
> 
> I think it is fairly complete, and so would like to solicit more feedback and testing here.
> 
> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
> 
> 
>  compiler.compiler: +77%
>  compiler.sunflow: +69%
>  compress: +166%
>  crypto.rsa: +15%
>  crypto.signverify: +70%
>  mpegaudio: +8%
>  serial: +50%
>  sunflow: +57%
>  xml.transform: +61%
>  xml.validation: +43%
> 
> 
> My new `java.lang.invoke` benchmarks improve a lot as well:
> 
> 
> Benchmark              Mode  Cnt    Score    Error  Units
> 
> # Mainline
> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
> 
> # This WIP
> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
> 
> 
> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
> 
> 
> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>         67,296,528      instructions              #    0.85  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
> 
>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
> 
> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
> 
>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
> 
>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>         66,742,892      instructions              #    0.86  insn per cycle         
>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
> 
>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
> 
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_64 fastdebug, `tier2`
>  - [x] Linux x86_64 fastdebug, `tier3`
>  - [x] Linux x86_64 fastdebug, `tier4`
>  - [x] Linux x86_32 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier2`
>  - [x] Linux x86_32 fastdebug, `tier3`

This pull request has now been integrated.

Changeset: 3a13425b
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/3a13425bc9088cbb6d95e1a46248d7eba27fb1a6
Stats:     177 lines in 5 files changed: 155 ins; 4 del; 18 mod

8072070: Improve interpreter stack banging

Reviewed-by: xliu, coleenp, mdoerr

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Fri Feb 11 08:52:48 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Fri, 11 Feb 2022 08:52:48 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v9]
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly, added the 3 flags to man pages.
> Added flag to VMDeprecatedOptions test.
> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
> 
> Checked that tests are not affected.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  fix in response to suggestion by David Holmes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7110/files
  - new: https://git.openjdk.java.net/jdk/pull/7110/files/af11b456..78d8e00a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=08
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=07-08

  Stats: 9 lines in 2 files changed: 2 ins; 0 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110

PR: https://git.openjdk.java.net/jdk/pull/7110

From lkorinth at openjdk.java.net  Fri Feb 11 08:54:51 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Fri, 11 Feb 2022 08:54:51 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
 [v2]
In-Reply-To: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
Message-ID: <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>

> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.

Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:

  updating copyright

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7426/files
  - new: https://git.openjdk.java.net/jdk/pull/7426/files/6aaa1a3a..7d3e7a1b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7426&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7426&range=00-01

  Stats: 59 lines in 59 files changed: 0 ins; 0 del; 59 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7426.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7426/head:pull/7426

PR: https://git.openjdk.java.net/jdk/pull/7426

From lkorinth at openjdk.java.net  Fri Feb 11 09:01:06 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Fri, 11 Feb 2022 09:01:06 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
 [v2]
In-Reply-To: <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
 <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
Message-ID: <VVVpzoFDz5orsHkmEMALgrVU4bFR32pasfx0pbJ8r6s=.91156db9-c115-49bf-b131-a3031c307920@github.com>

On Fri, 11 Feb 2022 08:54:51 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

>> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updating copyright

I have a maven project that compiles test/lib and jtreg/gc, so everything changed does compile, I should have mentioned that. I have updated copyright year on all files now as well.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7426

From kbarrett at openjdk.java.net  Fri Feb 11 09:09:10 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Fri, 11 Feb 2022 09:09:10 GMT
Subject: RFR: 8281626: NonblockingQueue should use nullptr
In-Reply-To: <gep_Euu696bgDB2tU5vqRAhPCgaC8QU0NUTgC1l-fd4=.54a5111e-2b9a-441f-b1cb-318ecee73fc8@github.com>
References: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
 <gep_Euu696bgDB2tU5vqRAhPCgaC8QU0NUTgC1l-fd4=.54a5111e-2b9a-441f-b1cb-318ecee73fc8@github.com>
Message-ID: <GskiGwuHDbEhYESDC9dg7fqZnhx4fUUA2l5eawBood0=.e81aad54-4afc-4e1f-baf3-0f3c4726556b@github.com>

On Fri, 11 Feb 2022 06:42:40 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Please review this change to use nullptr instead of NULL throughout the NonblockingQueue class.
>> 
>> Testing:
>> mach5 tier1
>
> Looks fine!

Thanks @shipilev and @dholmes-ora for reviews.  I waffled about suggesting it's trivial; I'll take your suggestion, and go ahead and push now.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7438

From kbarrett at openjdk.java.net  Fri Feb 11 09:09:11 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Fri, 11 Feb 2022 09:09:11 GMT
Subject: Integrated: 8281626: NonblockingQueue should use nullptr
In-Reply-To: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
References: <uKz-duTdBMh81DIh7G2Kmqw_aQsX9xy69x1LifXcNqc=.acbba2f9-ad40-4eb3-8485-41368bdc1245@github.com>
Message-ID: <moTavREs3iXkZteUtyWih2yVIxmSSJ5OA32U-paBU6s=.f3a212ff-e8a2-458c-acef-9c9c8fe22f1e@github.com>

On Fri, 11 Feb 2022 05:37:27 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to use nullptr instead of NULL throughout the NonblockingQueue class.
> 
> Testing:
> mach5 tier1

This pull request has now been integrated.

Changeset: 90939cb8
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/90939cb80193c671cae635b7a4e41bd2e6bcdbd5
Stats:     39 lines in 2 files changed: 0 ins; 0 del; 39 mod

8281626: NonblockingQueue should use nullptr

Reviewed-by: shade, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/7438

From adinn at openjdk.java.net  Fri Feb 11 09:59:13 2022
From: adinn at openjdk.java.net (Andrew Dinn)
Date: Fri, 11 Feb 2022 09:59:13 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <Et5jCbW5ViRaEPLbC11sMO-yFkeejfxaHc3-KugEuns=.9020ecb4-b65e-48dd-8063-2eeebda31ed5@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_64 fastdebug, `tier4`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

ship it and be damned :-)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From sspitsyn at openjdk.java.net  Fri Feb 11 10:14:09 2022
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Fri, 11 Feb 2022 10:14:09 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
 [v2]
In-Reply-To: <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
 <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
Message-ID: <t48vN2sCao318iQAU2AADAZvSsLLsqOs-NsKs9R26WU=.7868baf5-ca77-4808-9e34-8a52f452c6f2@github.com>

On Fri, 11 Feb 2022 08:54:51 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

>> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updating copyright

Looks good.
Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7426

From thartmann at openjdk.java.net  Fri Feb 11 10:24:10 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Fri, 11 Feb 2022 10:24:10 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v9]
In-Reply-To: <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
Message-ID: <wSpjSHHiudmiXrlOZtytG0x9NVO28qI92x5clJrCsFk=.af1cfbea-b4d5-4ace-9f23-596f542b3fb0@github.com>

On Fri, 11 Feb 2022 08:52:48 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix in response to suggestion by David Holmes

Looks good.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7110

From sjohanss at openjdk.java.net  Fri Feb 11 10:42:09 2022
From: sjohanss at openjdk.java.net (Stefan Johansson)
Date: Fri, 11 Feb 2022 10:42:09 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v4]
In-Reply-To: <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
Message-ID: <DwO-LwJtDinVdt-KH4YrqWcTel_SXXaURum5-3tEOYQ=.c84237c1-ee64-49bc-8f01-3f18e08c7370@github.com>

On Thu, 10 Feb 2022 17:23:42 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   lower case

Looks good, even if I preferred the old use of `needs_expand()`.

src/hotspot/share/gc/parallel/psOldGen.cpp line 180:

> 178:     bool needs_expand =
> 179:       pointer_delta(object_space()->end(), object_space()->top()) < word_size;
> 180:     if (needs_expand) {

To me the old code reads better, but I guess it's a matter of taste. The predicate could be moved to PSOldGen to allow asserting that the lock is held.

-------------

Marked as reviewed by sjohanss (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7124

From mdoerr at openjdk.java.net  Fri Feb 11 11:32:11 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 11 Feb 2022 11:32:11 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
Message-ID: <n6vfVp4A5sNR_PuSQuppzUQUV3TAUJl2v22EGBChz8w=.d88f9734-172d-4f4a-b00f-3bb71281e1d9@github.com>

On Tue, 8 Feb 2022 17:24:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> This is an old issue, I submitted the first RFE about this back in 2015. This shows up every time I benchmark the interpreter-only code. Most recently, it showed up in my work to get `java.lang.invoke` infra work reasonably fast when cold, which includes lots of interpreter paths.
>> 
>> The underlying problem is that template interpreters rebang the entire shadow zone on every method entry. This takes tens of instructions, blows out TLB caches with accessing tens of pages (on some implementations, I reckon, almost the entire L1 TLB cache!), etc. I think we can make it universally better for all template interpreters by introducing the safe limit / growth watermarks for thread stacks, so that we bang only when needed. It also drops the need for special-casing the `native_call`, because we might as well bang the entire shadow zone in native case as well.
>> 
>> This patch makes a pilot change for x86, without touching other architectures. Other architectures can follow this example later. This is why `native_call` argument persists, even though it is not used in x86 case anymore. There is also a new test group that I found useful when debugging on Windows, that group is going to go away before integration.
>> 
>> I tried to capture the current mechanics of stack banging in `stackOverflow.hpp`, hoping the change becomes more obvious, and so that arch-specific template interpreter codes could just reference it without copy-pasting it around.
>> 
>> I think it is fairly complete, and so would like to solicit more feedback and testing here.
>> 
>> Point runs on SPECjvm2008 with `-Xint` show huge improvements on half of the tests, without any regressions:
>> 
>> 
>>  compiler.compiler: +77%
>>  compiler.sunflow: +69%
>>  compress: +166%
>>  crypto.rsa: +15%
>>  crypto.signverify: +70%
>>  mpegaudio: +8%
>>  serial: +50%
>>  sunflow: +57%
>>  xml.transform: +61%
>>  xml.validation: +43%
>> 
>> 
>> My new `java.lang.invoke` benchmarks improve a lot as well:
>> 
>> 
>> Benchmark              Mode  Cnt    Score    Error  Units
>> 
>> # Mainline
>> MHInvoke.methodHandle  avgt    5  799.671 ± 9.087  ns/op
>> MHInvoke.plain         avgt    5  261.947 ± 1.421  ns/op
>> VHGet.plain            avgt    5  231.372 ± 3.044  ns/op
>> VHGet.varHandle        avgt    5  924.880 ± 6.026  ns/op
>> 
>> # This WIP
>> MHInvoke.methodHandle  avgt    5  240.456 ± 3.931  ns/op
>> MHInvoke.plain         avgt    5   70.851 ± 1.986  ns/op
>> VHGet.plain            avgt    5   52.506 ± 3.768  ns/op
>> VHGet.varHandle        avgt    5  335.785 ± 4.398  ns/op
>> 
>> 
>> It also palpably improves startup even on small HelloWorld, _even when compilers are present_:
>> 
>> 
>> $ perf stat -r 5000 build/baseline/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/baseline/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              22.06 msec task-clock                #    1.030 CPUs utilized            ( +-  0.04% )
>>                 96      context-switches          #    4.353 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  333.181 /sec                     ( +-  0.32% )
>>              2,437      page-faults               #  110.469 K/sec                    ( +-  0.00% )
>>         78,763,038      cycles                    #    3.571 GHz                      ( +-  0.05% )  (77.30%)
>>          2,107,182      stalled-cycles-frontend   #    2.68% frontend cycles idle     ( +-  0.41% )  (77.40%)
>>          2,235,371      stalled-cycles-backend    #    2.84% backend cycles idle      ( +-  1.05% )  (71.39%)
>>         67,296,528      instructions              #    0.85  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (89.79%)
>>         12,483,022      branches                  #  565.911 M/sec                    ( +-  0.01% )  (99.73%)
>>            384,412      branch-misses             #    3.08% of all branches          ( +-  0.07% )  (85.91%)
>> 
>>          0.0214224 +- 0.0000875 seconds time elapsed  ( +-  0.41% )
>> 
>> $ perf stat -r 5000 build/interp-bang/bin/java -Xms128m -Xmx128m Hello > /dev/null
>> 
>>  Performance counter stats for 'build/interp-bang/bin/java -Xms128m -Xmx128m Hello' (5000 runs):
>> 
>>              21.78 msec task-clock                #    1.031 CPUs utilized            ( +-  0.05% )
>>                 98      context-switches          #    4.519 K/sec                    ( +-  0.07% )
>>                  7      cpu-migrations            #  339.292 /sec                     ( +-  0.31% )
>>              2,434      page-faults               #  111.755 K/sec                    ( +-  0.00% )
>>         77,746,317      cycles                    #    3.569 GHz                      ( +-  0.05% )  (76.94%)
>>          2,143,121      stalled-cycles-frontend   #    2.76% frontend cycles idle     ( +-  0.45% )  (76.03%)
>>          2,059,440      stalled-cycles-backend    #    2.65% backend cycles idle      ( +-  1.11% )  (71.82%)
>>         66,742,892      instructions              #    0.86  insn per cycle         
>>                                                   #    0.03  stalled cycles per insn  ( +-  0.03% )  (91.40%)
>>         12,494,797      branches                  #  573.634 M/sec                    ( +-  0.01% )  (99.80%)
>>            386,145      branch-misses             #    3.09% of all branches          ( +-  0.08% )  (85.56%)
>> 
>>          0.0211278 +- 0.0000877 seconds time elapsed  ( +-  0.42% )
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1`
>>  - [x] Linux x86_64 fastdebug, `tier2`
>>  - [x] Linux x86_64 fastdebug, `tier3`
>>  - [x] Linux x86_64 fastdebug, `tier4`
>>  - [x] Linux x86_32 fastdebug, `tier1`
>>  - [x] Linux x86_32 fastdebug, `tier2`
>>  - [x] Linux x86_32 fastdebug, `tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Show watermark in better place on the chart

Seems like Power is not affected by this TLB / cache bottleneck. We use 64k pages and typically 2 store instructions for banging. On the other hand, I think it's a good thing to avoid touching any storage which we don't need. So, we could rework the PPC64 implementation, too (optionally). Or wait until more experiments have been made.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From shade at openjdk.java.net  Fri Feb 11 11:35:20 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Fri, 11 Feb 2022 11:35:20 GMT
Subject: RFR: 8072070: Improve interpreter stack banging [v5]
In-Reply-To: <n6vfVp4A5sNR_PuSQuppzUQUV3TAUJl2v22EGBChz8w=.d88f9734-172d-4f4a-b00f-3bb71281e1d9@github.com>
References: <8sseq_si2gPMLJGfdJ33Icebfs_tAdFhPMB1Uszu3dI=.f5a439be-69aa-4aaf-8e0b-5ddf7865b376@github.com>
 <lKWuy_nU2JoCJEo_L-ainEgtYE2nisLbr726ltO8lao=.c0200b26-195b-47fa-bf97-6f33e724ad3f@github.com>
 <n6vfVp4A5sNR_PuSQuppzUQUV3TAUJl2v22EGBChz8w=.d88f9734-172d-4f4a-b00f-3bb71281e1d9@github.com>
Message-ID: <G18nrKnGZngPYeScGPWOZC85jtpIZkWescQO7dRut3Q=.b3fcc532-dfaa-488f-8c5a-f477912d512a@github.com>

On Fri, 11 Feb 2022 11:29:12 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> Seems like Power is not affected by this TLB / cache bottleneck. We use 64k pages and typically 2 store instructions for banging. On the other hand, I think it's a good thing to avoid touching any storage which we don't need. So, we could rework the PPC64 implementation, too (optionally). Or wait until more experiments have been made.

Yes, larger VM pages mean fewer addresses to touch. OTOH, in my related experiments with removing the stack banging on compiled entry altogether, we seem to gain single-digit percent improvements, even though we only touch one location far away.

Anyhow, I think a good plan is to wait and see if this x86 pilot change runs into any interesting problems, before translating it to other architectures.
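
For readers following along, the watermark idea from the PR description can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the HotSpot code: the class name, the field names, and the 20-page shadow zone are hypothetical, the stack is modelled as plain `long` addresses growing downwards, and the separate "safe limit" that the PR description mentions is left out entirely.

```java
public final class StackBangSketch {
    // Assumed shadow zone size in pages (hypothetical value for illustration).
    public static final int SHADOW_PAGES = 20;

    // Lowest stack pointer for which the shadow zone below it was already banged.
    private long growthWatermark;
    // Number of page touches performed so far.
    private long pagesTouched;

    public StackBangSketch(long stackBase) {
        // Nothing has been banged yet, so the watermark starts at the stack base.
        this.growthWatermark = stackBase;
    }

    // Models a method entry with the given stack pointer (stack grows down).
    public void onMethodEntry(long sp) {
        if (sp >= growthWatermark) {
            // A deeper (or equal) entry already banged the zone below this SP.
            return;
        }
        // New stack depth: touch every page of the shadow zone once...
        pagesTouched += SHADOW_PAGES;
        // ...and remember how deep we have been.
        growthWatermark = sp;
    }

    public long pagesTouched() {
        return pagesTouched;
    }

    public static void main(String[] args) {
        long base = 1L << 20;
        StackBangSketch s = new StackBangSketch(base);
        for (int i = 0; i < 1000; i++) {
            s.onMethodEntry(base - 4096); // same depth on every call
        }
        // Prints 20; rebanging on every entry would have touched 20,000 pages.
        System.out.println(s.pagesTouched());
    }
}
```

In this model, repeated entries at an already-visited depth cost a single comparison, which is the effect the patch is after in the interpreter.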

-------------

PR: https://git.openjdk.java.net/jdk/pull/7247

From duke at openjdk.java.net  Fri Feb 11 11:37:56 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Fri, 11 Feb 2022 11:37:56 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v21]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with two additional commits since the last revision:

 - Add comments to enter calls
 - Set PreserveFramePointer if use_rop_protection is set

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/f779513b..2062cce7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=20
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=19-20

  Stats: 26 lines in 8 files changed: 16 ins; 0 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Fri Feb 11 11:37:56 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Fri, 11 Feb 2022 11:37:56 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v18]
In-Reply-To: <Uty-dsfgY9W0NYBBRCf8a_RKA3RpTVSEPTeowxIBeoI=.1e872980-760a-452f-87a7-1e725e92ab4b@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <8eyrOM5Brgjz4517k80s5RW3HhTDdhevVZOCS8jbIl0=.b41a377e-2235-4310-9b4c-e75e473eb236@github.com>
 <XinD5VcGuT9VrlzDYqaJkwk29Q_CjJAtbHUDCgAfxWo=.bbfac0b0-b844-4a9a-a6fc-bd210928aadc@github.com>
 <aB-65S2vlvi8YgK05r0nIiLnxaOoCueWt030UI_QhgQ=.4a1eccba-b87a-43e4-babe-14c75c755aa5@github.com>
 <SlWAICcj0RKZKakcqy3yPPvV_FrEu0An9LYrjkFiUvA=.905f6d29-edb9-4ad0-812c-0cdc1b748000@github.com>
 <32e7_CnkkIaj2GOsvi9mT-xzgLO8B60uHrzMEAZXHko=.2ea9eaff-39c6-4401-9820-4536f03d5ec7@github.com>
 <PSXG9ufu1E8eEMInOxEogYJiWgeg051cY10oQv9G1T4=.bf62e54d-16b7-4032-ae44-55dee24a0877@github.com>
 <n5enwyIUBovTlCjVXWbKBCayt4O2227qRPzbmzajzr0=.ba48446a-9279-453c-9cde-82b7755e1767@github.com>
 <GXUcAo55K4vReK737EJE8VWuNh5fUP0O01nxczF5fV8=.0bb1ca06-d8d7-4440-92bb-eaad4e22a169@github.com>
 <9wCVZ8gCStf_tUT8_WQjhLzXqqQlQMsijeiBaAXDVVk=.aace6af6-bf1b-40c9-ba19-6fd0ab9b1b0a@github.com>
 <Uty-dsfgY9W0NYBBRCf8a_RKA3RpTVSEPTeowxIBeoI=.1e872980-760a-452f-87a7-1e725e92ab4b@github.com>
Message-ID: <44zejAzpVh55H_lUbDPm3eCzG5NoUjvO2zJVQRZ83G8=.22ae70e3-76fb-4e33-9cae-41c4a4876a54@github.com>

On Thu, 10 Feb 2022 16:32:25 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> Status? Is branch protection really incompatible with PreserveFramePointer?
>
> Eventually found a missing signing in the exception handling. I'm running the full suite now, so should hopefully get something posted tomorrow.

New patches fix the failures

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From redestad at openjdk.java.net  Fri Feb 11 11:40:07 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Fri, 11 Feb 2022 11:40:07 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <phrOwRfOHWB0534sOpCrPyNKtKPblEbmZYHJ6N75hQU=.220ce4be-02ba-4625-ad83-5e04a1be2458@github.com>

On Wed, 26 Jan 2022 12:51:31 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

> Hi Claes, it can get implemented similarly on PPC64: #7430 You can integrate it if you prefer that, but better after it got a Review.

Hi Martin, perfect!

Ideally we can get all platforms that have a `hasNegatives` intrinsic moved over so we can just switch it over big-bang style: remove the `@IntrinsicCandidate`, avoid contortions to pick the "right" implementation on the Java level based on which intrinsic is available and drop all VM-internal scaffolding for `hasNegatives`. Then it makes perfect sense to fold your patch into this PR, rather than have a tail of follow-ups.
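
To make the relationship between the two methods concrete, here is a minimal scalar sketch of the semantics described in the PR text. It is an illustration only, assuming an exact count; the class name is hypothetical, and the actual `StringCoding` code and the vectorized intrinsics may differ in detail.

```java
import java.nio.charset.StandardCharsets;

public final class CountPositivesSketch {
    // Returns the number of leading non-negative bytes in ba[off, off + len).
    public static int countPositives(byte[] ba, int off, int len) {
        int limit = off + len;
        for (int i = off; i < limit; i++) {
            if (ba[i] < 0) {
                return i - off; // first negative byte ends the ASCII prefix
            }
        }
        return len; // the whole range is non-negative
    }

    // hasNegatives falls out of countPositives: any shortfall means a negative byte.
    public static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        // 'é' encodes to the single byte 0xE9 in ISO-8859-1, which is negative as a Java byte.
        byte[] mixed = "abc\u00e9def".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(countPositives(mixed, 0, mixed.length)); // prints 3
        System.out.println(hasNegatives(mixed, 0, mixed.length));   // prints true
    }
}
```

A caller that gets back a count smaller than `len` can copy the ASCII prefix in bulk and start the slower per-byte path at that index, instead of only learning that a negative byte exists somewhere.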

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From redestad at openjdk.java.net  Fri Feb 11 12:11:54 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Fri, 11 Feb 2022 12:11:54 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
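
For readers following the thread, the semantics described above can be sketched in plain Java. This is only an illustration of the idea, not the code in the PR: the class name `CountPositivesSketch` is hypothetical, the loop is the obvious scalar version rather than an intrinsic, and "positive" is taken to mean a byte value >= 0, as in the description.

```java
import java.nio.charset.StandardCharsets;

final class CountPositivesSketch {
    /** Number of leading non-negative bytes in ba[off, off + len). */
    static int countPositives(byte[] ba, int off, int len) {
        for (int i = 0; i < len; i++) {
            if (ba[off + i] < 0) {
                return i; // first byte with the high bit set ends the ASCII prefix
            }
        }
        return len; // the whole range is ASCII
    }

    /** The old boolean query, expressed through the new counting one. */
    static boolean hasNegatives(byte[] ba, int off, int len) {
        return countPositives(ba, off, len) != len;
    }

    public static void main(String[] args) {
        // 'a', 'b', 'c', 0xE9 (negative as a Java byte), 'd'
        byte[] mixed = "abc\u00e9d".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(countPositives(mixed, 0, mixed.length)); // prints 3
        System.out.println(hasNegatives(mixed, 0, mixed.length));   // prints true
    }
}
```

A caller that previously only learned "there is a negative somewhere" can now copy the first `countPositives(...)` bytes on a fast path and fall back to the slower decoding loop for the remainder only.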

Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 23 additional commits since the last revision:

 - Merge branch 'master' into count_positives
 - Restore partial vector checks in AVX2 and SSE intrinsic variants
 - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
 - Simplify changes to encodeUTF8
 - Fix little-endian error caught by testing
 - Reduce jumps in the ascii path
 - Remove unused tail_mask
 - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
 - Add more comments, simplify tail branching in AVX512 variant
 - Resolve issues in the precise implementation
 - ... and 13 more: https://git.openjdk.java.net/jdk/compare/42073fce...c4bb3612

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7231/files
  - new: https://git.openjdk.java.net/jdk/pull/7231/files/2a855eb6..c4bb3612

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=00-01

  Stats: 18287 lines in 533 files changed: 12765 ins; 2983 del; 2539 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From dholmes at openjdk.java.net  Fri Feb 11 12:50:07 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Feb 2022 12:50:07 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v9]
In-Reply-To: <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
Message-ID: <TPkx1fjl2gz0Yu_QXM03bdFuGUwyFkLKXJ49doST7Ls=.9fdb866e-7550-444f-831c-56de579cf49a@github.com>

On Fri, 11 Feb 2022 08:52:48 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Deprecated ExtendedDTraceProbes.
>> Edited help messages and man pages accordingly, added the 3 flags to man pages.
>> Added flag to VMDeprecatedOptions test.
>> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
>> 
>> Checked that tests are not affected.
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix in response to suggestion by David Holmes

Marked as reviewed by dholmes (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From dholmes at openjdk.java.net  Fri Feb 11 13:01:04 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 11 Feb 2022 13:01:04 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
 [v2]
In-Reply-To: <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
 <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
Message-ID: <j_McUJTi9SjzQn5HR7wI1D65_B4Neet6gdKVHUX8Hq8=.65d4186a-b537-4b95-aef2-f02980a32a5e@github.com>

On Fri, 11 Feb 2022 08:54:51 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

>> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updating copyright

Looks good.

Thanks.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7426

From mdoerr at openjdk.java.net  Fri Feb 11 15:38:16 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Fri, 11 Feb 2022 15:38:16 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
Message-ID: <TeSpphTYf2qYgZS8OOuE1obj5n5-lE8K9a09Z6QsE5s=.1f13f488-38cd-4632-a31d-9becbe7ae2d2@github.com>

On Fri, 11 Feb 2022 12:11:54 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range that are all positive. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>> 
>> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>> 
>> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc.) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or come as dependent follow-ups.
>> 
>> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
>> 
>> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
>
> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 23 additional commits since the last revision:
> 
>  - Merge branch 'master' into count_positives
>  - Restore partial vector checks in AVX2 and SSE intrinsic variants
>  - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
>  - Simplify changes to encodeUTF8
>  - Fix little-endian error caught by testing
>  - Reduce jumps in the ascii path
>  - Remove unused tail_mask
>  - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
>  - Add more comments, simplify tail branching in AVX512 variant
>  - Resolve issues in the precise implementation
>  - ... and 13 more: https://git.openjdk.java.net/jdk/compare/811eb365...c4bb3612

Hi Claes,
doing it for all platforms and cleaning it up sounds good. My PPC64 contribution is already tested and reviewed. I'll try to find a volunteer for s390 which uses exactly the same algorithm as PPC64.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From smonteith at openjdk.java.net  Fri Feb 11 15:43:07 2022
From: smonteith at openjdk.java.net (Stuart Monteith)
Date: Fri, 11 Feb 2022 15:43:07 GMT
Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and
 should be removed [v4]
In-Reply-To: <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com>
References: <xGDii7-Onzmeui72U0dCU-ZBAXy2P_mzOo-62Mc0Psg=.0c90f99c-6bdc-4d36-b153-58926565c826@github.com>
 <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com>
Message-ID: <YRrUwstjOExMB_9tFH8lSF51-kZ0HvDCe91aO7oUA5M=.b1608a73-5357-4f64-ae65-7ff6cc7c33b5@github.com>

On Thu, 20 Jan 2022 15:58:09 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

>> The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. 
>> But as it's not used anywhere, removing this option from the JDK source.
>
> Bhavana-Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
> 
>  - Merge master
>  - 8239927: Product variable PrefetchFieldsAhead is unused and should be removed

Hello @dholmes-ora , would it be possible for you to give this a proper review now it has been CSR approved?
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6783

From redestad at openjdk.java.net  Fri Feb 11 15:45:10 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Fri, 11 Feb 2022 15:45:10 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
Message-ID: <FD5ToNyofDPRfnDbGlxnLgrcWugJyLC-hU4ISA4bogI=.eb5b81bf-4f76-47ff-8e5e-2d1eb16b0171@github.com>

On Fri, 11 Feb 2022 12:11:54 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range that are all positive. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>> 
>> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>> 
>> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc.) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or come as dependent follow-ups.
>> 
>> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
>> 
>> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
>
> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 23 additional commits since the last revision:
> 
>  - Merge branch 'master' into count_positives
>  - Restore partial vector checks in AVX2 and SSE intrinsic variants
>  - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
>  - Simplify changes to encodeUTF8
>  - Fix little-endian error caught by testing
>  - Reduce jumps in the ascii path
>  - Remove unused tail_mask
>  - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
>  - Add more comments, simplify tail branching in AVX512 variant
>  - Resolve issues in the precise implementation
>  - ... and 13 more: https://git.openjdk.java.net/jdk/compare/690b05fa...c4bb3612

Good! I'm currently reading up on aarch64 asm and trying to port that intrinsic over. It might take some time..

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From aph-open at littlepinkcloud.com  Fri Feb 11 16:23:15 2022
From: aph-open at littlepinkcloud.com (Andrew Haley)
Date: Fri, 11 Feb 2022 16:23:15 +0000
Subject: RFC: AArch64: Set Segmented CodeCache default size to 127M
In-Reply-To: <64AB1C1E-4151-4979-BF15-CC71D00E98DB@amazon.com>
References: <64AB1C1E-4151-4979-BF15-CC71D00E98DB@amazon.com>
Message-ID: <155db069-9cdd-6e90-6e02-d87be2ab204b@littlepinkcloud.com>

On 2/10/22 23:02, Astigeevich, Evgeny wrote:
> We'd like to discuss a proposal for setting TieredCompilation Segmented CodeCache default size to 127M on AArch64 (https://bugs.openjdk.java.net/browse/JDK-8280150).

I don't think so, at least not without a lot more information.

This would halve the size of the code cache, potentially causing
severe regressions in production. I have seen bug reports from
customers mystified at poor OpenJDK performance which have turned out
to be code cache thrashing. This is very hard to diagnose without
making some inspired guesses at what the root cause may be. We'd be
moving the threshold for cache exhaustion much closer to our default
configuration.

So, this is a trade off between a small expected gain and a much
larger (but hopefully rare) loss.

I'd like to see more information. What was the *average performance
gain* of all your benchmarks? I don't think anyone is interested in
cherry-picked best cases.

A quick back-of-the-envelope calculation tells me that about 3.5% of
the code cache is occupied by trampolines and the extra bytes used by
far calls. However, many of the far calls are never needed; I don't
have stats for that, but I'd guess about half of them. But given the
(plausible ?)  assumption that the dynamic frequency of calls is the
same as the static frequency, I wouldn't be surprised if the cost of
trampoline calls is about 2% of the total instruction count, so it'd
be nice to be rid of them if there were no cost; but there is a cost.
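
As background for the 127M figure: an AArch64 `B`/`BL` instruction encodes a signed 26-bit offset counted in 4-byte instructions, so a direct branch reaches roughly +/-128 MiB. The small Java sketch below just works through that arithmetic. The class and method names are hypothetical, and the 240M value is an assumption about the current tiered default, used only for comparison.

```java
final class BranchRangeSketch {
    static final int IMM_BITS = 26;   // width of the signed offset field in B/BL
    static final int INSN_BYTES = 4;  // the offset is counted in 4-byte instructions
    static final long MIB = 1024L * 1024L;

    /** Largest backward displacement in bytes that a direct branch can encode. */
    static long maxBackwardBytes() {
        return (1L << (IMM_BITS - 1)) * INSN_BYTES;
    }

    /** Largest forward displacement in bytes that a direct branch can encode. */
    static long maxForwardBytes() {
        return ((1L << (IMM_BITS - 1)) - 1) * INSN_BYTES;
    }

    /** True if a direct branch at address 'from' can reach address 'to'. */
    static boolean reachable(long from, long to) {
        long delta = to - from;
        return delta >= -maxBackwardBytes() && delta <= maxForwardBytes();
    }

    public static void main(String[] args) {
        System.out.println(maxBackwardBytes() / MIB);       // prints 128
        // Worst case inside a code cache: a branch from one end to the other.
        System.out.println(reachable(0, 127 * MIB - 4));     // prints true
        System.out.println(reachable(0, 240 * MIB - 4));     // prints false
    }
}
```

With a code cache no larger than the branch range, every call site can reach every target directly, which is where the saving in trampolines and far-call sequences would come from; the cost is the smaller cache discussed above.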

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at openjdk.java.net  Fri Feb 11 16:42:16 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 11 Feb 2022 16:42:16 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v21]
In-Reply-To: <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>
Message-ID: <U-Z_r06zb3PTwB5u0vU_3OsNTfe3_sZRtpcNdYl50dI=.337ae303-8e9c-4cce-95af-8ce57a4282d1@github.com>

On Fri, 11 Feb 2022 11:37:56 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set

src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 439:

> 437:   if (_rop_protection == true) {
> 438:     PreserveFramePointer = true;
> 439:   }

You need an error message for -PreserveFramePointer +UseROPProtection.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph at openjdk.java.net  Fri Feb 11 16:52:16 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 11 Feb 2022 16:52:16 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v21]
In-Reply-To: <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>
Message-ID: <_8bU9rjmxZtiKw_7zHvR5kZxEGV0zPYsmLjwwzb78Eg=.41b11771-c173-4492-bcff-400a632a5ed1@github.com>

On Fri, 11 Feb 2022 11:37:56 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set

This is looking pretty nice now. With the check for -XX:-UseFramePointer argument consistency we're done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Fri Feb 11 17:12:15 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Fri, 11 Feb 2022 17:12:15 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v21]
In-Reply-To: <_8bU9rjmxZtiKw_7zHvR5kZxEGV0zPYsmLjwwzb78Eg=.41b11771-c173-4492-bcff-400a632a5ed1@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <q7COfX45QCv3kkh7_TESYcpjBed_V0bC-ehP6IZjmIc=.83dac5d1-1da3-4c05-b11b-202ba6934d6d@github.com>
 <_8bU9rjmxZtiKw_7zHvR5kZxEGV0zPYsmLjwwzb78Eg=.41b11771-c173-4492-bcff-400a632a5ed1@github.com>
Message-ID: <E9LzReZFAb7Vn3Cxg0AG9BEWtDaR_00Kvbzo665O5Aw=.5d39ed86-c36f-4dda-822f-3ee74a38853b@github.com>

On Fri, 11 Feb 2022 16:48:33 GMT, Andrew Haley <aph at openjdk.org> wrote:

> This is looking pretty nice now. With the check for -XX:-UseFramePointer argument consistency we're done.

Excellent! I'm away all next week, so will add the check when I get back.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From hseigel at openjdk.java.net  Fri Feb 11 19:27:49 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 11 Feb 2022 19:27:49 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v4]
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <96SsHcTOA9N7kiXUsC4fgpqNYqcbh-CRbP31D7bNerg=.0d59fef5-83df-48a6-a04e-3f0819f076ba@github.com>

> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because it is cleaner than adding pragmas and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  Use new ALLOW_CALL call macro

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7248/files
  - new: https://git.openjdk.java.net/jdk/pull/7248/files/dd1820eb..abb2b0ac

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=02-03

  Stats: 91 lines in 6 files changed: 14 ins; 48 del; 29 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7248.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248

PR: https://git.openjdk.java.net/jdk/pull/7248

From dholmes at openjdk.java.net  Sat Feb 12 01:22:19 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Sat, 12 Feb 2022 01:22:19 GMT
Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and
 should be removed [v4]
In-Reply-To: <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com>
References: <xGDii7-Onzmeui72U0dCU-ZBAXy2P_mzOo-62Mc0Psg=.0c90f99c-6bdc-4d36-b153-58926565c826@github.com>
 <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com>
Message-ID: <on5uqZI_qYChKSHMKmwdKdASSLhKFmhwVMTHTe8wSSE=.9abe98fc-6dcc-4631-ad08-5a7f597845c6@github.com>

On Thu, 20 Jan 2022 15:58:09 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

>> The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. 
>> But as it's not used anywhere, removing this option from the JDK source.
>
> Bhavana-Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
> 
>  - Merge master
>  - 8239927: Product variable PrefetchFieldsAhead is unused and should be removed

Looks good and trivial.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6783

From duke at openjdk.java.net  Sat Feb 12 09:38:09 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Sat, 12 Feb 2022 09:38:09 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v9]
In-Reply-To: <TPkx1fjl2gz0Yu_QXM03bdFuGUwyFkLKXJ49doST7Ls=.9fdb866e-7550-444f-831c-56de579cf49a@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
 <TPkx1fjl2gz0Yu_QXM03bdFuGUwyFkLKXJ49doST7Ls=.9fdb866e-7550-444f-831c-56de579cf49a@github.com>
Message-ID: <tbNz_cSRkS0RS8BWTuDKurr23ebLhNJ1c3573U7qoM0=.7a22a82b-7027-4b0a-b13b-7f64c95aefa2@github.com>

On Fri, 11 Feb 2022 12:46:51 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix in response to suggestion by David Holmes
>
> Marked as reviewed by dholmes (Reviewer).

Thanks @dholmes-ora , @vnkozlov , @TobiHartmann @hseigel  for the reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From aph-open at littlepinkcloud.com  Sat Feb 12 10:23:55 2022
From: aph-open at littlepinkcloud.com (Andrew Haley)
Date: Sat, 12 Feb 2022 10:23:55 +0000
Subject: RFC: AArch64: Set Segmented CodeCache default size to 127M
In-Reply-To: <155db069-9cdd-6e90-6e02-d87be2ab204b@littlepinkcloud.com>
References: <64AB1C1E-4151-4979-BF15-CC71D00E98DB@amazon.com>
 <155db069-9cdd-6e90-6e02-d87be2ab204b@littlepinkcloud.com>
Message-ID: <3e317d33-7d48-93ae-3787-886797483d62@littlepinkcloud.com>

On 2/11/22 16:23, Andrew Haley wrote:
> A quick back-of-the-envelope calculation tells me that about 3.5% of
> the code cache is occupied by trampolines and the extra bytes used by
> far calls. However, many of the far calls

s/far calls/trampolines/

> are never needed; I don't
> have stats for that, but I'd guess about half of them. But given the
> (plausible ?)  assumption that the dynamic frequency of calls is the
> same as the static frequency, I wouldn't be surprised if the cost of
> trampoline calls is about 2% of the total instruction count, so it'd
> be nice to be rid of them if there were no cost; but there is a cost.


-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From duke at openjdk.java.net  Sat Feb 12 13:12:06 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Sat, 12 Feb 2022 13:12:06 GMT
Subject: Integrated: 8278423: ExtendedDTraceProbes should be deprecated
In-Reply-To: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
Message-ID: <z5mRD29WKTMVX2bHLYPV0ORQk49uR6uDBBUEcFKSxq8=.f7047bdd-58df-4594-9222-011a15f00c20@github.com>

On Mon, 17 Jan 2022 13:08:17 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

> Deprecated ExtendedDTraceProbes.
> Edited help messages and man pages accordingly, added the 3 flags to man pages.
> Added flag to VMDeprecatedOptions test.
> Replaced the flag with 3 flags in SDTProbesGNULinuxTest.java.
> 
> Checked that tests are not affected.

This pull request has now been integrated.

Changeset: 67077a04
Author:    Emanuel Peter <emanuel.peter at oracle.com>
Committer: David Holmes <dholmes at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/67077a04307b512219a46b6c4c274ce308ee46de
Stats:     36 lines in 5 files changed: 26 ins; 0 del; 10 mod

8278423: ExtendedDTraceProbes should be deprecated

Reviewed-by: dholmes, hseigel, kvn, thartmann

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From dholmes at openjdk.java.net  Sat Feb 12 13:53:07 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Sat, 12 Feb 2022 13:53:07 GMT
Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v9]
In-Reply-To: <tbNz_cSRkS0RS8BWTuDKurr23ebLhNJ1c3573U7qoM0=.7a22a82b-7027-4b0a-b13b-7f64c95aefa2@github.com>
References: <yhVStXBDm8GKl5mKRR7U_3MkHfwPDjioaF6nPCL4uE0=.119cbf79-2533-424f-8ae2-065c672e794c@github.com>
 <prQrZIYCsK6lQEF8HSETJXty2SaX8plS-K5It2Hrd7M=.42157e4c-a5cf-4ccd-8a68-14fbb44c9b0c@github.com>
 <TPkx1fjl2gz0Yu_QXM03bdFuGUwyFkLKXJ49doST7Ls=.9fdb866e-7550-444f-831c-56de579cf49a@github.com>
 <tbNz_cSRkS0RS8BWTuDKurr23ebLhNJ1c3573U7qoM0=.7a22a82b-7027-4b0a-b13b-7f64c95aefa2@github.com>
Message-ID: <1F1lA8O1LUcjG7jN9zKWggndsrjftGjyY9SNkGi0IQ0=.6980ebbb-58f5-4a08-97ae-ab4272e47e88@github.com>

On Sat, 12 Feb 2022 09:34:28 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> Marked as reviewed by dholmes (Reviewer).
>
> Thanks @dholmes-ora , @vnkozlov , @TobiHartmann @hseigel  for the reviews.

@eme64 the test fails in our CI as it encounters builds for which DTrace is not enabled - see JDK-8281675.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7110

From jbhateja at openjdk.java.net  Sun Feb 13 02:55:14 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Sun, 13 Feb 2022 02:55:14 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v2]
In-Reply-To: <2TVKx_BFFyAK2ooOWKpdsEIMFzJngYxlWjbgeZ2y4Mc=.5deb2173-8107-476d-92ca-1835d69ce336@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <LQMZEAy-QU55kNt5fwFQSI8JPGuYz-nRWhuWVkKMt5c=.e5245c2f-c111-4c3d-829c-db44bca43e47@github.com>
 <2TVKx_BFFyAK2ooOWKpdsEIMFzJngYxlWjbgeZ2y4Mc=.5deb2173-8107-476d-92ca-1835d69ce336@github.com>
Message-ID: <SKEroM6QsoBV4Btj6kAemSCqqRfHT4mm33Avdy1L8l4=.fcd38193-3821-4573-8bca-300e22f875fe@github.com>

On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> The JVM currently initializes the x86 mxcsr to round to nearest even, see below in stubGenerator_x86_64.cpp: // Round to nearest (even), 64-bit mode, exceptions masked StubRoutines::x86::_mxcsr_std = 0x1F80; The above works for Math.rint which is specified to be round to nearest even. Please see: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html : section 4.8.4
> 
> The rounding mode needed for Math.round is round to positive infinity which needs a different x86 mxcsr initialization(0x5F80).

Hi @sviswa7 ,
As per JLS 17 section 15.4, Java follows the round-to-nearest policy for all floating-point operations except conversion to integer and remainder, where it uses round toward zero.

So it may not be feasible to modify the global MXCSR.RC setting. Modifying MXCSR just before the rounding and restoring its original value afterwards will not work either: an out-of-order (OOO) processor is free to reorder the LDMXCSR instruction if it is used without any barriers, so the change may also influence other floating-point operations.
I am pushing an incremental patch which vectorizes the existing rounding APIs and shows a significant gain over the existing implementation.
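
To make the difference between the two rounding rules concrete, here is a minimal sketch (the class and method names are illustrative only, not part of the patch). It compares Math.rint, which resolves ties to the even neighbour, with Math.round, which is specified to resolve ties toward positive infinity:

```java
final class RoundingTies {
    // Formats one input together with both rounding results.
    static String describe(double x) {
        // Math.rint: nearest integer, ties go to the even neighbour.
        // Math.round: nearest long, ties go toward positive infinity.
        return x + " rint=" + Math.rint(x) + " round=" + Math.round(x);
    }

    public static void main(String[] args) {
        double[] ties = {2.5, 3.5, -2.5, -3.5};
        for (double x : ties) {
            System.out.println(describe(x));
        }
    }
}
```

The two methods disagree on 2.5 and on -3.5, which is why the default MXCSR setting that serves Math.rint cannot simply be reused for Math.round.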

Best Regards,
Jatin

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Sun Feb 13 03:09:43 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Sun, 13 Feb 2022 03:09:43 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> Following are the performance number of a JMH micro included with the patch 
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:

 - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Adding a test for scalar intrinsification.
 - 8279508: Auto-vectorize Math.round API

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/575d2935..2dc364fa

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=01-02

  Stats: 33695 lines in 1192 files changed: 23243 ins; 5703 del; 4749 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Sun Feb 13 03:23:06 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sun, 13 Feb 2022 03:23:06 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
Message-ID: <HY93TdvVThuY7yV2NPR7CqRW7EjDEffrRo2mHd1wTpM=.a0bc8b13-cb9d-4ef0-a151-2aa231e0c1fc@github.com>

On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Sun Feb 13 05:18:34 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sun, 13 Feb 2022 05:18:34 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v3]
In-Reply-To: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
Message-ID: <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>

> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
> 
> Thank you very much.

Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:

  missing ForceInline

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7358/files
  - new: https://git.openjdk.java.net/jdk/pull/7358/files/8028be52..cf78527b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=01-02

  Stats: 10 lines in 2 files changed: 6 ins; 1 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7358.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7358/head:pull/7358

PR: https://git.openjdk.java.net/jdk/pull/7358

From duke at openjdk.java.net  Sun Feb 13 05:18:36 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sun, 13 Feb 2022 05:18:36 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <Mw8wnUYFgKMIS124_zvUI4wtV_s7YHydLEwiwkOC0Fw=.72a1e123-3272-40d7-8d95-746fa3da061b@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
 <Mw8wnUYFgKMIS124_zvUI4wtV_s7YHydLEwiwkOC0Fw=.72a1e123-3272-40d7-8d95-746fa3da061b@github.com>
Message-ID: <RhoQzZ82rJ1zpohq2Z-8JEuJCA-hM46UXwz5H-GOchU=.d5c06efe-41d1-404d-8a9a-bef65102dd32@github.com>

On Thu, 10 Feb 2022 18:55:29 GMT, Paul Sandoz <psandoz at openjdk.org> wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> Observing the following failures on CPUs with "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:
> 
> -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation
> 
> 
> TestVectorCastAVX512.java:
> 
> Failed IR Rules (1)
> ------------------
> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastI2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>     - counts: Graph contains wrong number of nodes:
>         Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
>         Expected 1 but found 0 nodes.
> 
> 
> TestVectorCastAVX1.java:
> 
> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>     - counts: Graph contains wrong number of nodes:
>         Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>         Expected 1 but found 0 nodes.
> 
> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>     - counts: Graph contains wrong number of nodes:
>         Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>         Expected 1 but found 0 nodes.

@PaulSandoz Thanks a lot for your testing. The cause seems to be that `LaneType::asIntegral` was missing the `ForceInline` annotation. I have run the reshape test 10 times without any failure, whereas with the previous patch there were often 1 or 2 failures.
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From duke at openjdk.java.net  Sun Feb 13 08:39:11 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sun, 13 Feb 2022 08:39:11 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
Message-ID: <MdXxpQTvJnax2dacUVyo4VxKXA9UwCQ0qe67mNwmHgs=.e9714016-6e63-43e5-a1dc-3af6e46d21e9@github.com>

On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

Also, it seems you have tried using `roundss/sd/ps/pd` followed by a cast to correct the rounding behaviour but decided to take another approach. Some comments around the functions explaining why that is so would be preferable. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From aph at openjdk.java.net  Sun Feb 13 11:01:06 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Sun, 13 Feb 2022 11:01:06 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
Message-ID: <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>

On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:

> 4064: }
> 4065: 
> 4066: void C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, XMMRegister src, XMMRegister xtmp1,

What does this do? Comment, even pseudo code, would be nice.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Sun Feb 13 13:12:07 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Sun, 13 Feb 2022 13:12:07 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
Message-ID: <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>

On Sun, 13 Feb 2022 10:58:19 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>> 
>>  - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>>  - 8279508: Adding a test for scalar intrinsification.
>>  - 8279508: Auto-vectorize Math.round API
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
> 
>> 4064: }
>> 4065: 
>> 4066: void C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, XMMRegister src, XMMRegister xtmp1,
> 
> What does this do? Comment, even pseudo code, would be nice.

> Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.

Hi @merykitty, you are correct: we can embed the RC mode in the encoding of the round instructions (toward -inf, +inf, zero). But to match the semantics of the Math.round API one needs to add 0.5[f] to the input value and then round the resulting value, which is why @sviswa7 suggested using a global rounding mode driven by MXCSR.RC, so that inexact intermediate floating-point values are also resolved as desired. However, OOO execution may misplace LDMXCSR and hence have undesired side effects.
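
The problem with inexact intermediate values can be shown with a small sketch. The helper `naiveRound` below is hypothetical and only illustrates the "add 0.5 and floor" formulation under the default round-to-nearest mode; it is not code from the patch:

```java
final class NaiveRoundSketch {
    // Hypothetical helper: add 0.5f, then floor. The float addition is
    // itself rounded to nearest-even before floor ever sees the value.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        float x = Math.nextDown(0.5f); // largest float strictly below 0.5
        // x + 0.5f lies exactly halfway between two floats and rounds up to 1.0f,
        // so the naive version returns 1 while Math.round returns 0.
        System.out.println("naive = " + naiveRound(x));
        System.out.println("Math.round = " + Math.round(x));
    }
}
```

With the addition performed under round toward negative infinity the sum would stay below 1.0f, which is the effect a global MXCSR.RC change was meant to achieve.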

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Sun Feb 13 13:16:16 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Sun, 13 Feb 2022 13:16:16 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
 <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
Message-ID: <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>

On Sun, 13 Feb 2022 13:08:41 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
>> 
>>> 4064: }
>>> 4065: 
>>> 4066: void C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, XMMRegister src, XMMRegister xtmp1,
>> 
>> What does this do? Comment, even pseudo code, would be nice.
>
>> Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.
> 
> Hi @merykitty ,  You are correct, we can embed RC mode in instruction encoding of round instruction (towards -inf,+inf, zero). But to match the semantics of Math.round API one needs to add 0.5[f] to input value and then perform rounding over resultant value, which is why @sviswa7 suggested to use a global rounding mode driven by MXCSR.RC so that intermediate floating inexact values also are resolved as desired, but OOO execution may misplace LDMXCSR and hence may have undesired side effects.

> What does this do? Comment, even pseudo code, would be nice.

Thanks @theRealAph, I shall add comments to the routine.
BTW, the entire rounding algorithm can also be implemented using the Vector API, which can perform if-conversion using masked operations.

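import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

class roundf {
   public static VectorSpecies<Integer> ISPECIES = IntVector.SPECIES_512;
   public static VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;

   // Assumes a.length is a multiple of SPECIES.length() and r.length >= a.length.
   public static int round_vector(float[] a, int[] r, int ctr) {
      // significand width (24) - 2 + exponent bias (127)
      IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
      for (int i = 0; i < a.length; i += SPECIES.length()) {
         FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
         IntVector iv = fv.reinterpretAsInts();
         // extract the biased exponent of each lane
         IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
         biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
         IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
         // cond: 0 <= shift < 32, i.e. the bit-manipulation path applies
         VectorMask<Integer> cond = shiftV.lanewise(VectorOperators.AND, -32)
               .compare(VectorOperators.EQ, 0);
         // significand with the implicit leading one restored
         IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
               .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
         // negate the significand for negative inputs on that path
         VectorMask<Integer> cond1 = iv.compare(VectorOperators.LT, 0);
         VectorMask<Integer> cond2 = cond1.and(cond);
         res = res.lanewise(VectorOperators.NEG, cond2);
         res = res.lanewise(VectorOperators.ASHR, shiftV)
               .lanewise(VectorOperators.ADD, 1)
               .lanewise(VectorOperators.ASHR, 1);
         // all other lanes fall back to the plain float-to-int cast
         res = fv.convert(VectorOperators.F2I, 0)
               .reinterpretAsInts()
               .blend(res, cond);
         res.intoArray(r, i);
      }
      return r[ctr];
   }
}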
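
For reference, the per-lane logic of the vector loop above corresponds to the following scalar sketch, in which the two masks become ordinary branches. The class and method names are illustrative and not taken from the patch; the constants are the same ones used in the vector code.

```java
final class RoundScalarSketch {
    // Scalar counterpart of one lane of the vector loop (illustrative name).
    static int roundScalar(float a) {
        int bits = Float.floatToRawIntBits(a);
        int biasedExp = (bits & 0x7F800000) >> 23;
        // significand width (24) - 2 + exponent bias (127)
        int shift = (24 - 2 + 127) - biasedExp;
        if ((shift & -32) == 0) {            // 0 <= shift < 32 ("cond" in the vector code)
            // significand with the implicit leading one restored
            int r = (bits & 0x007FFFFF) | (0x007FFFFF + 1);
            if (bits < 0) {                  // negative input ("cond1")
                r = -r;
            }
            // shift down to one fractional bit, add it in, drop it
            return ((r >> shift) + 1) >> 1;
        }
        // very large, very small, infinite or NaN inputs: plain cast
        return (int) a;
    }

    public static void main(String[] args) {
        float[] samples = {0.5f, -0.5f, 2.5f, -2.5f, 1e10f, Float.NaN};
        for (float f : samples) {
            System.out.println(f + " -> " + roundScalar(f) + " (Math.round: " + Math.round(f) + ")");
        }
    }
}
```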

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Mon Feb 14 01:37:09 2022
From: duke at openjdk.java.net (Bhavana-Kilambi)
Date: Mon, 14 Feb 2022 01:37:09 GMT
Subject: Integrated: 8239927: Product variable PrefetchFieldsAhead is unused
 and should be removed
In-Reply-To: <xGDii7-Onzmeui72U0dCU-ZBAXy2P_mzOo-62Mc0Psg=.0c90f99c-6bdc-4d36-b153-58926565c826@github.com>
References: <xGDii7-Onzmeui72U0dCU-ZBAXy2P_mzOo-62Mc0Psg=.0c90f99c-6bdc-4d36-b153-58926565c826@github.com>
Message-ID: <izBATc0xYLWaASqHTK3OVq2a2JIGUGD1yUJkyVg7lcY=.db84a708-296f-4147-ad89-eeb0e5ee12e6@github.com>

On Thu, 9 Dec 2021 11:51:05 GMT, Bhavana-Kilambi <duke at openjdk.java.net> wrote:

> The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. 
> But as it's not used anywhere, removing this option from the JDK source.

This pull request has now been integrated.

Changeset: adbe0661
Author:    Bhavana Kilambi <bhavana.kilambi at arm.com>
Committer: Ningsheng Jian <njian at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/adbe0661029f12a36a44af52b83b189384d33a27
Stats:     13 lines in 3 files changed: 1 ins; 10 del; 2 mod

8239927: Product variable PrefetchFieldsAhead is unused and should be removed

Reviewed-by: njian, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/6783

From ioi.lam at oracle.com  Mon Feb 14 06:07:16 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 13 Feb 2022 22:07:16 -0800
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
 <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
 <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>
Message-ID: <3a76d11a-6816-5179-5a32-fd87e94ae90a@oracle.com>

On 2/8/2022 3:32 AM, Severin Gehwolf wrote:
> On Mon, 2022-02-07 at 22:29 -0800, Ioi Lam wrote:
>> On 2022/02/07 10:36, Severin Gehwolf wrote:
>>> On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
>>>> Case (4) is the cause for the bug in JDK-8279484
>>>>
>>>> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2.
>>>> This means:
>>>>
>>>> - This container is guaranteed a minimum amount of CPU resources
>>>> - If no other containers are executing, this container can use as
>>>>   much CPU as available on the host
>>>> - If other containers are executing, the amount of CPU available
>>>>   to this container is (2 / (sum of cpu.shares of all active
>>>>   containers))
>>>>
>>>>
>>>> The fundamental problem with the current JVM implementation is that it
>>>> treats "CPU request" as a maximum value, the opposite of what Kubernetes
>>>> does. Because of this, in case (4), the JVM artificially limits itself
>>>> to a single CPU. This leads to CPU underutilization.
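
The arithmetic behind case (4) can be sketched as follows. This is only an illustration under assumptions: it supposes that the JVM divides cpu.shares by the 1024 constant mentioned in this thread, rounds up, and caps the result at the host CPU count, and that a value of exactly 1024 is treated as "not set". All names are hypothetical and this is not the HotSpot source.

```java
final class CpuSharesSketch {
    // Assumed shares-per-CPU constant (the "1024" discussed in this thread).
    static final int PER_CPU_SHARES = 1024;

    // Hypothetical helper: CPU count derived from cpu.shares alone.
    static int cpusFromShares(int cpuShares, int hostCpus) {
        if (cpuShares <= 0 || cpuShares == PER_CPU_SHARES) {
            return hostCpus; // assumption: treated as "no shares configured"
        }
        int shareCount = (cpuShares + PER_CPU_SHARES - 1) / PER_CPU_SHARES; // round up
        return Math.min(shareCount, hostCpus);
    }

    public static void main(String[] args) {
        // cpu.shares = 2 on a 50-CPU host: the share-based count collapses to 1.
        System.out.println(cpusFromShares(2, 50));
    }
}
```

Under these assumptions a share value of 2 yields one CPU regardless of how idle the host is, which is the underutilization described above.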
>>> I agree with your analysis. Key point is that in such a setup
>>> Kubernetes sets CPU shares value to 2. Though, it's a very specific
>>> case.
>>>
>>> In contrast to Kubernetes the JVM doesn't have insight into what other
>>> containers are doing (or how they are configured). It would, perhaps,
>>> be good to know what Kubernetes does for containers when the
>>> environment (i.e. other containers) changes. Do they get restarted?
>>> Restarted with different values for cpu shares?
>> My understanding is that Kubernetes will try to do load balancing and
>> may migrate the containers. According to this:
>>
>> https://stackoverflow.com/questions/64891872/kubernetes-dynamic-configurationn-of-cpu-resource-limit
>>
>> If you change the CPU limits, a currently running container will be shut
>> down and restarted (using the new limit), and may be relocated to a
>> different host if necessary.
>>
>> I think this means that a JVM process doesn't need to worry about the
>> CPU limit changing during its lifetime :-)
>>> Either way, what are our options to fix this? Does it need fixing?
>>>
>>>   * Should we no longer take cpu shares as a means to limit CPU into
>>>     account? It would be a significant change to how previous JDKs
>>>     worked. Maybe that wouldn't be such a bad idea :)
>> I think we should get rid of it. This feature was designed to work with
>> Kubernetes, but has no effect in most cases. The only time it takes
>> effect (when no resource limits are set) it does the opposite of what
>> the user expects.
> I tend to agree. We should start with a CSR review of this, though, as
> it would be a behavioural change as compared to previous versions of
> the JDK.

Hi Severin,

Sorry for the delay. I've created a CSR. Could you take a look?

https://bugs.openjdk.java.net/browse/JDK-8281571

>
>> Also, the current implementation is really tied to specific behaviors of
>> Kubernetes + docker (the 1024 and 100 constants). This will cause
>> problems with other container/orchestration software that use different
>> algorithms and constants.
> There are other container orchestration frameworks, like Mesos, which
> behave in a similar way (1024 constant is being used). The good news is
> that mesos seems to have moved to a hard-limit default. See:
>
> https://mesos.apache.org/documentation/latest/quota/#deprecated-quota-guarantees
>
>>>   * How likely is CPU underutilization to happen in practice?
>>>     Considering the container is not the only container on the node,
>>>     then according to your formula, it'll get one CPU or less anyway.
>>>     Underutilization would, thus, only happen when it's an idle node
>>>     with no other containers running. That would suggest to do nothing
>>>     and let the user override it as they see fit.
>> I think under utilization happens when the containers have a bursty
>> usage pattern. If other containers do not fully utilize their CPU
>> quotas, we should distribute the unused CPUs to the busy containers.
> Right, but this isn't really something the JVM process should care
> about. It's really a core feature of the orchestration framework to do
> that. All we could do is to not limit CPU for those cases. On the other
> hand there is the risk of resource starvation too. Consider a node with
> many cores, 50 say, and a very small cpu share setting via container
> limits. The experience running a JVM application in such a set up would
> be very mediocre as the JVM thinks it can use 50 cores (100% of the
> time), yet it would only get this when the rest of the
> containers/universe is idle.

I think we have a general problem that's not specific to containers. If
we are running 50 active Java processes on a bare-bones Linux, then each
of them would by default use a 50-thread ForkJoinPool. If each process
is given an equal amount of CPU resources, it would make sense for each
of them to have a single-thread FJP so we can avoid all thread context
switching.

Or, maybe the Linux kernel is already good enough? If each process is 
bound to a single physical CPU, context switching between the threads of 
the same process should be pretty lightweight. It would be worthwhile 
writing a test case ....
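
A starting point for such a test could look like the sketch below. It only shows how a process could size its own pool from an even split of the host CPUs instead of relying on the default. The helper name and the even-split policy are assumptions made for illustration, not an existing JDK mechanism.

```java
import java.util.concurrent.ForkJoinPool;

final class PoolSizingSketch {
    // Hypothetical policy: split the host CPUs evenly, never below one thread.
    static int threadsPerProcess(int hostCpus, int activeProcesses) {
        return Math.max(1, hostCpus / activeProcesses);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors = " + cpus);
        System.out.println("common pool parallelism = " + ForkJoinPool.getCommonPoolParallelism());

        // 50 equally weighted processes on a 50-CPU host: one worker each.
        ForkJoinPool pool = new ForkJoinPool(threadsPerProcess(50, 50));
        System.out.println("explicit pool parallelism = " + pool.getParallelism());
        pool.shutdown();
    }
}
```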

Thanks
- Ioi


>
> Thanks,
> Severin
>


From david.holmes at oracle.com  Mon Feb 14 07:02:17 2022
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 14 Feb 2022 17:02:17 +1000
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <3a76d11a-6816-5179-5a32-fd87e94ae90a@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
 <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
 <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>
 <3a76d11a-6816-5179-5a32-fd87e94ae90a@oracle.com>
Message-ID: <0d081302-9dfb-3e48-13c0-8ee151bfb626@oracle.com>

On 14/02/2022 4:07 pm, Ioi Lam wrote:
> On 2/8/2022 3:32 AM, Severin Gehwolf wrote:
>> On Mon, 2022-02-07 at 22:29 -0800, Ioi Lam wrote:
>>> On 2022/02/07 10:36, Severin Gehwolf wrote:
>>>> On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
>>>>> Case (4) is the cause for the bug in JDK-8279484
>>>>>
>>>>> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares 
>>>>> to 2.
>>>>> This means:
>>>>>
>>>>> - This container is guaranteed a minimum amount of CPU resources
>>>>> - If no other containers are executing, this container can use as
>>>>>   much CPU as available on the host
>>>>> - If other containers are executing, the amount of CPU available
>>>>>   to this container is (2 / (sum of cpu.shares of all active
>>>>>   containers))
>>>>>
>>>>>
>>>>> The fundamental problem with the current JVM implementation is that it
>>>>> treats "CPU request" as a maximum value, the opposite of what 
>>>>> Kubernetes
>>>>> does. Because of this, in case (4), the JVM artificially limits itself
>>>>> to a single CPU. This leads to CPU underutilization.
>>>> I agree with your analysis. Key point is that in such a setup
>>>> Kubernetes sets CPU shares value to 2. Though, it's a very specific
>>>> case.
>>>>
>>>> In contrast to Kubernetes the JVM doesn't have insight into what other
>>>> containers are doing (or how they are configured). It would, perhaps,
>>>> be good to know what Kubernetes does for containers when the
>>>> environment (i.e. other containers) changes. Do they get restarted?
>>>> Restarted with different values for cpu shares?
>>> My understanding is that Kubernetes will try to do load balancing and
>>> may migrate the containers. According to this:
>>>
>>> https://stackoverflow.com/questions/64891872/kubernetes-dynamic-configurationn-of-cpu-resource-limit 
>>>
>>>
>>> If you change the CPU limits, a currently running container will be shut
>>> down and restarted (using the new limit), and may be relocated to a
>>> different host if necessary.
>>>
>>> I think this means that a JVM process doesn't need to worry about the
>>> CPU limit changing during its lifetime :-)
>>>> Either way, what are our options to fix this? Does it need fixing?
>>>>
>>>>   * Should we no longer take cpu shares as a means to limit CPU into
>>>>     account? It would be a significant change to how previous JDKs
>>>>     worked. Maybe that wouldn't be such a bad idea :)
>>> I think we should get rid of it. This feature was designed to work with
>>> Kubernetes, but has no effect in most cases. The only time it takes
>>> effect (when no resource limits are set) it does the opposite of what
>>> the user expects.
>> I tend to agree. We should start with a CSR review of this, though, as
>> it would be a behavioural change as compared to previous versions of
>> the JDK.
> 
> Hi Severin,
> 
> Sorry for the delay. I've created a CSR. Could you take a look?
> 
> https://bugs.openjdk.java.net/browse/JDK-8281571
> 
>>
>>> Also, the current implementation is really tied to specific behaviors of
>>> Kubernetes + docker (the 1024 and 100 constants). This will cause
>>> problems with other container/orchestration software that use different
>>> algorithms and constants.
>> There are other container orchestration frameworks, like Mesos, which
>> behave in a similar way (1024 constant is being used). The good news is
>> that mesos seems to have moved to a hard-limit default. See:
>>
>> https://mesos.apache.org/documentation/latest/quota/#deprecated-quota-guarantees 
>>
>>
>>>>   * How likely is CPU underutilization to happen in practise?
>>>>     Considering the container is not the only container on the node,
>>>>     then according to your formula, it'll get one CPU or less anyway.
>>>>     Underutilization would, thus, only happen when it's an idle node
>>>>     with no other containers running. That would suggest to do nothing
>>>>     and let the user override it as they see fit.
>>> I think under utilization happens when the containers have a bursty
>>> usage pattern. If other containers do not fully utilize their CPU
>>> quotas, we should distribute the unused CPUs to the busy containers.
>> Right, but this isn't really something the JVM process should care
>> about. It's really a core feature of the orchestration framework to do
>> that. All we could do is to not limit CPU for those cases. On the other
>> hand there is the risk of resource starvation too. Consider a node with
>> many cores, 50 say, and a very small cpu share setting via container
>> limits. The experience running a JVM application in such a set up would
>> be very mediocre as the JVM thinks it can use 50 cores (100% of the
>> time), yet it would only get this when the rest of the
>> containers/universe is idle.
> 
> I think we have a general problem that's not specific to containers. If 
> we are running 50 active Java processes on a bare-bone Linux, then each 
> of them would by default use a 50-thread ForkJoinPool. If each process 
> is given an equal amount of CPU resources, it would make sense for each 
> of them to have a single thread FJP so we can avoid all thread context 
> switching.

The JVM cannot optimise this situation because it has no knowledge of 
the system, its load, or the workload characteristics. It also doesn't 
know how the scheduler may apportion CPU resources. Sizing heuristics 
within the JDK itself are pretty basic. If the user/deployer has better 
knowledge of what would constitute an "optimum" configuration then they 
have control knobs (system properties, VM flags) they can use to 
implement that.
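
As a rough illustration of those knobs, here is a minimal sketch. The 
class name `SizingKnobs` and the helper `explicitPool` are hypothetical; 
the HotSpot flag `-XX:ActiveProcessorCount=N` and the system property 
`java.util.concurrent.ForkJoinPool.common.parallelism` are the existing 
overrides it assumes a deployer would reach for.

```java
import java.util.concurrent.ForkJoinPool;

class SizingKnobs {
    // Hypothetical helper: a pool whose parallelism the deployer picks explicitly.
    static ForkJoinPool explicitPool(int parallelism) {
        return new ForkJoinPool(parallelism);
    }

    public static void main(String[] args) {
        // What the JVM believes it may use; can be overridden with
        // -XX:ActiveProcessorCount=N on the command line.
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());

        // Derived from the CPU count unless overridden with
        // -Djava.util.concurrent.ForkJoinPool.common.parallelism=N.
        System.out.println("common pool parallelism = "
                + ForkJoinPool.commonPool().getParallelism());

        // An application can also bypass the common pool entirely.
        ForkJoinPool single = explicitPool(1);
        System.out.println("explicit pool parallelism = " + single.getParallelism());
        single.shutdown();
    }
}
```

Running it once with no flags and once with, say, 
`-XX:ActiveProcessorCount=1` should show how the defaults follow the 
CPU count the JVM reports.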

> Or, maybe the Linux kernel is already good enough? If each process is 
> bound to a single physical CPU, context switching between the threads of 
> the same process should be pretty lightweight. It would be worthwhile 
> writing a test case ....

Binding a process to a single CPU would be potentially very bad for some 
workloads. Neither end-point is likely to be "best" in general.

Cheers,
David

> 
> Thanks
> - Ioi
> 
> 
>>
>> Thanks,
>> Severin
>>
> 

From shade at openjdk.java.net  Mon Feb 14 08:06:19 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 14 Feb 2022 08:06:19 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <tI_-aI8q7kpKOb1gyi8NmvJ01vdAGuecSvLeyq3BhaE=.65a9a861-89da-4084-a718-a5db02c10d8b@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

Anyone? :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From aph at openjdk.java.net  Mon Feb 14 09:16:10 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 14 Feb 2022 09:16:10 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
 <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
 <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>
Message-ID: <EKNLSOdo1lrE-PbeZZ9YC9LonG9FhmDitLo_Wa60vYk=.c4324df4-3b7d-4b12-939e-e74de8e6ae76@github.com>

On Sun, 13 Feb 2022 13:12:35 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>>> Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.
>> 
>> Hi @merykitty ,  You are correct, we can embed RC mode in instruction encoding of round instruction (towards -inf,+inf, zero). But to match the semantics of Math.round API one needs to add 0.5[f] to input value and then perform rounding over resultant value, which is why @sviswa7 suggested to use a global rounding mode driven by MXCSR.RC so that intermediate floating inexact values are resolved as desired, but OOO execution may misplace LDMXCSR and hence may have undesired side effects.
>
>> What does this do? Comment, even pseudo code, would be nice.
> 
> Thanks @theRealAph , I shall append the comments over the routine.
> BTW, entire rounding algorithm can also be implemented using  Vector API which can perform if-conversion using masked operations.
> 
> class roundf {
>    public static VectorSpecies ISPECIES = IntVector.SPECIES_512;
>    public static VectorSpecies SPECIES = FloatVector.SPECIES_512;
> 
>    public static int round_vector(float[] a, int[] r, int ctr) {
>       IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
>       for (int i = 0; i < a.length; i += SPECIES.length()) {
>          FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
>          IntVector iv = fv.reinterpretAsInts();
>          IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
>          biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
>          IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
>          VectorMask cond = shiftV.lanewise(VectorOperators.AND, -32)
>                .compare(VectorOperators.EQ, 0);
>          IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
>                .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
>          VectorMask cond1 = iv.compare(VectorOperators.LT, 0);
>          VectorMask cond2 = cond1.and(cond);
>          res = res.lanewise(VectorOperators.NEG, cond2);
>          res = res.lanewise(VectorOperators.ASHR, shiftV)
>                .lanewise(VectorOperators.ADD, 1)
>                .lanewise(VectorOperators.ASHR, 1);
>          res = fv.convert(VectorOperators.F2I, 0)
>                .reinterpretAsInts()
>                .blend(res, cond);
>          res.intoArray(r, i);
>       }
>       return r[ctr];
>    }
> }

That pseudocode would make a very useful comment too. This whole patch is very thinly commented.
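
For comparison with the quoted vector routine, here is a minimal scalar sketch of what appear to be the same bit-level steps. The class and method names (`RoundSketch`, `roundScalar`) are hypothetical and not part of the patch; the constants are copied from the quoted code (`0x7F800000`, `0x007FFFFF`, `24 - 2 + 127`).

```java
class RoundSketch {
    // Hypothetical helper: scalar form of the per-lane work in the quoted loop.
    static int roundScalar(float a) {
        int bits = Float.floatToRawIntBits(a);
        // Biased exponent lives in bits 23..30.
        int biasedExp = (bits & 0x7F800000) >> 23;
        // Right shift that leaves the value scaled by two (one extra bit for rounding).
        int shift = (24 - 2 + 127) - biasedExp;
        if ((shift & -32) == 0) {            // true exactly when 0 <= shift < 32
            // Fraction bits with the implicit leading one restored.
            int r = (bits & 0x007FFFFF) | (0x007FFFFF + 1);
            if (bits < 0) {
                r = -r;                      // sign bit set: negate
            }
            // floor(2 * a), then add one and halve: floor(a + 0.5).
            return ((r >> shift) + 1) >> 1;
        }
        // Very large, very small, NaN or infinite inputs: the plain cast is already right.
        return (int) a;
    }

    public static void main(String[] args) {
        float[] samples = {0.5f, -0.5f, 2.5f, -2.5f, 1e10f, Float.NaN};
        for (float f : samples) {
            System.out.println(f + " -> " + roundScalar(f)
                    + " (Math.round: " + Math.round(f) + ")");
        }
    }
}
```

In the vector version the two `if` branches become masks: `cond` selects the lanes that take the shift path, `cond2` selects the lanes that are also negative, and the final `blend` picks between the shifted result and the plain `F2I` conversion.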

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From lkorinth at openjdk.java.net  Mon Feb 14 12:08:13 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Mon, 14 Feb 2022 12:08:13 GMT
Subject: Integrated: 8281585: Remove unused imports under test/lib and jtreg/gc
In-Reply-To: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
Message-ID: <sWUohWI0UjvTZcEoutCYu2f-nmyo5fTH6iCyyy_jxEU=.3fc25386-6e28-4103-8595-7df504ae0544@github.com>

On Thu, 10 Feb 2022 15:39:53 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.

This pull request has now been integrated.

Changeset: 2604a88f
Author:    Leo Korinth <lkorinth at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/2604a88fbb6d0f9aec51c7d607ea275bc34a672c
Stats:     151 lines in 60 files changed: 0 ins; 92 del; 59 mod

8281585: Remove unused imports under test/lib and jtreg/gc

Reviewed-by: dholmes, sspitsyn

-------------

PR: https://git.openjdk.java.net/jdk/pull/7426

From lkorinth at openjdk.java.net  Mon Feb 14 12:08:12 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Mon, 14 Feb 2022 12:08:12 GMT
Subject: RFR: 8281585: Remove unused imports under test/lib and jtreg/gc
 [v2]
In-Reply-To: <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
References: <h90_aeouWu61wQWZosouJTUIwlE5rCP9fkXpjhdYSLk=.5e8ea5c8-6919-4451-97cc-9982f835d636@github.com>
 <TI1tBVjplEhphoHzgP6aj2kT2tt_GlhCiCkcpSA1f6w=.84d16506-5ed6-48fd-86f2-b7321d831444@github.com>
Message-ID: <YrQZLmQ_E0CymQDNGWvrWvdf8HetQu_r4RPZ5-vg9CU=.056b5ea7-1e62-4794-87d3-45c0beb8fd88@github.com>

On Fri, 11 Feb 2022 08:54:51 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

>> Remove unused imports under test/lib and jtreg/gc. They create lots of warnings if editing using an IDE. Tests in hotspot_gc passed.
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
> 
>   updating copyright

Thanks David and Serguei!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7426

From volker.simonis at gmail.com  Mon Feb 14 13:42:15 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 14 Feb 2022 14:42:15 +0100
Subject: Internal compiler error for slowdebug build with gcc 7.5.0 on
 Ubuntu 18.04
In-Reply-To: <CA+3eh130pAA3r2GmYZdhinWQYL+0TSTbP4B-e_ZHuPrGnBDDjg@mail.gmail.com>
References: <CA+3eh130pAA3r2GmYZdhinWQYL+0TSTbP4B-e_ZHuPrGnBDDjg@mail.gmail.com>
Message-ID: <CA+3eh13xdGaAR_qCix85hqn_Hyk-TAWYswyuTmVKBpgy5vF4Qw@mail.gmail.com>

I found the root cause of this issue. On my machine it was caused by
the fact that I've installed version `release-4.6` of systemtap like
so:
```
$ git clone git://sourceware.org/git/systemtap.git
$ cd systemtap/
$ git checkout release-4.6
$ ./configure && make         # no errors
$ sudo make install
```

This leads to the described GCC internal error (even with GCC 10.3.0):
```
gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~18.04~1)
...
during RTL pass: reload
/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp: In
static member function 'static void
CompileBroker::invoke_compiler_on_method(CompileTask*)':
/OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp:2415:1:
internal compiler error: maximum number of generated reload insns per
insn achieved (90)
 2415 | }
      | ^
Please submit a full bug report,
```

By uninstalling systemtap or by upgrading to a newer version (after
`sys/sdt.h fp constraints cont'd, x86-64 edition` [1]) the problem
goes away. That systemtap change works around the yet unfixed GCC bug
`2028798 - gcc: reload failures on x86-64 after Systemtap 4.6 upgrade`
[2] described before.

All very strange and maybe another argument for deprecating DTRACE support [3]?

[1] https://sourceware.org/git/?p=systemtap.git;a=commit;h=1d3653936fc1fd13135a723a27e6c7e959793ad0
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2028798
[3] https://bugs.openjdk.java.net/browse/JDK-8278423

On Thu, Feb 10, 2022 at 4:08 PM Volker Simonis <volker.simonis at gmail.com> wrote:
>
> Hi,
>
> When compiling the latest HS sources in slowdebug mode with gcc 7.5.0
> (the default compiler on Ubuntu 18.04) I get the following internal
> compiler error for the file compileBroker.cpp:
>
> /OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp: In
> static member function 'static void
> CompileBroker::invoke_compiler_on_method(CompileTask*)':
> /OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp:2393:1:
> internal compiler error: Max. number of generated reload insns per
> insn is achieved (90)
>
>  }
>  ^
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
>
> I know that gcc 7.5.0 isn't officially supported but was just curious
> if somebody has seen this before? Googling around shows that this
> issue seems to have been fixed several times in gcc 4.9 and
> specifically for ppc/rs6000.
>
> I've installed and tried gcc 8.4.0 but the error remains the same:
>
> GNU C++14 (Ubuntu 8.4.0-1ubuntu1~18.04) version 8.4.0 (x86_64-linux-gnu)
>     compiled by GNU C version 8.4.0, GMP version 6.1.2, MPFR version
> 4.0.1, MPC version 1.1.0, isl version isl-0.19-GMP
>
> GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
> GNU assembler version 2.30 (x86_64-linux-gnu) using BFD version (GNU
> Binutils for Ubuntu) 2.30
> Compiler executable checksum: 67fba09f596cc8a67df33f8529603bfb
> during RTL pass: reload
> /OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp: In
> static member function 'static void
> CompileBroker::invoke_compiler_on_method(CompileTask*)':
> /OpenJDK/Git/jdk/src/hotspot/share/compiler/compileBroker.cpp:2393:1:
> internal compiler error: Max. number of generated reload insns per
> insn is achieved (90)
>
>  }
>  ^
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <file:///usr/share/doc/gcc-8/README.Bugs> for instructions.
>
> According to the "Supported Build Platforms" Wiki [1] it seems that at
> least SAP is using gcc 8. Have you run into this issue as well? Any
> ideas how to fix it without upgrading to gcc 10?
>
> Thank you and best regards,
> Volker
>
> PS: the release build works perfectly fine with gcc 7.5.0
>
> [1] https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms

From vlivanov at openjdk.java.net  Mon Feb 14 13:58:30 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Mon, 14 Feb 2022 13:58:30 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
Message-ID: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>

MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.

The proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether the VM is in interpreter-only mode or not.

Testing: test/jdk/java/foreign/ w/ -Xint

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7459/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7459&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280901
  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7459.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7459/head:pull/7459

PR: https://git.openjdk.java.net/jdk/pull/7459

From mcimadamore at openjdk.java.net  Mon Feb 14 14:20:09 2022
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Mon, 14 Feb 2022 14:20:09 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <nwo22ETtlpDIBsL0VjjZtYLy_y5GQdyybjO7x2AESX4=.aae9ccc3-1acf-455b-b827-662e041de783@github.com>

On Mon, 14 Feb 2022 13:40:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> The proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether the VM is in interpreter-only mode or not.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

Thanks for the fix - maybe consider adding some extra test combinations in TestMatrix (this test is not automated, so it's not run by build and test infra).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7459

From shade at openjdk.java.net  Mon Feb 14 17:08:36 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 14 Feb 2022 17:08:36 GMT
Subject: RFR: 8281744: x86: Use short jumps in TIG::set_vtos_entry_points
Message-ID: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>

Performance in `-Xint` mode seems to be bottlenecked on the code size, rather than particular instruction hotspots, which means code density is important.

There are forward branches in `TemplateInterpreterGenerator::set_vtos_entry_points`, which cannot be shortened by `MacroAssembler`, unless we tell it specifically that the upcoming branch target would be within the 8-bit offset. Which it apparently is in this particular case, because there are just a handful of `push`-es between the jump and its target. If a jump offset is more than 8 bits, the interpreter would catch fire just about everywhere, since `set_vtos_entry_points` is used at every bytecode entry. `fastdebug` builds assert the offset sanity directly.
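
To make the size argument concrete, here is a minimal sketch of the encoding arithmetic. The class and method names are hypothetical and this is not the `MacroAssembler` code; it only assumes the standard x86 encodings, where a short `jmp`/`jcc` carries a signed 8-bit displacement and the near forms carry a 32-bit one.

```java
class ShortJumpSketch {
    // Instruction sizes in bytes for the x86 jump encodings.
    static final int JMP_REL8 = 2;   // EB cb
    static final int JMP_REL32 = 5;  // E9 cd
    static final int JCC_REL8 = 2;   // 7x cb
    static final int JCC_REL32 = 6;  // 0F 8x cd

    // True if the displacement (measured from the end of the jump
    // instruction) fits the signed 8-bit field of a short jump.
    static boolean fitsInRel8(int disp) {
        return disp >= Byte.MIN_VALUE && disp <= Byte.MAX_VALUE;
    }

    // For a forward branch the target is not yet known when the jump is
    // emitted, so the size depends on what the caller promises up front.
    static int forwardJmpSize(boolean callerPromisesShort) {
        return callerPromisesShort ? JMP_REL8 : JMP_REL32;
    }

    public static void main(String[] args) {
        System.out.println("bytes saved per jmp: " + (JMP_REL32 - JMP_REL8));
        System.out.println("bytes saved per jcc: " + (JCC_REL32 - JCC_REL8));
        System.out.println("127 fits: " + fitsInRel8(127) + ", 128 fits: " + fitsInRel8(128));
    }
}
```

Once the label is bound, a promise that turns out to be wrong has to be caught by a check like `fitsInRel8`, which is the role the `fastdebug` assert plays in the description above.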

The current patch improves `SPECjvm2008:serial` performance in `-Xint` mode by about 7% on Ryzen 7 5700G. (More perf runs pending.)

There are other places in template interpreter where forward jumps can be short, I'll do them separately, since they are riskier and also less important.

Additional testing:
 - [x] Linux x86_64 fastdebug, `tier1`
 - [x] Linux x86_32 fastdebug, `tier1`

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7463/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7463&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281744
  Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7463.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7463/head:pull/7463

PR: https://git.openjdk.java.net/jdk/pull/7463

From jbhateja at openjdk.java.net  Mon Feb 14 17:18:07 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Mon, 14 Feb 2022 17:18:07 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <EKNLSOdo1lrE-PbeZZ9YC9LonG9FhmDitLo_Wa60vYk=.c4324df4-3b7d-4b12-939e-e74de8e6ae76@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
 <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
 <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>
 <EKNLSOdo1lrE-PbeZZ9YC9LonG9FhmDitLo_Wa60vYk=.c4324df4-3b7d-4b12-939e-e74de8e6ae76@github.com>
Message-ID: <UMm_6uonBzgdEoJzE3zbiZ2MTeBRh8FWEgR1wjk-SMI=.ecc686f6-f233-4335-a810-89f540992f93@github.com>

On Mon, 14 Feb 2022 09:12:54 GMT, Andrew Haley <aph at openjdk.org> wrote:

>>> What does this do? Comment, even pseudo code, would be nice.
>> 
>> Thanks @theRealAph , I shall append the comments over the routine.
>> BTW, entire rounding algorithm can also be implemented using  Vector API which can perform if-conversion using masked operations.
>> 
>> class roundf {
>>    public static VectorSpecies ISPECIES = IntVector.SPECIES_512;
>>    public static VectorSpecies SPECIES = FloatVector.SPECIES_512;
>> 
>>    public static int round_vector(float[] a, int[] r, int ctr) {
>>       IntVector shiftVBC = (IntVector) ISPECIES.broadcast(24 - 2 + 127);
>>       for (int i = 0; i < a.length; i += SPECIES.length()) {
>>          FloatVector fv = FloatVector.fromArray(SPECIES, a, i);
>>          IntVector iv = fv.reinterpretAsInts();
>>          IntVector biasedExpV = iv.lanewise(VectorOperators.AND, 0x7F800000);
>>          biasedExpV = biasedExpV.lanewise(VectorOperators.ASHR, 23);
>>          IntVector shiftV = shiftVBC.lanewise(VectorOperators.SUB, biasedExpV);
>>          VectorMask cond = shiftV.lanewise(VectorOperators.AND, -32)
>>                .compare(VectorOperators.EQ, 0);
>>          IntVector res = iv.lanewise(VectorOperators.AND, 0x007FFFFF)
>>                .lanewise(VectorOperators.OR, 0x007FFFFF + 1);
>>          VectorMask cond1 = iv.compare(VectorOperators.LT, 0);
>>          VectorMask cond2 = cond1.and(cond);
>>          res = res.lanewise(VectorOperators.NEG, cond2);
>>          res = res.lanewise(VectorOperators.ASHR, shiftV)
>>                .lanewise(VectorOperators.ADD, 1)
>>                .lanewise(VectorOperators.ASHR, 1);
>>          res = fv.convert(VectorOperators.F2I, 0)
>>                .reinterpretAsInts()
>>                .blend(res, cond);
>>          res.intoArray(r, i);
>>       }
>>       return r[ctr];
>>    }
>> }
>
> That pseudocode would make a very useful comment too. This whole patch is very thinly commented.

> > Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.
> 
> Hi @merykitty , You are correct, we can embed RC mode in instruction encoding of round instruction (towards -inf,+inf, zero). But to match the semantics of Math.round API one needs to add 0.5[f] to input value and then perform rounding over resultant value, which is why @sviswa7 suggested to use a global rounding mode driven by MXCSR.RC so that intermediate floating inexact values are resolved as desired, but OOO execution may misplace LDMXCSR and hence may have undesired side effects.

**Just want to correct the above statement: LDMXCSR will not be re-ordered/re-scheduled early by the OOO backend.**

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From shade at openjdk.java.net  Mon Feb 14 17:19:09 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 14 Feb 2022 17:19:09 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <J50gg-J7DWAVGdWZ7XX-tblrw7cDxPqv_ZWiMFYEFH4=.7c20f65b-453e-4e8c-bffe-8e7ef59ad6e4@github.com>

On Mon, 14 Feb 2022 13:40:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> The proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether the VM is in interpreter-only mode or not.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

Looks fine to me, thanks for fixing!

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7459

From psandoz at openjdk.java.net  Mon Feb 14 19:35:10 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Mon, 14 Feb 2022 19:35:10 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v3]
In-Reply-To: <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
Message-ID: <BdB8jNfGoM7uzfoD53-axEryX0d0cKvaHUxLtIUVYDE=.c9f59d32-63e8-4d93-9cd3-ea463a7fc77b@github.com>

On Sun, 13 Feb 2022 05:18:34 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   missing ForceInline

Marked as reviewed by psandoz (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From psandoz at openjdk.java.net  Mon Feb 14 19:35:10 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Mon, 14 Feb 2022 19:35:10 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v2]
In-Reply-To: <RhoQzZ82rJ1zpohq2Z-8JEuJCA-hM46UXwz5H-GOchU=.d5c06efe-41d1-404d-8a9a-bef65102dd32@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <y5ea-XiYdEiEZ8Nv1TLY-1g22N0ONyfK-7Zdb_U4Alw=.8da05ea8-2f44-4c2a-b601-9c17d5fe6669@github.com>
 <Mw8wnUYFgKMIS124_zvUI4wtV_s7YHydLEwiwkOC0Fw=.72a1e123-3272-40d7-8d95-746fa3da061b@github.com>
 <RhoQzZ82rJ1zpohq2Z-8JEuJCA-hM46UXwz5H-GOchU=.d5c06efe-41d1-404d-8a9a-bef65102dd32@github.com>
Message-ID: <DBc-yBG5N0VyxPLAEXDLvD442bp6YPJ8mGZEw6IA5cI=.b96fc9ad-30c2-4231-88b1-54803b53c73d@github.com>

On Sun, 13 Feb 2022 05:14:47 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Observing the following failures on CPUs with "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:
>> 
>> -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation
>> 
>> 
>> TestVectorCastAVX512.java:
>> 
>> Failed IR Rules (1)
>> ------------------
>> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastI2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>>     - counts: Graph contains wrong number of nodes:
>>         Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
>>         Expected 1 but found 0 nodes.
>> 
>> 
>> TestVectorCastAVX1.java:
>> 
>> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>>     - counts: Graph contains wrong number of nodes:
>>         Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>>         Expected 1 but found 0 nodes.
>> 
>> - Method "public static void compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, applyIfAnd={}, applyIfOr={}, counts={"(\\\\d+(\\\\s){2}(VectorUCastB2X.*)+(\\\\s){2}===.*)", "1"}, applyIfNot={})"
>>     - counts: Graph contains wrong number of nodes:
>>         Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>>         Expected 1 but found 0 nodes.
>
> @PaulSandoz Thanks a lot for your testing, the reason seems to be due to `LaneType::asIntegral` missing `ForceInline` annotation. I have run the reshape test 10 times without getting any failure while with previous patch there is often 1 or 2.
> Thanks.

@merykitty testing now passes. Java bits look good. Needs HotSpot reviewer.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From hseigel at openjdk.java.net  Mon Feb 14 19:48:46 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Mon, 14 Feb 2022 19:48:46 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v5]
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <QODvIzjG2BmpGqB5fpM8ycZNCHP0CqSCUcw96-NPHoo=.5936ea58-59cd-4397-92db-e8dd006e7654@github.com>

> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because it is cleaner than adding pragmas and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  rename macro, fix semi-colon issue, fix zero lseek64 and ftruncate64 build issue

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7248/files
  - new: https://git.openjdk.java.net/jdk/pull/7248/files/abb2b0ac..d062fb50

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=03-04

  Stats: 25 lines in 6 files changed: 0 ins; 0 del; 25 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7248.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248

PR: https://git.openjdk.java.net/jdk/pull/7248

From rehn at openjdk.java.net  Mon Feb 14 20:05:11 2022
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Mon, 14 Feb 2022 20:05:11 GMT
Subject: RFR: 8281744: x86: Use short jumps in TIG::set_vtos_entry_points
In-Reply-To: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
References: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
Message-ID: <qw148epvbRJOtheOVT0EK3hhqrTqt3mPUbDQ3h_jO0w=.ae35e4a8-0d89-43e3-b719-3382fa13d4d3@github.com>

On Mon, 14 Feb 2022 15:47:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Performance in `-Xint` mode seems to be bottlenecked on the code size, rather than particular instruction hotspots, which means code density is important.
> 
> There are forward branches in `TemplateInterpreterGenerator::set_vtos_entry_points`, which cannot be shortened by `MacroAssembler`, unless we tell it specifically that the upcoming branch target would be within the 8-bit offset. Which it apparently is in this particular case, because there are just a handful of `push`-es between the jump and its target. If a jump offset is more than 8 bits, the interpreter would catch fire just about everywhere, since `set_vtos_entry_points` is used at every bytecode entry. `fastdebug` builds assert the offset sanity directly.
> 
> The current patch improves `SPECjvm2008:serial` performance in `-Xint` mode by about 7% on Ryzen 7 5700G. (More perf runs pending).
> 
> There are other places in template interpreter where forward jumps can be short, I'll do them separately, since they are riskier and also less important.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier1`

Thanks
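
To illustrate the size argument in the description above: on x86 a short jump carries a signed 8-bit displacement and encodes in 2 bytes, while the near forms need 5 bytes (`jmp`) or 6 bytes (`jcc`). The sketch below is standalone C++, not HotSpot code. All function names are made up for illustration, and `short_rel8` only imitates the kind of range check a debug build would assert.

```cpp
#include <cassert>
#include <cstdint>

// True when a branch displacement fits the signed 8-bit field of a short jump.
constexpr bool fits_in_rel8(std::int64_t disp) {
  return disp >= INT8_MIN && disp <= INT8_MAX;
}

// Encoded size of an unconditional jmp: EB rel8 (2 bytes) or E9 rel32 (5 bytes).
constexpr int jmp_size(std::int64_t disp) {
  return fits_in_rel8(disp) ? 2 : 5;
}

// Encoded size of a conditional jcc: 7x rel8 (2 bytes) or 0F 8x rel32 (6 bytes).
constexpr int jcc_size(std::int64_t disp) {
  return fits_in_rel8(disp) ? 2 : 6;
}

// Hypothetical helper: the caller has promised a short target, so check the
// promise (as a debug build would) and return the displacement byte.
inline std::int8_t short_rel8(std::int64_t disp) {
  assert(fits_in_rel8(disp) && "short jump target out of range");
  return static_cast<std::int8_t>(disp);
}
```

With a forward branch the assembler does not yet know the target, which is why it has to be told up front that the short form is safe.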

-------------

Marked as reviewed by rehn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7463

From coleenp at openjdk.java.net  Mon Feb 14 20:37:10 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 14 Feb 2022 20:37:10 GMT
Subject: RFR: 8281744: x86: Use short jumps in TIG::set_vtos_entry_points
In-Reply-To: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
References: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
Message-ID: <kouBuDj48CeZ4TvUxRQunes8yS13BTzzBZV7FLN_L0M=.c930f6de-692e-409e-8817-f37caf091dd4@github.com>

On Mon, 14 Feb 2022 15:47:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Performance in `-Xint` mode seems to be bottlenecked on the code size, rather than particular instruction hotspots, which means code density is important.
> 
> There are forward branches in `TemplateInterpreterGenerator::set_vtos_entry_points`, which cannot be shortened by `MacroAssembler`, unless we tell it specifically that the upcoming branch target would be within the 8-bit offset. Which it apparently is in this particular case, because there are just a handful of `push`-es between the jump and its target. If a jump offset is more than 8 bits, the interpreter would catch fire just about everywhere, since `set_vtos_entry_points` is used at every bytecode entry. `fastdebug` builds assert the offset sanity directly.
> 
> The current patch improves `SPECjvm2008:serial` performance in `-Xint` mode by about 7% on Ryzen 7 5700G. (More perf runs pending).
> 
> There are other places in template interpreter where forward jumps can be short, I'll do them separately, since they are riskier and also less important.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier1`

Looks good

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7463

From kvn at openjdk.java.net  Mon Feb 14 21:14:10 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Mon, 14 Feb 2022 21:14:10 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <uQedP_hvqdm8Va3yb8yUZo5uwOMM0iFs1jsjyQ8tCNA=.6796065e-c762-4a76-b441-17d84c5e0900@github.com>

On Mon, 14 Feb 2022 13:40:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> The proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether interpreter-only mode is in effect.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

Marked as reviewed by kvn (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7459

From dholmes at openjdk.java.net  Mon Feb 14 22:49:14 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 14 Feb 2022 22:49:14 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v5]
In-Reply-To: <QODvIzjG2BmpGqB5fpM8ycZNCHP0CqSCUcw96-NPHoo=.5936ea58-59cd-4397-92db-e8dd006e7654@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <QODvIzjG2BmpGqB5fpM8ycZNCHP0CqSCUcw96-NPHoo=.5936ea58-59cd-4397-92db-e8dd006e7654@github.com>
Message-ID: <IP1CgEQUri6O7ONsG1teSc_yTE5aVlHNbsI5KEC8xk8=.3e64d0e1-795d-4697-890a-0d016d4adc66@github.com>

On Mon, 14 Feb 2022 19:48:46 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas that generate compilation errors, when using gcc, if code calls a native system function instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because that is cleaner than adding pragmas and, in some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so that it does not conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
>> 
>> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
>> 
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
> 
>   rename macro, fix semi-colon issue, fix zero lseek64 and ftruncate64 build issue

Hi Harold,

This is looking better - thanks - but I think the lseek64 situation needs handling differently.

Thanks,
David

src/hotspot/os/linux/os_linux.cpp line 4924:

> 4922: }
> 4923: 
> 4924: off64_t call_lseek64(int fd, off64_t offset, int whence) {

I think it would be better to just change the `lseek64` calls to `os::lseek` rather than introduce this wrapper function.
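
A minimal sketch of the shape this suggestion points at, assuming a POSIX system: platform code goes through one shared wrapper rather than a file-local helper such as `call_lseek64`. The namespace `os_sketch`, its `lseek` signature, and `file_size_sketch` are hypothetical stand-ins for illustration; the real `os::lseek` may differ.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/types.h>
#include <unistd.h>

namespace os_sketch {
// Hypothetical shared wrapper: the only place that names the native call.
inline long long lseek(int fd, long long offset, int whence) {
  return static_cast<long long>(::lseek(fd, static_cast<off_t>(offset), whence));
}
}  // namespace os_sketch

// Platform code uses the shared wrapper instead of introducing its own.
// Returns the file size in bytes, or -1 on error; the file offset is restored.
inline long long file_size_sketch(int fd) {
  long long cur = os_sketch::lseek(fd, 0, SEEK_CUR);
  if (cur < 0) return -1;
  long long end = os_sketch::lseek(fd, 0, SEEK_END);
  if (os_sketch::lseek(fd, cur, SEEK_SET) < 0) return -1;
  return end;
}
```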

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7248

From sviswanathan at openjdk.java.net  Tue Feb 15 02:14:14 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Tue, 15 Feb 2022 02:14:14 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v3]
In-Reply-To: <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
Message-ID: <jKQnKU6fY0r2kOLCfj3lTUnmi9tpAR0JvZeDL3_zoB0=.3233faa1-5575-44bf-97d4-0d361c9b5837@github.com>

On Sun, 13 Feb 2022 05:18:34 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   missing ForceInline

Marked as reviewed by sviswanathan (Reviewer).

Hotspot changes look good to me.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From dlong at openjdk.java.net  Tue Feb 15 04:11:14 2022
From: dlong at openjdk.java.net (Dean Long)
Date: Tue, 15 Feb 2022 04:11:14 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <KAl1siyQAsPD-UwLEgMUz87wMsuJawqL2JqZVvf6QYA=.590efc87-b2fe-4e77-a2c4-b6a341355836@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

Looks good.
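
A rough sketch of why the loop alignment cannot exceed the entry alignment, under the assumption that copied code is only guaranteed to land on a `CodeEntryAlignment` boundary. This is standalone C++ with hypothetical helper names, not the `MacroAssembler::align` implementation.

```cpp
#include <cassert>
#include <cstdint>

constexpr bool is_power_of_2(std::uint64_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Round addr up to the next multiple of alignment (a power of two).
inline std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
  assert(is_power_of_2(alignment));
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical check: a loop alignment only survives relocation if every
// possible code base is aligned at least as strictly.
constexpr bool alignments_compatible(std::uint64_t loop_alignment,
                                     std::uint64_t entry_alignment) {
  return is_power_of_2(loop_alignment) && is_power_of_2(entry_alignment) &&
         loop_alignment <= entry_alignment;
}

// Address of a loop head after the enclosing code moves from old_base to
// new_base; the offset within the code is preserved.
constexpr std::uint64_t relocated(std::uint64_t old_base, std::uint64_t loop_addr,
                                  std::uint64_t new_base) {
  return new_base + (loop_addr - old_base);
}
```

For example, a loop head at address 128 is 64-byte aligned while the code sits at base 64. After a copy to base 96, which is still 32-byte aligned, the loop head lands at 160 and is no longer 64-byte aligned.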

-------------

Marked as reviewed by dlong (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7388

From ioi.lam at oracle.com  Tue Feb 15 05:20:54 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 14 Feb 2022 21:20:54 -0800
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <0d081302-9dfb-3e48-13c0-8ee151bfb626@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
 <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
 <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>
 <3a76d11a-6816-5179-5a32-fd87e94ae90a@oracle.com>
 <0d081302-9dfb-3e48-13c0-8ee151bfb626@oracle.com>
Message-ID: <99d70a98-e5f2-459e-1606-f2ee4edcfb6f@oracle.com>


On 2/13/2022 11:02 PM, David Holmes wrote:
> On 14/02/2022 4:07 pm, Ioi Lam wrote:
>> On 2/8/2022 3:32 AM, Severin Gehwolf wrote:
>>> On Mon, 2022-02-07 at 22:29 -0800, Ioi Lam wrote:
>>>> On 2022/02/07 10:36, Severin Gehwolf wrote:
>>>>> On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
>>>>>> Case (4) is the cause for the bug in JDK-8279484
>>>>>>
>>>>>> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and 
>>>>>> cpu.shares to 2.
>>>>>> This means:
>>>>>>
>>>>>> - This container is guaranteed a minimum amount of CPU resources
>>>>>> - If no other containers are executing, this container can use as
>>>>>>   much CPU as available on the host
>>>>>> - If other containers are executing, the amount of CPU available
>>>>>>   to this container is (2 / (sum of cpu.shares of all active
>>>>>>   containers))
>>>>>>
>>>>>>
>>>>>> The fundamental problem with the current JVM implementation is 
>>>>>> that it
>>>>>> treats "CPU request" as a maximum value, the opposite of what 
>>>>>> Kubernetes
>>>>>> does. Because of this, in case (4), the JVM artificially limits 
>>>>>> itself
>>>>>> to a single CPU. This leads to CPU underutilization.
>>>>> I agree with your analysis. Key point is that in such a setup
>>>>> Kubernetes sets CPU shares value to 2. Though, it's a very specific
>>>>> case.
>>>>>
>>>>> In contrast to Kubernetes the JVM doesn't have insight into what 
>>>>> other
>>>>> containers are doing (or how they are configured). It would, perhaps,
>>>>> be good to know what Kubernetes does for containers when the
>>>>> environment (i.e. other containers) changes. Do they get restarted?
>>>>> Restarted with different values for cpu shares?
>>>> My understanding is that Kubernetes will try to do load balancing and
>>>> may migrate the containers. According to this:
>>>>
>>>> https://stackoverflow.com/questions/64891872/kubernetes-dynamic-configurationn-of-cpu-resource-limit 
>>>>
>>>>
>>>> If you change the CPU limits, a currently running container will be 
>>>> shut
>>>> down and restarted (using the new limit), and may be relocated to a
>>>> different host if necessary.
>>>>
>>>> I think this means that a JVM process doesn't need to worry about the
>>>> CPU limit changing during its lifetime :-)
>>>>> Either way, what are our options to fix this? Does it need fixing?
>>>>>
>>>>>   * Should we no longer take cpu shares as a means to limit CPU into
>>>>>     account? It would be a significant change to how previous JDKs
>>>>>     worked. Maybe that wouldn't be such a bad idea :)
>>>> I think we should get rid of it. This feature was designed to work 
>>>> with
>>>> Kubernetes, but has no effect in most cases. The only time it takes
>>>> effect (when no resource limits are set) it does the opposite of what
>>>> the user expects.
>>> I tend to agree. We should start with a CSR review of this, though, as
>>> it would be a behavioural change as compared to previous versions of
>>> the JDK.
>>
>> Hi Severin,
>>
>> Sorry for the delay. I've created a CSR. Could you take a look?
>>
>> https://bugs.openjdk.java.net/browse/JDK-8281571
>>
>>>
>>>> Also, the current implementation is really tied to specific 
>>>> behaviors of
>>>> Kubernetes + docker (the 1024 and 100 constants). This will cause
>>>> problems with other container/orchestration software that use 
>>>> different
>>>> algorithms and constants.
>>> There are other container orchestration frameworks, like Mesos, which
>>> behave in a similar way (1024 constant is being used). The good news is
>>> that mesos seems to have moved to a hard-limit default. See:
>>>
>>> https://mesos.apache.org/documentation/latest/quota/#deprecated-quota-guarantees 
>>>
>>>
>>>>>   * How likely is CPU underutilization to happen in practise?
>>>>>     Considering the container is not the only container on the node,
>>>>>     then according to your formula, it'll get one CPU or less anyway.
>>>>>     Underutilization would, thus, only happen when it's an idle node
>>>>>     with no other containers running. That would suggest to do nothing
>>>>>     and let the user override it as they see fit.
>>>> I think under utilization happens when the containers have a bursty
>>>> usage pattern. If other containers do not fully utilize their CPU
>>>> quotas, we should distribute the unused CPUs to the busy containers.
>>> Right, but this isn't really something the JVM process should care
>>> about. It's really a core feature of the orchestration framework to do
>>> that. All we could do is to not limit CPU for those cases. On the other
>>> hand there is the risk of resource starvation too. Consider a node with
>>> many cores, 50 say, and a very small cpu share setting via container
>>> limits. The experience running a JVM application in such a set up would
>>> be very mediocre as the JVM thinks it can use 50 cores (100% of the
>>> time), yet it would only get this when the rest of the
>>> containers/universe is idle.
>>
>> I think we have a general problem that's not specific to containers. 
>> If we are running 50 active Java processes on a bare-bone Linux, then 
>> each of them would by default use a 50-thread ForkJoinPool. If each 
>> process is given an equal amount of CPU resources, it would make 
>> sense for each of them to have a single-thread FJP so we can avoid 
>> all thread context switching.
>
> The JVM cannot optimise this situation because it has no knowledge of 
> the system, its load, or the workload characteristics. It also doesn't 
> know how the scheduler may apportion CPU resources. Sizing heuristics 
> within the JDK itself are pretty basic. If the user/deployer has 
> better knowledge of what would constitute an "optimum" configuration 
> then they have control knobs (system properties, VM flags) they can 
> use to implement that.
>
>> Or, maybe the Linux kernel is already good enough? If each process is 
>> bound to a single physical CPU, context switching between the threads 
>> of the same process should be pretty lightweight. It would be 
>> worthwhile writing a test case ....
>
> Binding a process to a single CPU would be potentially very bad for 
> some workloads. Neither end-point is likely to be "best" in general.
>

I found some interesting numbers. I think this means we don't accomplish 
much by restricting the size of thread pools from a relatively small 
number (the number of physical CPUs, 3 digit or less) to an even smaller 
number computed by CgroupSubsystem::active_processor_count().
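
For readers following the thread, here is a simplified sketch of the sizing behaviour under discussion. It assumes the 1024 constant mentioned earlier means "1024 shares count as one CPU" and that a quota or shares value of 0 or less means "not set". The function name and the `use_shares` switch are hypothetical; this is not the actual `CgroupSubsystem::active_processor_count()` code, which has more cases.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed constant: this many cpu.shares units are treated as one CPU.
constexpr int kPerCpuShares = 1024;

// Simplified sketch: start from the host CPU count and apply each configured
// limit as an upper bound. With use_shares == false, cpu.shares is ignored,
// which is the direction proposed in this thread.
inline int active_processor_count_sketch(int host_cpus, int shares,
                                         long quota_us, long period_us,
                                         bool use_shares) {
  assert(host_cpus >= 1);
  int limit = host_cpus;
  if (quota_us > 0 && period_us > 0) {
    int quota_count = static_cast<int>(
        std::ceil(static_cast<double>(quota_us) / static_cast<double>(period_us)));
    limit = std::min(limit, quota_count);
  }
  if (use_shares && shares > 0) {
    int share_count = static_cast<int>(
        std::ceil(static_cast<double>(shares) / kPerCpuShares));
    limit = std::min(limit, share_count);
  }
  return std::max(limit, 1);
}
```

With 32 host CPUs, no quota, and `cpu.shares` set to 2, the sketch yields 1 when shares are treated as a cap and 32 when they are ignored, which is the underutilization case described for JDK-8279484.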

https://eli.thegreenplace.net/2018/measuring-context-switching-and-memory-overheads-for-linux-threads/

<quote>
[Cost for each context switch is] somewhere between 1.2 and 1.5 
microseconds per context switch ... Is 1-2 us a long time? As I have 
mentioned in the post on launch overheads, a good comparison is memcpy, 
which takes 3 us for 64 KiB on the same machine. In other words, a 
context switch is a bit quicker than copying 64 KiB of memory from one 
location to another.
...
Conclusion
The numbers reported here paint an interesting picture on the state of 
Linux multi-threaded performance in 2018. I would say that the limits 
still exist - running a million threads is probably not going to make 
sense; however, the limits have definitely shifted since the past, and a 
lot of folklore from the early 2000s doesn't apply today. On a beefy 
multi-core machine with lots of RAM we can easily run 10,000 threads in 
a single process today, in production.
</quote>

So after the proposed change, some users may be surprised, "why do I now 
have 32 threads sleeping inside my containerized app", but the actual 
CPU/memory cost would be minimal, with a large potential upside -- the 
app can run much faster when the rest of the system is quiet.

(I ran a small test on Linux x64 and the cost per Java thread is about 
90KB).
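
As a back-of-the-envelope check on these figures, the arithmetic can be sketched as below. The 32 threads and 90 KB per thread come from this message and the 1.5 microseconds per switch from the quoted article; the switch rate used in the note after the code is purely an assumed input, not a measurement. Function names are made up for illustration.

```cpp
#include <cassert>

// Memory held by idle threads, in KB.
inline long idle_thread_cost_kb(int threads, long per_thread_kb) {
  assert(threads >= 0 && per_thread_kb >= 0);
  return threads * per_thread_kb;
}

// Fraction of one CPU-second spent on context switches, given a switch rate
// (switches per second) and a cost per switch in microseconds.
inline double context_switch_fraction(double switches_per_sec, double us_per_switch) {
  assert(switches_per_sec >= 0.0 && us_per_switch >= 0.0);
  return switches_per_sec * us_per_switch / 1e6;
}
```

`idle_thread_cost_kb(32, 90)` gives 2880 KB, a little under 3 MB. At an assumed 10,000 switches per second and 1.5 microseconds each, `context_switch_fraction` gives 0.015, about 1.5% of one CPU.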

Thanks
- Ioi


From duke at openjdk.java.net  Tue Feb 15 05:41:13 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Tue, 15 Feb 2022 05:41:13 GMT
Subject: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts [v3]
In-Reply-To: <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
 <9geCUxBmjKm5HoVrV2HTlD5DSFkJX-GdvlZbPPnzIcM=.ed8260f3-eed5-4f18-9e37-c12a304e9b4e@github.com>
Message-ID: <NZaiplpKulLJ_F6BvvUxKzo6KNpu4ZHEutX4oM_1Gy0=.962e7c28-fec1-4988-babe-78c3afd7705b@github.com>

On Sun, 13 Feb 2022 05:18:34 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   missing ForceInline

Thanks a lot for your reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From david.holmes at oracle.com  Tue Feb 15 05:50:10 2022
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 15 Feb 2022 15:50:10 +1000
Subject: [RFC containers] 8281181 JDK's interpretation of CPU Shares
 causes underutilization
In-Reply-To: <99d70a98-e5f2-459e-1606-f2ee4edcfb6f@oracle.com>
References: <5636636e-3ef9-0087-f3f4-8ef15d618489@oracle.com>
 <5dbfb77029a00d67542a9104855b2d98a3d8ce5e.camel@redhat.com>
 <587acce6-dd30-1f78-caf6-17925c32cae6@oracle.com>
 <bb58a7e2bb78a824db2096716c2144fd515f8d4b.camel@redhat.com>
 <a0707923-8397-3448-5e53-ce07677a1104@oracle.com>
 <5d25e7ceeabd9186dd6fe5e9e6e04d0d11ef26c0.camel@redhat.com>
 <3a76d11a-6816-5179-5a32-fd87e94ae90a@oracle.com>
 <0d081302-9dfb-3e48-13c0-8ee151bfb626@oracle.com>
 <99d70a98-e5f2-459e-1606-f2ee4edcfb6f@oracle.com>
Message-ID: <4a9ee526-dfcd-d02f-0ec9-692a91a76d90@oracle.com>

Trimming ...

On 15/02/2022 3:20 pm, Ioi Lam wrote:
> On 2/13/2022 11:02 PM, David Holmes wrote:
>> On 14/02/2022 4:07 pm, Ioi Lam wrote:
>>> I think we have a general problem that's not specific to containers. 
>>> If we are running 50 active Java processes on a bare-bone Linux, then 
>>> each of them would by default use a 50-thread ForkJoinPool. If each 
>>> process is given an equal amount of CPU resources, it would make 
>>> sense for each of them to have a single-thread FJP so we can avoid 
>>> all thread context switching.
>>
>> The JVM cannot optimise this situation because it has no knowledge of 
>> the system, its load, or the workload characteristics. It also doesn't 
>> know how the scheduler may apportion CPU resources. Sizing heuristics 
>> within the JDK itself are pretty basic. If the user/deployer has 
>> better knowledge of what would constitute an "optimum" configuration 
>> then they have control knobs (system properties, VM flags) they can 
>> use to implement that.
>>
>>> Or, maybe the Linux kernel is already good enough? If each process is 
>>> bound to a single physical CPU, context switching between the threads 
>>> of the same process should be pretty lightweight. It would be 
>>> worthwhile writing a test case ....
>>
>> Binding a process to a single CPU would be potentially very bad for 
>> some workloads. Neither end-point is likely to be "best" in general.
>>
> 
> I found some interesting numbers. I think this means we don't accomplish 
> much by restricting the size of thread pools from a relatively small 
> number (the number of physical CPUs, 3 digit or less) to an even smaller 
> number computed by CgroupSubsystem::active_processor_count().
> 
> https://eli.thegreenplace.net/2018/measuring-context-switching-and-memory-overheads-for-linux-threads/ 
> 
> 
> <quote>
> [Cost for each context switch is] somewhere between 1.2 and 1.5 
> microseconds per context switch ... Is 1-2 us a long time? As I have 
> mentioned in the post on launch overheads, a good comparison is memcpy, 
> which takes 3 us for 64 KiB on the same machine. In other words, a 
> context switch is a bit quicker than copying 64 KiB of memory from one 
> location to another.
> ...
> Conclusion
> The numbers reported here paint an interesting picture on the state of 
> Linux multi-threaded performance in 2018. I would say that the limits 
> still exist - running a million threads is probably not going to make 
> sense; however, the limits have definitely shifted since the past, and a 
> lot of folklore from the early 2000s doesn't apply today. On a beefy 
> multi-core machine with lots of RAM we can easily run 10,000 threads in 
> a single process today, in production.
> </quote>

I agree that the under-utilization caused by the way shares is currently 
used is bad. But I don't see how the above really relates to that at 
all. The above is primarily about the RAM cost of threads - and I agree 
it's better now than it used to be, so a system can support many more 
threads than it used to. But the main issue with sizing thread pools etc 
is about effective servicing of load to either achieve throughput or 
response time goals. Too many threads, just like too many of any 
kind of worker, can be very inefficient when they just get in each 
other's way.

Cheers,
David
-----

> So after the proposed change, some users may be surprised, "why do I now 
> have 32 threads sleeping inside my containerized app", but the actual 
> CPU/memory cost would be minimal, with a large potential upside -- the 
> app can run much faster when the rest of the system is quiet.
> 
> (I ran a small test on Linux x64 and the cost per Java thread is about 
> 90KB).
> 
> Thanks
> - Ioi
> 

From shade at openjdk.java.net  Tue Feb 15 06:22:15 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 15 Feb 2022 06:22:15 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <FpLmc5wRRJAVlc-R3nyyx7_ZX6JXTBZ2LYRzkbDza4U=.2e2aa58f-6263-4ef5-a0e5-b617fd78cae3@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

Thank you!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From shade at openjdk.java.net  Tue Feb 15 06:22:17 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 15 Feb 2022 06:22:17 GMT
Subject: Integrated: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
Message-ID: <2gHgw8d9J0G5-tMO5mB-JYhTkfgt5goIFj92lzxcOlU=.37834a48-0cbc-4c6f-ab93-be942e7a640a@github.com>

On Tue, 8 Feb 2022 18:19:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
> 
> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
> 
> The default values for options are different per platform, so tests are x86_64 specific.
> 
> No default value is changed, this only unblocks experiments.
> 
> Additional testing:
>  - [x] New tests on Linux x86_64 fastdebug
>  - [x] New tests on Linux x86_64 release

This pull request has now been integrated.

Changeset: b1564624
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/b1564624ce454d0df9b2464424b7b5e449481ee6
Stats:     178 lines in 4 files changed: 176 ins; 0 del; 2 mod

8281467: Allow larger OptoLoopAlignment and CodeEntryAlignment

Reviewed-by: kvn, dlong

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From kbarrett at openjdk.java.net  Tue Feb 15 06:52:08 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 15 Feb 2022 06:52:08 GMT
Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes
In-Reply-To: <-94XsVuSzzJ68iz1GZCqu4BXZOp9OVA6t9R7tvaUIU4=.108377db-1312-45de-aafc-88b980e41778@github.com>
References: <SDUK8k1OZZU7qha8dERVEj34cNap1YPEqEv92b1hCxw=.908aecb5-b941-48c6-a690-49690f2359ad@github.com>
 <-94XsVuSzzJ68iz1GZCqu4BXZOp9OVA6t9R7tvaUIU4=.108377db-1312-45de-aafc-88b980e41778@github.com>
Message-ID: <YY6XY-T1fod0PfH4YvBSlH--QYPjtM6HcguATajdUpI=.10aa2967-b7bb-4259-b997-55b07f2a5fb6@github.com>

On Mon, 31 Jan 2022 22:14:32 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Please review this change to the HotSpot Style Guide change process.
>> 
>> The current process involves gathering consensus among the HotSpot Group
>> Members.  That's fine for changes of substance.  But it seems overly weighty
>> for editorial changes that don't affect the substance of the guide, but only
>> its clarity or accuracy.
>> 
>> The proposed change would permit the normal PR process to be used for such
>> changes, but require the requisite reviewers to additionally be HotSpot Group
>> Members.
>> 
>> Note that there have already been a couple of changes that effectively
>> followed the proposed new process.
>> https://bugs.openjdk.java.net/browse/JDK-8274169
>> https://bugs.openjdk.java.net/browse/JDK-8280182
>> 
>> This is a modification of the Style Guide, so rough consensus among the
>> HotSpot Group members is required to make this change. Only Group members
>> should vote for approval (via the github PR), though reasoned objections or
>> comments from anyone will be considered. A decision on this proposal will not
>> be made before Monday 14-Feb-2022 at 12h00 UTC.
>> 
>> Since we're piggybacking on github PRs here, please use the PR review process
>> to approve (click on Review Changes > Approve), rather than sending a "vote:
>> yes" email reply that would be normal for a CFV.
>
> Approved.

Thanks @vnkozlov and all the other reviewers.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7281

From kbarrett at openjdk.java.net  Tue Feb 15 06:54:13 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 15 Feb 2022 06:54:13 GMT
Subject: Integrated: 8280916: Simplify HotSpot Style Guide editorial changes
In-Reply-To: <SDUK8k1OZZU7qha8dERVEj34cNap1YPEqEv92b1hCxw=.908aecb5-b941-48c6-a690-49690f2359ad@github.com>
References: <SDUK8k1OZZU7qha8dERVEj34cNap1YPEqEv92b1hCxw=.908aecb5-b941-48c6-a690-49690f2359ad@github.com>
Message-ID: <i2fz9qwrsz5guuDOCMSUYrc2RNutMvryPEIIKPfirdA=.8a793817-a7a9-45e9-95fa-8cf03dfe9596@github.com>

On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to the HotSpot Style Guide change process.
> 
> The current process involves gathering consensus among the HotSpot Group
> Members.  That's fine for changes of substance.  But it seems overly weighty
> for editorial changes that don't affect the substance of the guide, but only
> its clarity or accuracy.
> 
> The proposed change would permit the normal PR process to be used for such
> changes, but require the requisite reviewers to additionally be HotSpot Group
> Members.
> 
> Note that there have already been a couple of changes that effectively
> followed the proposed new process.
> https://bugs.openjdk.java.net/browse/JDK-8274169
> https://bugs.openjdk.java.net/browse/JDK-8280182
> 
> This is a modification of the Style Guide, so rough consensus among the
> HotSpot Group members is required to make this change. Only Group members
> should vote for approval (via the github PR), though reasoned objections or
> comments from anyone will be considered. A decision on this proposal will not
> be made before Monday 14-Feb-2022 at 12h00 UTC.
> 
> Since we're piggybacking on github PRs here, please use the PR review process
> to approve (click on Review Changes > Approve), rather than sending a "vote:
> yes" email reply that would be normal for a CFV.

This pull request has now been integrated.

Changeset: 11f943d1
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/11f943d148e7bc8d931c382ff019b3e65a87432e
Stats:     13 lines in 2 files changed: 9 ins; 0 del; 4 mod

8280916: Simplify HotSpot Style Guide editorial changes

Reviewed-by: dcubed, dholmes, stuefe, stefank, kvn, tschatzl

-------------

PR: https://git.openjdk.java.net/jdk/pull/7281

From kbarrett at openjdk.java.net  Tue Feb 15 08:05:10 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 15 Feb 2022 08:05:10 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v4]
In-Reply-To: <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
Message-ID: <tZY4BTVImGYhYVVG4HOhpvQI9VlCtMhuNGU9iunYwp0=.2a2f9fc0-d3b7-4bb4-9904-af4bc01a46f7@github.com>

On Thu, 10 Feb 2022 17:23:42 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   lower case

Changes requested by kbarrett (Reviewer).

src/hotspot/share/gc/parallel/psOldGen.cpp line 48:

> 46: #else
> 47:   _expand_lock(Mutex::safepoint, "PSOldGenExpand_lock", true)
> 48: #endif

I think this is the only relative lock rank outside of mutexLocker.  I didn't realize that when I suggested the possibility of a private mutex.  I'd prefer the definition and rank calculation be left in mutexLocker.

src/hotspot/share/gc/parallel/psOldGen.cpp line 179:

> 177:     // expand.  That's okay, we'll just try expanding again.
> 178:     bool needs_expand =
> 179:       pointer_delta(object_space()->end(), object_space()->top()) < word_size;

This has lost the comment in the `needs_expand` function about the stability of end() and the associated implications for access ordering.  I'd like to keep that information.

src/hotspot/share/gc/parallel/psOldGen.cpp line 283:

> 281:   size_t size = align_down(bytes, virtual_space()->alignment());
> 282:   if (size > 0) {
> 283:     assert_lock_strong(&_expand_lock);

[pre-existing] Redundant with the assert at the beginning of the function.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From lucy at openjdk.java.net  Tue Feb 15 08:29:16 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Tue, 15 Feb 2022 08:29:16 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
Message-ID: <hdbQsNLm2usoweGnu_qCOhj6C5e1CLMw2mp9zB5KMVQ=.60c3ab86-dc05-46d9-afe8-f41c1038eb48@github.com>

On Fri, 11 Feb 2022 12:11:54 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>> 
>> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>> 
>> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or come as dependent follow-ups.
>> 
>> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
>> 
>> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
>
> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 23 additional commits since the last revision:
> 
>  - Merge branch 'master' into count_positives
>  - Restore partial vector checks in AVX2 and SSE intrinsic variants
>  - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
>  - Simplify changes to encodeUTF8
>  - Fix little-endian error caught by testing
>  - Reduce jumps in the ascii path
>  - Remove unused tail_mask
>  - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
>  - Add more comments, simplify tail branching in AVX512 variant
>  - Resolve issues in the precise implementation
>  - ... and 13 more: https://git.openjdk.java.net/jdk/compare/d4fb8919...c4bb3612

Hi Claes,
I'm working on the s390 implementation. I hoped to have it ready, but tests are failing. I'll post a PR (similar to Martin's) once I believe my work is worth looking at.

Just for clarification: the return value must be the index of the first negative byte?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From shade at openjdk.java.net  Tue Feb 15 09:50:32 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 15 Feb 2022 09:50:32 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
Message-ID: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>

Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.

Additional testing:
 - [x] Linux x86_64 fastdebug `hotspot:tier1`
 - [x] Linux x86_32 fastdebug `hotspot:tier1`

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7475/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7475&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281815
  Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7475.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7475/head:pull/7475

PR: https://git.openjdk.java.net/jdk/pull/7475

From ayang at openjdk.java.net  Tue Feb 15 10:09:55 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Tue, 15 Feb 2022 10:09:55 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v5]
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision:

 - rename
 - revert review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7124/files
  - new: https://git.openjdk.java.net/jdk/pull/7124/files/d5a2a9ca..ef078740

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=03-04

  Stats: 44 lines in 6 files changed: 25 ins; 10 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7124.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7124/head:pull/7124

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Tue Feb 15 10:09:55 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Tue, 15 Feb 2022 10:09:55 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v4]
In-Reply-To: <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <xIPIF-EkTdwIrDNmrMSSFJMeFrHIZB-A5IHrMDbvyhg=.017dbf7a-157c-4b86-bd36-76fda786a01d@github.com>
Message-ID: <qoWnYVOF7AQM5xBPieeFec0JZI105J8uA12Zbh1RpQE=.3d186fce-8937-4c27-8143-dfb5516f7e85@github.com>

On Thu, 10 Feb 2022 17:23:42 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   lower case

I have moved the lock back to `mutexLocker` and used the name `PSOldGenExpand_lock`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From kbarrett at openjdk.java.net  Tue Feb 15 10:37:06 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 15 Feb 2022 10:37:06 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v5]
In-Reply-To: <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
Message-ID: <nSuPBI9uHzgmxH7vALy6PAzZ42dv-l-Xh5V5rnpRyro=.36796f3e-6ed2-40f2-a1b3-1948b6e015e2@github.com>

On Tue, 15 Feb 2022 10:09:55 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - rename
>  - revert review

Looks good.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7124

From redestad at openjdk.java.net  Tue Feb 15 10:41:10 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Tue, 15 Feb 2022 10:41:10 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <hdbQsNLm2usoweGnu_qCOhj6C5e1CLMw2mp9zB5KMVQ=.60c3ab86-dc05-46d9-afe8-f41c1038eb48@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
 <hdbQsNLm2usoweGnu_qCOhj6C5e1CLMw2mp9zB5KMVQ=.60c3ab86-dc05-46d9-afe8-f41c1038eb48@github.com>
Message-ID: <I7-qeKjNAXtLbPz6CodawyDO9V3rUZj3dThIV3fZdhU=.6799f1c6-1579-4600-999b-9b39a2f1d39b@github.com>

On Tue, 15 Feb 2022 08:25:29 GMT, Lutz Schmidt <lucy at openjdk.org> wrote:

> Hi Claes, I'm working on the s390 implementation. 

Awesome, thanks!

> 
> Just for clarification: the return value must be the index of the first negative byte?

Yes, or the length if there are no such bytes. 

I've considered (and am still considering) writing the spec of `countPositives` to allow intrinsics to do an early return of a value that is less than the index if it's prohibitively expensive or complicated to implement the intrinsic to be precise in the case where it finds a negative byte. While it must be precise w.r.t. returning the full length if it's all positive bytes, no call site would break if the intrinsic returned 0 or some convenient number less than the first negative index (my first experiments with the x86 intrinsic did it like this, but since the semantics of the intrinsic would then differ from the java code I was asked to try and make it precise). The aarch64 algorithm is proving to be a challenge to work with and I might ask again for some leeway in a first implementation there.
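
The precise semantics and the relaxed alternative discussed above can be sketched in C++ as follows. This is only an illustration of the contract, not the Java method or any platform intrinsic; the function names are hypothetical, and "no greater than the first negative index" is my reading of the leeway described above.

```cpp
#include <cstddef>
#include <cstdint>

// Precise semantics: the number of leading non-negative bytes, i.e. the
// index of the first negative byte, or len if there is no such byte.
inline std::size_t count_positives(const std::int8_t* bytes, std::size_t len) {
  for (std::size_t i = 0; i < len; i++) {
    if (bytes[i] < 0) {
      return i;
    }
  }
  return len;
}

// Hypothetical check for the relaxed contract: the result must be exactly
// len when all bytes are non-negative; otherwise any value that does not
// exceed the index of the first negative byte is acceptable (including 0).
inline bool satisfies_relaxed_contract(std::size_t result,
                                       const std::int8_t* bytes,
                                       std::size_t len) {
  const std::size_t precise = count_positives(bytes, len);
  return precise == len ? result == len : result <= precise;
}
```

Under this reading, a caller that treats the result as "this many leading bytes are known to be ASCII" stays correct with either variant; only the amount of work left for the slow path differs.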

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From sjohanss at openjdk.java.net  Tue Feb 15 11:02:09 2022
From: sjohanss at openjdk.java.net (Stefan Johansson)
Date: Tue, 15 Feb 2022 11:02:09 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v5]
In-Reply-To: <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
Message-ID: <gzAr8xfsCUCqnezEFpBOqk1VGtr3j-bjHFT4DdIjS_E=.71ffc0f8-68d3-44d9-9f7a-2938fc457a14@github.com>

On Tue, 15 Feb 2022 10:09:55 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - rename
>  - revert review

Marked as reviewed by sjohanss (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From lucy at openjdk.java.net  Tue Feb 15 11:24:18 2022
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Tue, 15 Feb 2022 11:24:18 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
Message-ID: <G06mbYFS2Yq2nxAgOPyb7WDlcsnFRTb3dhVjf-A0hxY=.4e57344c-2a41-482e-aa7a-cd093501558f@github.com>

On Fri, 11 Feb 2022 12:11:54 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that has a ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>> 
>> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>> 
>> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding up this until it's implemented on all platforms, which can either contributed to this PR or as dependent follow-ups.
>> 
>> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
>> 
>> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
>
> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 23 additional commits since the last revision:
> 
>  - Merge branch 'master' into count_positives
>  - Restore partial vector checks in AVX2 and SSE intrinsic variants
>  - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
>  - Simplify changes to encodeUTF8
>  - Fix little-endian error caught by testing
>  - Reduce jumps in the ascii path
>  - Remove unused tail_mask
>  - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
>  - Add more comments, simplify tail branching in AVX512 variant
>  - Resolve issues in the precise implementation
>  - ... and 13 more: https://git.openjdk.java.net/jdk/compare/cee17570...c4bb3612

Well, with the existing implementations for ppc and s390, I do not see a complexity advantage with a relaxed spec. The code would have to be there anyway. 

When it comes to cost, the worst case would be an array of length n, a loop unroll factor of (u==n) and the first (and only) negative byte at index (n-1). All bytes would then be checked twice. With growing n, the overhead diminishes. After all, you want profile-based stub generation - with actual load matching the profile, of course.
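
To make the worst case concrete, here is a rough C++ model that counts byte inspections. It assumes a hypothetical two-phase scan (a block-wise test over `unroll` bytes, followed by a bytewise rescan of the block that tripped the test); the names are made up for illustration and this is not the ppc or s390 code.

```cpp
#include <cstddef>
#include <cstdint>

struct ScanResult {
  std::size_t index;        // index of the first negative byte, or len
  std::size_t inspections;  // number of byte inspections performed
};

// Hypothetical model: test `unroll` bytes at a time by OR-ing them together,
// and rescan a block byte by byte once its sign bit shows up in the OR.
inline ScanResult scan_blockwise(const std::int8_t* bytes, std::size_t len,
                                 std::size_t unroll) {
  if (unroll == 0) {
    unroll = 1;  // guard against a zero block size
  }
  ScanResult r{len, 0};
  std::size_t i = 0;
  for (; i + unroll <= len; i += unroll) {
    std::uint8_t acc = 0;
    for (std::size_t j = 0; j < unroll; j++) {
      acc = static_cast<std::uint8_t>(acc | static_cast<std::uint8_t>(bytes[i + j]));
      r.inspections++;
    }
    if (acc & 0x80) {
      // Some byte in this block is negative: find out which one.
      for (std::size_t j = 0; j < unroll; j++) {
        r.inspections++;
        if (bytes[i + j] < 0) {
          r.index = i + j;
          return r;
        }
      }
    }
  }
  // Tail that does not fill a whole block is checked bytewise.
  for (; i < len; i++) {
    r.inspections++;
    if (bytes[i] < 0) {
      r.index = i;
      return r;
    }
  }
  return r;
}
```

In this model, n = u = 8 with the only negative byte at index 7 costs 16 inspections (every byte twice), while n = 64 with u = 8 and the negative byte at index 63 costs 72, so the relative overhead shrinks as n grows.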

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From ayang at openjdk.java.net  Tue Feb 15 12:27:12 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Tue, 15 Feb 2022 12:27:12 GMT
Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
 [v5]
In-Reply-To: <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
 <r2oCRNHXjlf38DOGFfcAecIg2Rfz9_GaIy986c8qYTA=.b301c9f4-5eb5-4156-910c-027a5c22af49@github.com>
Message-ID: <ZdS5KRoRM_54KrzVHsVUJuANIxBqLEpmwVSlhTR_Eg4=.445b8d43-89f4-467a-b8e0-5971cb1153e8@github.com>

On Tue, 15 Feb 2022 10:09:55 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> This PR consists of two commits:
>> 
>> 1. remove `ExpandHeap_lock` in Serial GC code.
>> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
>> 
>> Test: tier1-6
>
> Albert Mingkun Yang has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - rename
>  - revert review

Thanks for the review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From ayang at openjdk.java.net  Tue Feb 15 12:27:12 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Tue, 15 Feb 2022 12:27:12 GMT
Subject: Integrated: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock
In-Reply-To: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
References: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com>
Message-ID: <ORScDbn_XJTxjtZtUZ-rsHWLlPYITt-V5M_r7dIXSAc=.c0fba92b-a5d2-42d5-b1d4-998f40abce8d@github.com>

On Tue, 18 Jan 2022 12:03:46 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> This PR consists of two commits:
> 
> 1. remove `ExpandHeap_lock` in Serial GC code.
> 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only.
> 
> Test: tier1-6

This pull request has now been integrated.

Changeset: bc614840
Author:    Albert Mingkun Yang <ayang at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/bc6148407e629bd99fa5a8577ebd90320610f349
Stats:     24 lines in 7 files changed: 8 ins; 3 del; 13 mod

8280136: Serial: Remove unnecessary use of ExpandHeap_lock

Reviewed-by: iwalulya, kbarrett, sjohanss

-------------

PR: https://git.openjdk.java.net/jdk/pull/7124

From redestad at openjdk.java.net  Tue Feb 15 13:45:07 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Tue, 15 Feb 2022 13:45:07 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v2]
In-Reply-To: <G06mbYFS2Yq2nxAgOPyb7WDlcsnFRTb3dhVjf-A0hxY=.4e57344c-2a41-482e-aa7a-cd093501558f@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
 <aEVwA9aHody4Vbk2M3x2KBeJyCv_VuRY2VmQzeI0EHI=.9ce9c90a-44b8-4a93-a732-62f18c3163ac@github.com>
 <G06mbYFS2Yq2nxAgOPyb7WDlcsnFRTb3dhVjf-A0hxY=.4e57344c-2a41-482e-aa7a-cd093501558f@github.com>
Message-ID: <k2MSLiT28qQu5GPEjz_dkEHAHNqdNmUOv0wWDGvlMiY=.cf0e2a97-ac71-4654-990c-1920cd7ce796@github.com>

On Tue, 15 Feb 2022 11:20:55 GMT, Lutz Schmidt <lucy at openjdk.org> wrote:

> Well, with the existing implementations for ppc and s390, I do not see a complexity advantage with a relaxed spec. The code would have to be there anyway.

Same for x86, but we could avoid going into and checking the tail on a negative byte in a vector and instead do an early return of `N * vector_size`, where `N` is the number of vectors we've checked that were all positive. This could save a few ns in some cases.

> 
> When it comes to cost, the worst case would be an array of length n, a loop unroll factor of (u==n) and the first (and only) negative byte at index (n-1). All bytes would then be checked twice. With growing n, the overhead diminishes. After all, you want profile-based stub generation - with actual load matching the profile, of course.

Sounds about right. I've explored the cost of this in a few microbenchmarks. In `StringDecode/-Encode` such double-checking would happen anyhow later on in the java code. So for most of the prominent uses such double-checking is performance neutral even in the worst case.

There are a few places where we don't productively use the count and continue to lean on a `hasNegatives` predicate which calls into `countPositives`. This will mean a small amount of useless computation on certain inputs. For the 16- and 32-byte vectors I've benchmarked extensively on x86 (AVX2) the worst case overhead landed in the vicinity of 20 cycles (7.5-15ns @ 2.4GHz). Allowing for imprecision _could_ improve a few such corner cases, but I've not found a performance sensitive place where it would really matter.
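
As a sketch of what allowing imprecision could look like, the following C++ models the early return of `N * vector_size` and a predicate layered on top of it. The 16-byte width, the names, and the scalar loop standing in for a vector test are all assumptions for illustration; this is not the x86 intrinsic.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kVectorSize = 16;  // assumed vector width in bytes

// Relaxed variant: as soon as a vector contains a negative byte, return the
// number of bytes in the fully positive vectors seen so far (N * kVectorSize)
// without locating the exact byte. The tail and the all-positive case stay
// precise.
inline std::size_t count_positives_relaxed(const std::int8_t* bytes,
                                           std::size_t len) {
  std::size_t i = 0;
  for (; i + kVectorSize <= len; i += kVectorSize) {
    std::uint8_t acc = 0;
    for (std::size_t j = 0; j < kVectorSize; j++) {
      acc = static_cast<std::uint8_t>(acc | static_cast<std::uint8_t>(bytes[i + j]));
    }
    if (acc & 0x80) {
      return i;  // early return, possibly less than the first negative index
    }
  }
  for (; i < len; i++) {
    if (bytes[i] < 0) {
      return i;
    }
  }
  return len;
}

// A predicate only needs to know whether the count reached len, so the
// imprecise count is sufficient for it.
inline bool has_negatives(const std::int8_t* bytes, std::size_t len) {
  return count_positives_relaxed(bytes, len) != len;
}
```

A predicate of this shape never uses the exact index, which is why it is the kind of call site where the imprecise result would cost nothing.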

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

From stuefe at openjdk.java.net  Tue Feb 15 16:00:07 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 15 Feb 2022 16:00:07 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend
In-Reply-To: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
Message-ID: <vvFsE4TIImv4hJ-CXYZ0te80ng_gWXMJsWamZz_BE-k=.dded54fb-568f-4300-b107-8d26b057363d@github.com>

On Mon, 31 Jan 2022 08:12:02 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> NMT backend can be further simplified and cleaned out.
> 
> - some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
> - `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
> - when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
> - Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
> - `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
> - In some places we handle inputptr=NULL gracefully where we should assert instead
> - Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
> - MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
> 
> Testing:
> 
> - GHAs
> - manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
> - SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.
> 
> Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.

Tested at SAP for 14 days, no problems. 

Any opinions? Should I reduce this patch, or split it into parts to make it more palatable?
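
For reviewers who want a picture of the marker simplification mentioned in the description, a minimal sketch follows. The function names and the bit layout (bucket index in the upper half) are hypothetical and chosen only for illustration; the patch itself may pack the values differently.

```cpp
#include <cstdint>

// Hypothetical packing of the two 16-bit values (bucket index and chain
// position) into one opaque 32-bit marker.
inline std::uint32_t make_marker(std::uint16_t bucket_idx,
                                 std::uint16_t pos_idx) {
  return (static_cast<std::uint32_t>(bucket_idx) << 16) | pos_idx;
}

// Only the table itself would need these accessors; all other code can
// pass the marker around without interpreting it.
inline std::uint16_t bucket_idx_from_marker(std::uint32_t marker) {
  return static_cast<std::uint16_t>(marker >> 16);
}

inline std::uint16_t pos_idx_from_marker(std::uint32_t marker) {
  return static_cast<std::uint16_t>(marker & 0xFFFFu);
}
```

The point of the design is that argument lists carry one `uint32_t` instead of a pair, and the decoding stays private to the site table.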

-------------

PR: https://git.openjdk.java.net/jdk/pull/7283

From shade at openjdk.java.net  Tue Feb 15 16:45:08 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 15 Feb 2022 16:45:08 GMT
Subject: RFR: 8281744: x86: Use short jumps in TIG::set_vtos_entry_points
In-Reply-To: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
References: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
Message-ID: <UY-8TXWghl_qilke1RBnUPgyRKRgq_6KxaOkeeNfgk8=.7e5bdd59-b3ec-4a98-930b-e3c3a5474e05@github.com>

On Mon, 14 Feb 2022 15:47:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Performance in `-Xint` mode seems to be bottlenecked on the code size, rather than particular instruction hotspots, which means code density is important.
> 
> There are forward branches in `TemplateInterpreterGenerator::set_vtos_entry_points`, which cannot be shortened by `MacroAssembler`, unless we tell it specifically that the upcoming branch target would be within the 8-bit offset. Which it apparently is in this particular case, because there are just a handful of `push`-es between the jump and its target. If a jump offset is more than 8 bits, the interpreter would catch fire just about everywhere, since `set_vtos_entry_points` is used at every bytecode entry. `fastdebug` builds assert the offset sanity directly.
> 
> Current patch improves `SPECjvm2008` performance in `-Xint` mode on Ryzen 7 5700G:
> 
> 
> compiler.compiler: +4.1%
> compiler.sunflow: +4.7%
> compress: +9.9%
> crypto.signverify: +5.2%
> scimark.fft.large: +9.5%
> scimark.fft.small: +10.1%
> serial: +7.3%
> xml.transform: +7.1%
> xml.validation: +3.3%
> 
> 
> There are other places in template interpreter where forward jumps can be short, I'll do them separately, since they are riskier and also less important.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier1`

Thanks for reviews!
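
For readers less familiar with the 8-bit offset constraint in the description, here is a small standalone C++ sketch. It is not HotSpot code and the helper names are made up; the instruction sizes are the standard x86 encodings in 32/64-bit mode (`jmp rel8` 2 bytes vs `jmp rel32` 5 bytes, `jcc rel8` 2 bytes vs `jcc rel32` 6 bytes).

```cpp
#include <cstdint>

// A short jump carries a signed 8-bit displacement, measured from the end
// of the jump instruction to the target.
constexpr bool fits_in_rel8(std::int64_t displacement) {
  return displacement >= -128 && displacement <= 127;
}

// Encoded sizes in bytes for the short (rel8) and near (rel32) forms.
constexpr int jmp_size(bool is_short) { return is_short ? 2 : 5; }
constexpr int jcc_size(bool is_short) { return is_short ? 2 : 6; }

// Bytes saved when the given numbers of unconditional and conditional
// jumps are switched from the near form to the short form.
constexpr int bytes_saved(int num_jmp, int num_jcc) {
  return num_jmp * (jmp_size(false) - jmp_size(true)) +
         num_jcc * (jcc_size(false) - jcc_size(true));
}
```

The saving per jump is only 3 or 4 bytes, but it repeats at every place the generator emits the sequence, which is consistent with the code-density argument in the description.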

-------------

PR: https://git.openjdk.java.net/jdk/pull/7463

From shade at openjdk.java.net  Tue Feb 15 16:45:08 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 15 Feb 2022 16:45:08 GMT
Subject: Integrated: 8281744: x86: Use short jumps in
 TIG::set_vtos_entry_points
In-Reply-To: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
References: <MJE8ssivXp90MMZuvfXd9RCa3uIxfb2s9ZsKYD9bfvE=.bf5677a7-937e-4124-8f1a-c69a7b750847@github.com>
Message-ID: <ZctHKnfzTCbSAFS7CUv9beLTC_KKcpBtMCGs6qd6HA0=.7fd9f9e7-3911-4fc5-9f42-77b94e6f2132@github.com>

On Mon, 14 Feb 2022 15:47:41 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Performance in `-Xint` mode seems to be bottlenecked on the code size, rather than particular instruction hotspots, which means code density is important.
> 
> There are forward branches in `TemplateInterpreterGenerator::set_vtos_entry_points`, which cannot be shortened by `MacroAssembler`, unless we tell it specifically that the upcoming branch target would be within the 8-bit offset. Which it apparently is in this particular case, because there are just a handful of `push`-es between the jump and its target. If a jump offset is more than 8 bits, the interpreter would catch fire just about everywhere, since `set_vtos_entry_points` is used at every bytecode entry. `fastdebug` builds assert the offset sanity directly.
> 
> Current patch improves `SPECjvm2008` performance in `-Xint` mode on Ryzen 7 5700G:
> 
> 
> compiler.compiler: +4.1%
> compiler.sunflow: +4.7%
> compress: +9.9%
> crypto.signverify: +5.2%
> scimark.fft.large: +9.5%
> scimark.fft.small: +10.1%
> serial: +7.3%
> xml.transform: +7.1%
> xml.validation: +3.3%
> 
> 
> There are other places in template interpreter where forward jumps can be short, I'll do them separately, since they are riskier and also less important.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug, `tier1`
>  - [x] Linux x86_32 fastdebug, `tier1`

This pull request has now been integrated.

Changeset: 18704653
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/18704653dcc76b6360b746a6a9c20d614633da0e
Stats:     5 lines in 1 file changed: 0 ins; 0 del; 5 mod

8281744: x86: Use short jumps in TIG::set_vtos_entry_points

Reviewed-by: rehn, coleenp

-------------

PR: https://git.openjdk.java.net/jdk/pull/7463

From hseigel at openjdk.java.net  Tue Feb 15 18:22:00 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 15 Feb 2022 18:22:00 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v6]
In-Reply-To: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
Message-ID: <zTOEmfFag6HAoaz_4-aJYQ3JBuKpk_u_twrHSiwwf_k=.d271f4af-b787-458c-8e04-fc6b99022f47@github.com>

> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas that generate compilation errors, when using gcc, if a native system function is called instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because that is cleaner than adding pragmas and, in some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it doesn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
> 
> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
> 
> Thanks, Harold
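
As an illustration of the "added value" a wrapper can carry, the sketch below retries an operation that fails with `EINTR`, which is the general idea behind a restartable call. It is an assumption-laden sketch, not the HotSpot macro: `retry_on_eintr` and `FlakyCall` are hypothetical names, and `FlakyCall` only simulates an interrupted system call.

```cpp
#include <cassert>
#include <cerrno>

// Calls op() until it either succeeds or fails with an error other than EINTR.
// op is expected to return -1 and set errno on failure, like a POSIX call.
template <typename Op>
auto retry_on_eintr(Op&& op) -> decltype(op()) {
  decltype(op()) result;
  do {
    result = op();
  } while (result == -1 && errno == EINTR);
  return result;
}

// Test double (hypothetical): pretends to be interrupted a given number of
// times, then reports that 5 bytes were processed.
struct FlakyCall {
  int interruptions_left;
  int calls;
  explicit FlakyCall(int n) : interruptions_left(n), calls(0) {}
  long operator()() {
    ++calls;
    if (interruptions_left > 0) {
      --interruptions_left;
      errno = EINTR;
      return -1;
    }
    return 5;
  }
};
```

Centralizing the retry in one wrapper means callers cannot forget it, which is one reason to steer code toward the wrapper rather than the raw system function.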

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  change lseek64() calls to os::lseek()

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7248/files
  - new: https://git.openjdk.java.net/jdk/pull/7248/files/d062fb50..e0abfdb4

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=04-05

  Stats: 9 lines in 1 file changed: 0 ins; 4 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7248.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248

PR: https://git.openjdk.java.net/jdk/pull/7248

From hseigel at openjdk.java.net  Tue Feb 15 18:29:14 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Tue, 15 Feb 2022 18:29:14 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v5]
In-Reply-To: <IP1CgEQUri6O7ONsG1teSc_yTE5aVlHNbsI5KEC8xk8=.3e64d0e1-795d-4697-890a-0d016d4adc66@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <QODvIzjG2BmpGqB5fpM8ycZNCHP0CqSCUcw96-NPHoo=.5936ea58-59cd-4397-92db-e8dd006e7654@github.com>
 <IP1CgEQUri6O7ONsG1teSc_yTE5aVlHNbsI5KEC8xk8=.3e64d0e1-795d-4697-890a-0d016d4adc66@github.com>
Message-ID: <2YciZFs9m7FUZHg0s3Eun8s-XFoeG2lj9-UEva1uXTw=.3a5d6d11-13ea-4c56-87aa-563c8f23ed6b@github.com>

On Mon, 14 Feb 2022 22:37:39 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   rename macro, fix semi-colon issue, fix zero lseek64 and ftruncate64 build issue
>
> src/hotspot/os/linux/os_linux.cpp line 4924:
> 
>> 4922: }
>> 4923: 
>> 4924: off64_t call_lseek64(int fd, off64_t offset, int whence) {
> 
> I think it would be better to just change the `lseek64` calls to `os::lseek` rather than introduce this wrapper function.

Thanks David.  I changed the code to call os::lseek().

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From duke at openjdk.java.net  Tue Feb 15 19:01:08 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Tue, 15 Feb 2022 19:01:08 GMT
Subject: Integrated: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero
 extended) casts
In-Reply-To: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
References: <wY-To-VJCIYtJkAgG1u5ePqJeABUxs5yx9oF4fL8_Zc=.1682c95f-3d45-460b-90d4-2d3b194617af@github.com>
Message-ID: <w3klZDCKfttVgyY0RxMoYGDO_rp7PSJpIhNolFi8Egw=.5c620e28-6bd0-4dd1-a2c9-a56cb8f6aff3@github.com>

On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used in vector lane-wise reinterpreting operations.
> 
> Thank you very much.

This pull request has now been integrated.

Changeset: 0af356bb
Author:    Quan Anh Mai <anhmdq99 at gmail.com>
Committer: Sandhya Viswanathan <sviswanathan at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/0af356bb4bfee99223d4bd4f8b0001c5f362c150
Stats:     490 lines in 19 files changed: 428 ins; 24 del; 38 mod

8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts

Reviewed-by: psandoz, sviswanathan

-------------

PR: https://git.openjdk.java.net/jdk/pull/7358

From dholmes at openjdk.java.net  Tue Feb 15 22:12:17 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 15 Feb 2022 22:12:17 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v6]
In-Reply-To: <zTOEmfFag6HAoaz_4-aJYQ3JBuKpk_u_twrHSiwwf_k=.d271f4af-b787-458c-8e04-fc6b99022f47@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <zTOEmfFag6HAoaz_4-aJYQ3JBuKpk_u_twrHSiwwf_k=.d271f4af-b787-458c-8e04-fc6b99022f47@github.com>
Message-ID: <6rDE2Bo8OEdvlJ5uIQsJL3flAH-AgCpkXk03D5hd4dQ=.8545e589-cc2e-4f23-b439-64b7a18d98ab@github.com>

On Tue, 15 Feb 2022 18:22:00 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> Please review this new attempt to resolve JDK-8214976.  This fix adds pragmas that generate compilation errors, when using gcc, if a native system function is called instead of the os:: version of the function.  The fix includes changes to calls in non-shared code because that is cleaner than adding pragmas and, in some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE.  This fix slightly changes the signature of os::abort() so it doesn't conflict with native abort() functions.  Changes to Windows code are left for a future RFE.
>> 
>> This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390.
>> 
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
> 
>   change lseek64() calls to os::lseek()

Thanks Harold,

This seems acceptable to me now. Only remaining issue is the placement issue Kim raised - see query/suggestion below.

David

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From dholmes at openjdk.java.net  Tue Feb 15 22:12:18 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 15 Feb 2022 22:12:18 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for
 portability [v6]
In-Reply-To: <ccy8hD10bNUW4vDLsI8a_EsAbgBa24ckIMIjh8RGhbs=.42ad99f3-bf97-4e68-8ca4-ee4e58e45d75@github.com>
References: <qqmkCA5bKr0ZUEvk9cZxCVUoZFQ66vDh0dZpVxsJ4Cw=.bca72004-96e1-4488-9975-e6157bb89610@github.com>
 <ccy8hD10bNUW4vDLsI8a_EsAbgBa24ckIMIjh8RGhbs=.42ad99f3-bf97-4e68-8ca4-ee4e58e45d75@github.com>
Message-ID: <LQ_Own3unosUMKkIAhmxrA4fwI0Nwg4NkBCZom6ZoPw=.bba094f9-f6e7-4217-9557-08246ddf9a5c@github.com>

On Fri, 28 Jan 2022 22:40:45 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   change lseek64() calls to os::lseek()
>
> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 109:
> 
>> 107: FORBID_C_FUNCTION(ssize_t  write(int, const void*, size_t ),          "use os::write");
>> 108: 
>> 109: FORBID_C_FUNCTION(char*    strtok(char*, const char*),                "use strtok_r");
> 
> Some of these functions are portable and ought to be forbidden in a platform agnostic location, so the restriction also applies if/when we have real support on other platforms.  I think almost none are gcc (or clang) specific, but are instead probably posix and not windows, so maybe should go in a different place as well.  Basically I think the structure / placement considerations need some more work.

Can we put the list of forbidden functions in os.hpp?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7248

From duke at openjdk.java.net  Wed Feb 16 07:59:08 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Wed, 16 Feb 2022 07:59:08 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <d9NT33YZFZeFH0JrWvYV-4wNcTQZgIEae0REudAdUBU=.8ac42dd7-b7c5-423f-98b7-af904f0add83@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <d9NT33YZFZeFH0JrWvYV-4wNcTQZgIEae0REudAdUBU=.8ac42dd7-b7c5-423f-98b7-af904f0add83@github.com>
Message-ID: <jLrdxrGzCmawzZeE_KZUgVL79Qr6Qgg4hedN2tbNTws=.c7c51eaa-b27c-406f-ac0c-63ff87b25c9d@github.com>

On Thu, 10 Feb 2022 19:05:41 GMT, Markus Grönlund <mgronlun at openjdk.org> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
>> I added an argument to JfrJavaSupport::abort() that tells os::abort() not to write a core file,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> src/hotspot/share/jfr/jni/jfrJavaSupport.hpp line 103:
> 
>> 101: 
>> 102:   // critical
>> 103:   static void abort(jstring errorMsg, TRAPS, bool dump_core=true);
> 
> Not sure this is necessary. The existing core dump logic already handles the case where a core file cannot be generated due to disk full.

Thank you for your review.  

Whether or not HotSpot generates a core file is determined by the argument of vm_abort(bool dump_core).  If the argument is "true", vm_abort(bool dump_core) calls os::abort(bool dump_core) to generate a core file.
See the following code:
https://github.com/openjdk/jdk/blob/3c160ab5bec0c2364ec3f43c5a5789098d4699e5/src/hotspot/share/runtime/java.cpp#L625

I think JfrJavaSupport::abort() should pass "false" as an argument to vm_abort(bool dump_core).
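
The proposal can be sketched as a defaulted flag that is forwarded down the abort path. This is only an illustration under stated assumptions: `fake_vm_abort`, `fake_jfr_abort`, `on_disk_full`, and `AbortRequest` are hypothetical stand-ins, and instead of terminating the process they record what was requested.

```cpp
#include <cassert>
#include <string>

// Records what the stand-in abort path was asked to do, so the effect of the
// flag can be observed without terminating the process.
struct AbortRequest {
  bool requested = false;
  bool dump_core = false;
  std::string message;
};

inline AbortRequest& last_abort_request() {
  static AbortRequest request;
  return request;
}

// Hypothetical stand-in for vm_abort(bool dump_core): the real function ends
// the VM; this one only records the flag it received.
inline void fake_vm_abort(bool dump_core) {
  last_abort_request().requested = true;
  last_abort_request().dump_core = dump_core;
}

// Hypothetical stand-in for an abort entry point with a defaulted flag, so
// existing callers keep requesting a core dump.
inline void fake_jfr_abort(const std::string& message, bool dump_core = true) {
  last_abort_request().message = message;
  fake_vm_abort(dump_core);
}

// A caller that knows the device is full opts out of the core dump.
inline void on_disk_full() {
  fake_jfr_abort("no space left on device", /*dump_core=*/false);
}
```

The default value keeps every existing call site unchanged, while the disk-full path states its intent explicitly.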

> test/hotspot/jtreg/runtime/jfr/TestJFRDiskFull.java line 127:
> 
>> 125:         raf.close();
>> 126:     }
>> 127: }
> 
> I appreciate the effort, but we can't have a test that intentionally provokes a disk full situation. Instead, the updated error message will have to be manually verified.

I use `@run main/manual` in TestJFRDiskFull.java. I think this label marks the test as manual.  
I manually confirmed that this test passes with jtreg after this fix.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From shade at openjdk.java.net  Wed Feb 16 08:04:15 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 16 Feb 2022 08:04:15 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <_CQho3N-3xjNk1Pm-KRzhR9q0ZEhGnOQYppdFmc0EWg=.4dc9823d-994a-4a93-88a7-777579daff1c@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

The GHA failure on x86_32 is new and unrelated: https://bugs.openjdk.java.net/browse/JDK-8281822

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From mgronlun at openjdk.java.net  Wed Feb 16 10:28:12 2022
From: mgronlun at openjdk.java.net (Markus Grönlund)
Date: Wed, 16 Feb 2022 10:28:12 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <jLrdxrGzCmawzZeE_KZUgVL79Qr6Qgg4hedN2tbNTws=.c7c51eaa-b27c-406f-ac0c-63ff87b25c9d@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <d9NT33YZFZeFH0JrWvYV-4wNcTQZgIEae0REudAdUBU=.8ac42dd7-b7c5-423f-98b7-af904f0add83@github.com>
 <jLrdxrGzCmawzZeE_KZUgVL79Qr6Qgg4hedN2tbNTws=.c7c51eaa-b27c-406f-ac0c-63ff87b25c9d@github.com>
Message-ID: <eS1AxRdPZtAenOJMx0ElC5ST79-a4i2_VgqZtkIgWJw=.dbeb41bc-3ca3-4cf4-94b4-f207a34ba392@github.com>

On Wed, 16 Feb 2022 07:54:26 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

>> src/hotspot/share/jfr/jni/jfrJavaSupport.hpp line 103:
>> 
>>> 101: 
>>> 102:   // critical
>>> 103:   static void abort(jstring errorMsg, TRAPS, bool dump_core=true);
>> 
>> Not sure this is necessary. The existing core dump logic already handles the case where a core file cannot be generated due to disk full.
>
> Thank you for your review.  
> 
> Whether or not HotSpot generates a core file is determined by the argument of vm_abort(bool dump_core).  If the argument is "true", vm_abort(bool dump_core) calls os::abort(bool dump_core) to generate a core file.
> See the following code:
> https://github.com/openjdk/jdk/blob/3c160ab5bec0c2364ec3f43c5a5789098d4699e5/src/hotspot/share/runtime/java.cpp#L625
> 
> I think JfrJavaSupport::abort() should pass "false" as an argument to vm_abort(bool dump_core).

Ok. My point was that the os won't be able to create a core file if there is no available space.

But this is indeed more succinct, if we don't want to create a core categorically from this location.

>> test/hotspot/jtreg/runtime/jfr/TestJFRDiskFull.java line 127:
>> 
>>> 125:         raf.close();
>>> 126:     }
>>> 127: }
>> 
>> I appreciate the effort, but we can't have a test that intentionally provokes a disk full situation. Instead, the updated error message will have to be manually verified.
>
> I use `@run main/manual` in TestJFRDiskFull.java. I think this label marks the test as manual.  
> I manually confirmed that this test passes with jtreg after this fix.

My apologies, I missed the @run main/manual decoration. I don't think we have any JFR tests that use it.

If you can ensure this test is excluded for automatic runs, then perhaps...but then I don't know who will get to run it, so the value of the test is questionable.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From jbhateja at openjdk.java.net  Wed Feb 16 11:05:07 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 16 Feb 2022 11:05:07 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v4]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <1dqqh2KXNKFtAUuCMwvI9mLA0jFw--Bqz-AEfrxq_NM=.1b9f677e-3798-4877-9b58-8afdc8ed64ac@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH micro included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Replacing by efficient instruction sequence based on MXCSR.RC mode.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/2dc364fa..1c9ff777

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=02-03

  Stats: 143 lines in 4 files changed: 4 ins; 82 del; 57 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Wed Feb 16 12:30:27 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 16 Feb 2022 12:30:27 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v5]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <-NfiIwcnrf7TRNxA9x1d9itPvKYgeCYogpjSZgGYtvc=.15346702-2db7-4295-8e5a-a4864f3bbdbd@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH micro included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:

 - 8279508: Adding few descriptive comments.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Replacing by efficient instruction sequence based on MXCSR.RC mode.
 - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Adding a test for scalar intrinsification.
 - 8279508: Auto-vectorize Math.round API

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7094/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=04
  Stats: 739 lines in 23 files changed: 648 ins; 29 del; 62 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Wed Feb 16 12:30:28 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 16 Feb 2022 12:30:28 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <UMm_6uonBzgdEoJzE3zbiZ2MTeBRh8FWEgR1wjk-SMI=.ecc686f6-f233-4335-a810-89f540992f93@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
 <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
 <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>
 <EKNLSOdo1lrE-PbeZZ9YC9LonG9FhmDitLo_Wa60vYk=.c4324df4-3b7d-4b12-939e-e74de8e6ae76@github.com>
 <UMm_6uonBzgdEoJzE3zbiZ2MTeBRh8FWEgR1wjk-SMI=.ecc686f6-f233-4335-a810-89f540992f93@github.com>
Message-ID: <CMP5v8fC8SRtISzzTncUyG7OU58s0kdrEIQZaX_-oUw=.88fdebd3-5f12-47fd-a5c0-7690175c1f93@github.com>

On Mon, 14 Feb 2022 17:14:10 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> That pseudocode would make a very useful comment too. This whole patch is very thinly commented.
>
>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.
>> 
>> Hi @merykitty , You are correct, we can embed RC mode in instruction encoding of round instruction (towards -inf,+inf, zero). But to match the semantics of Math.round API one needs to add 0.5[f] to input value and then perform rounding over resultant value, which is why @sviswa7 suggested to use a global rounding mode driven by MXCSR.RC so that intermediate floating inexact values are resolved as desired, but OOO execution may misplace LDMXCSR and hence may have undesired side effects.
> 
> **Just want to correct above statement, LDMXCSR will not be re-ordered/re-scheduled early OOO backend.**

> That pseudocode would make a very useful comment too. This whole patch is very thinly commented.

I have replaced the earlier bulky sequence. The new sequence has similar performance, but the reduction in code may improve inlining behavior.  I also added descriptive comments around the special cases.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Wed Feb 16 12:40:10 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 16 Feb 2022 12:40:10 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v3]
In-Reply-To: <CMP5v8fC8SRtISzzTncUyG7OU58s0kdrEIQZaX_-oUw=.88fdebd3-5f12-47fd-a5c0-7690175c1f93@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <O1e2e74ohmj0q0nxd1YuInGsZWrlDpXGetUqwXRZES0=.eb3c1352-d840-4d05-ad22-b68a4da187db@github.com>
 <aQmrSiY4J2-diiRGJKRM26RnaCAF93rsaoyvzQyVOSM=.852e8450-c25b-4e1a-b3e1-4f71a1e16977@github.com>
 <j-EwZ27qdjOya-YG0gRFgQ-ekCoEEr5A10YqHtGOh1k=.83e47518-ee9b-44d1-8652-e5f84c59d539@github.com>
 <iCaaelBgPdReusZLD-8eM-XDbw_xWsV5mv1vq4umRcg=.58de564c-df8f-4d20-96fd-e64389421cc0@github.com>
 <EKNLSOdo1lrE-PbeZZ9YC9LonG9FhmDitLo_Wa60vYk=.c4324df4-3b7d-4b12-939e-e74de8e6ae76@github.com>
 <UMm_6uonBzgdEoJzE3zbiZ2MTeBRh8FWEgR1wjk-SMI=.ecc686f6-f233-4335-a810-89f540992f93@github.com>
 <CMP5v8fC8SRtISzzTncUyG7OU58s0kdrEIQZaX_-oUw=.88fdebd3-5f12-47fd-a5c0-7690175c1f93@github.com>
Message-ID: <UdQYwLX0ky_klc-PLJxzAAyEeFLeYVJYHH9BRD5exR8=.5e45de8c-46a1-4f79-8c05-172a7257d1e4@github.com>

On Wed, 16 Feb 2022 12:26:45 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in the evex prefix, removing the need to rely on global MXCSR register. Thanks.
>>> 
>>> Hi @merykitty , You are correct, we can embed RC mode in instruction encoding of round instruction (towards -inf,+inf, zero). But to match the semantics of Math.round API one needs to add 0.5[f] to input value and then perform rounding over resultant value, which is why @sviswa7 suggested to use a global rounding mode driven by MXCSR.RC so that intermediate floating inexact values are resolved as desired, but OOO execution may misplace LDMXCSR and hence may have undesired side effects.
>> 
>> **Just want to correct above statement, LDMXCSR will not be re-ordered/re-scheduled early OOO backend.**
>
>> That pseudocode would make a very useful comment too. This whole patch is very thinly commented.
> 
> I have replaced the earlier bulky sequence. The new sequence has similar performance, but the reduction in code may improve inlining behavior.  I also added descriptive comments around the special cases.

> There are already `RoundFloat`, `RoundDouble`, and `RoundDoubleMode` nodes defined.
> 
> Though `RoundFloat` and `RoundDouble` are legacy nodes used only on x86-32, `RoundDoubleMode` supports multiple rounding modes and is amenable to auto-vectorization.
> 
> What do you think about the following alternative?
> 
> Reuse `RoundDoubleMode` (with a new rounding mode) and introduce `RoundFloatMode`.
> 
> Special rounding rules are not the only peculiarity of `Math.round()`. It also converts the result to an integral type. It can be represented as `ConvF2I (RoundFloatMode f #rmode)` / `ConvD2L (RoundDoubleMode d #rmode)`. In the scalar case, it can be matched as a single AD instruction.
> 
> Auto-vectorizer can then convert it to `VectorCastF2X (RoundFloatModeV vf #rmode)` / `VectorCastD2X (RoundDoubleModeV vd #rmode)` and match it in a similar manner.

Adding a new rounding mode to RoundDoubleMode may disturb other targets. The match_rule_supported routine operates over opcodes, and currently any target supporting RoundDoubleMode generates code for all the rounding modes. Your solution is in any case based on creating new scalar and vector IR nodes for the floating-point rounding operation, which is what the patch currently does.
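
For readers following the rounding discussion, the sketch below shows why the rounding mode used for the intermediate `x + 0.5` matters. It is a standalone C++ illustration, not the instruction sequence from the patch, and the function names are hypothetical. It assumes the platform defines `FE_DOWNWARD`, and it ignores NaN, overflow, and the final conversion to an integral type. `Math.round` is specified to return the closest integer with ties rounding toward positive infinity, so the largest double below 0.5 must round to 0.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Adds 0.5 under the currently active rounding mode. The volatile locals keep
// the compiler from folding the addition at compile time.
inline double add_half(double x) {
  volatile double a = x;
  volatile double half = 0.5;
  volatile double sum = a + half;
  return sum;
}

// floor(x + 0.5) with the addition performed in the current (normally
// round-to-nearest) mode. The inexact sum can round up to the next integer.
inline double round_with_nearest_add(double x) {
  return std::floor(add_half(x));
}

// floor(x + 0.5) with the addition rounded toward negative infinity, so an
// inexact sum never reaches the next integer. Restores the previous mode.
inline double round_with_downward_add(double x) {
  const int saved = std::fegetround();
  std::fesetround(FE_DOWNWARD);
  const double sum = add_half(x);
  std::fesetround(saved);
  return std::floor(sum);
}
```

With the input 0.49999999999999994 (the largest double below 0.5), the exact sum lies halfway between two representable values. Round-to-nearest resolves the tie to 1.0, whereas rounding downward keeps the sum below 1.0, so only the second variant yields 0.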

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From dholmes at openjdk.java.net  Wed Feb 16 12:45:05 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 16 Feb 2022 12:45:05 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <bdbrAk9GYsRE9KThVyj7LScuMJdEq3W5ot6J78guhEU=.78fa0bf3-9cfe-44f0-bbb0-e6585faec0ba@github.com>

On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
> I added an argument to JfrJavaSupport::abort() that tells os::abort() not to write a core file,
> because there is no space left on the device.
> Could you please review the fix?

test/hotspot/jtreg/runtime/jfr/TestJFRDiskFull.java line 28:

> 26:  * @test
> 27:  * @bug 8280684
> 28:  * @summary JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

typo: failes

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From dholmes at openjdk.java.net  Wed Feb 16 12:50:09 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 16 Feb 2022 12:50:09 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <eS1AxRdPZtAenOJMx0ElC5ST79-a4i2_VgqZtkIgWJw=.dbeb41bc-3ca3-4cf4-94b4-f207a34ba392@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <d9NT33YZFZeFH0JrWvYV-4wNcTQZgIEae0REudAdUBU=.8ac42dd7-b7c5-423f-98b7-af904f0add83@github.com>
 <jLrdxrGzCmawzZeE_KZUgVL79Qr6Qgg4hedN2tbNTws=.c7c51eaa-b27c-406f-ac0c-63ff87b25c9d@github.com>
 <eS1AxRdPZtAenOJMx0ElC5ST79-a4i2_VgqZtkIgWJw=.dbeb41bc-3ca3-4cf4-94b4-f207a34ba392@github.com>
Message-ID: <5bed6fhWOY90Tpy4sFkixqBN96vvbc9qrN1xOLRzeqI=.1f707da6-6c64-4b8f-9e75-4121199156de@github.com>

On Wed, 16 Feb 2022 10:17:00 GMT, Markus Grönlund <mgronlun at openjdk.org> wrote:

>> Thank you for your review.  
>> 
>> Whether or not HotSpot generates a core file is determined by the argument of vm_abort(bool dump_core).  If the argument is "true", vm_abort(bool dump_core) calls os::abort(bool dump_core) to generate a core file.
>> See the following code:
>> https://github.com/openjdk/jdk/blob/3c160ab5bec0c2364ec3f43c5a5789098d4699e5/src/hotspot/share/runtime/java.cpp#L625
>> 
>> I think JfrJavaSupport::abort() should pass "false" as an argument to vm_abort(bool dump_core).
>
> Ok. My point was that the os won't be able to create a core file if there is no available space.
> 
> But this is indeed more succinct, if we don't want to create a core categorically from this location.

Just an observation but the filesystem that is full, and the filesystem where a core would be written, need not be the same file system. That said, a core dump in this case seems unwarranted.

>> I use `@run main/manual` in TestJFRDiskFull.java. I think this label marks the test as manual.  
>> I manually confirmed that this test passes with jtreg after this fix.
>
> My apologies, I missed the @run main/manual decoration. I don't think we have any JFR tests that use it.
> 
> If you can ensure this test is excluded for automatic runs, then perhaps...but then I don't know who will get to run it, so the value of the test is questionable.

Manual tests are excluded if the jtreg test run specifies to run automatic tests only (as we do in our CI). So this really only serves as a validation of the fix, with no real expectation that anyone will necessarily ever run it again. Even as a locally run test, filling the disk can easily lead to unexpected problems for other processes - including the swap/paging file on Windows - so this is a risky test to run.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From dholmes at openjdk.java.net  Wed Feb 16 12:55:05 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 16 Feb 2022 12:55:05 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <bdbrAk9GYsRE9KThVyj7LScuMJdEq3W5ot6J78guhEU=.78fa0bf3-9cfe-44f0-bbb0-e6585faec0ba@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <bdbrAk9GYsRE9KThVyj7LScuMJdEq3W5ot6J78guhEU=.78fa0bf3-9cfe-44f0-bbb0-e6585faec0ba@github.com>
Message-ID: <h9abc6YIeYhWsJbkiPTfXgFLJFWKIgbpNJWDN1ywsog=.db53d68c-25df-4328-a198-d58818b4f99f@github.com>

On Wed, 16 Feb 2022 12:41:52 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort().
>> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> test/hotspot/jtreg/runtime/jfr/TestJFRDiskFull.java line 28:
> 
>> 26:  * @test
>> 27:  * @bug 8280684
>> 28:  * @summary JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.
> 
> typo: failes

Actually, "summary" is meant to describe what the test does, not what the original bug was about.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From mgronlun at openjdk.java.net  Wed Feb 16 13:08:11 2022
From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=)
Date: Wed, 16 Feb 2022 13:08:11 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <w5AV_dajiN50m5h9LIen5k7IEqvmC1oLpfbvh72fkGs=.0eba3319-5ce4-4923-905a-cafad1059bb2@github.com>

On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort().
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
> because there is no space left on the device.
> Could you please review the fix?

Takuya, can I suggest keeping your proposed changes but excluding the test?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From ayang at openjdk.java.net  Wed Feb 16 15:16:22 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Wed, 16 Feb 2022 15:16:22 GMT
Subject: RFR: 8281971: Remove unimplemented InstanceRefKlass::do_next
Message-ID: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>

Trivial change of removing dead code.

Test: build

-------------

Commit messages:
 - trivial

Changes: https://git.openjdk.java.net/jdk/pull/7497/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7497&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281971
  Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7497.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7497/head:pull/7497

PR: https://git.openjdk.java.net/jdk/pull/7497

From lkorinth at openjdk.java.net  Wed Feb 16 17:05:13 2022
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Wed, 16 Feb 2022 17:05:13 GMT
Subject: RFR: 8269537: memset() is called after operator new [v4]
In-Reply-To: <OLUjj1GGh7kLi5tWiI_keOCWU-VgxtjesyoWHDtifSs=.213f71a6-c4b5-4107-b64f-b59efbfe7309@github.com>
References: <fe0PJHcDQ9Gax0idlyvpt0IQxhDwReN27jV9en3F1Uo=.6853eda5-c0fe-4664-b231-8e8922fa3713@github.com>
 <OLUjj1GGh7kLi5tWiI_keOCWU-VgxtjesyoWHDtifSs=.213f71a6-c4b5-4107-b64f-b59efbfe7309@github.com>
Message-ID: <X9YSw7jron6P_xzPq-unha6VUi3sP4-zJDcKEtOmK6U=.61735a5d-7dc6-420b-8ee3-93a470bed2b3@github.com>

On Wed, 20 Oct 2021 09:36:38 GMT, Leo Korinth <lkorinth at openjdk.org> wrote:

>> The basic problem is that we are relying on undefined behaviour, as documented in the code:
>> 
>> // This whole business of passing information from ResourceObj::operator new
>> // to the ResourceObj constructor via fields in the "object" is technically UB.
>> // But it seems to work within the limitations of HotSpot usage (such as no
>> // multiple inheritance) with the compilers and compiler options we're using.
>> // And it gives some possibly useful checking for misuse of ResourceObj.
>> 
>> 
>> I am removing the undefined behaviour by passing the type of allocation through a thread local variable.
>> 
>> This solution has some advantages:
>> 1) it is not UB
>> 2) it is simpler and easier to understand
>> 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8)
>> 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work. 
>> 
>> When doing the change, I also updated  `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret.
>> 
>> I also disallow "faking" the memory type by explicitly calling `ResourceObj::set_allocation_type`.
>> 
>> This forced me to change two places that were faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand, as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because of the old naming of `allocated_on_stack()`.
>> 
>> I have also tried to update the comments. In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete objects you allocate with new.
>> 
>> Testing on debug build tier1-3
>> Testing on release build tier1
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review updates

This comment will keep this pull request alive a bit longer.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5387

From iklam at openjdk.java.net  Wed Feb 16 17:49:11 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Wed, 16 Feb 2022 17:49:11 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v3]
In-Reply-To: <UPPmzDd-44TljmPhCVt1GOjpWFgmFTQlEcSSwE-TVWI=.2f41603b-e4ac-4e97-b24f-202647212cc1@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
 <pOfPrnVSbe4SOyUeCaTXl-2dzck0EeSCa1fuarocajo=.22176dfa-d283-4891-a91d-48aae98fce09@github.com>
 <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com>
 <UPPmzDd-44TljmPhCVt1GOjpWFgmFTQlEcSSwE-TVWI=.2f41603b-e4ac-4e97-b24f-202647212cc1@github.com>
Message-ID: <viyqFj7s_69skArcflJIidFBAi0_-nYNt-aQr3EVow8=.786d6391-7827-457a-b834-4a34b48ed25a@github.com>

On Wed, 19 Jan 2022 05:50:50 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> I don't really know this code well enough to do a good code review.  I had some comments though.
>
>> I don't really know this code well enough to do a good code review. I had some comments though.
> 
> Hi Coleen, thanks for taking a look.
> 
> This PR has two major parts:
> 
> 1. Check for inappropriate reference to static fields. This is mainly done in cdsHeapVerifier.cpp. These checks don't affect the contents of the CDS archive. They just print out warnings if problems are found.
> 2. Special initialization of enum classes. Essentially if any instance of an enum class `X` is archived, then `X::<clinit>` will not be executed, and we'll take this path instead (in instanceKlass.cpp):
> 
> 
>   // This is needed to ensure the consistency of the archived heap objects.
>   if (has_archived_enum_objs()) {
>     assert(is_shared(), "must be");
>     bool initialized = HeapShared::initialize_enum_klass(this, CHECK);
>     if (initialized) {
>       return;
>     }
>   }
> 
> Could you check if (2) is correct?

> @iklam This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

keepalive

-------------

PR: https://git.openjdk.java.net/jdk/pull/6653

From dholmes at openjdk.java.net  Wed Feb 16 21:32:02 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 16 Feb 2022 21:32:02 GMT
Subject: RFR: 8281971: Remove unimplemented InstanceRefKlass::do_next
In-Reply-To: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
References: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
Message-ID: <RjoIu4NXLTzxkkojKDN4Js-ynHkAXOYJ-OCH4IDVqbY=.eac50947-b3a5-46bd-80de-3cce725cadd4@github.com>

On Wed, 16 Feb 2022 15:10:26 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> Trivial change of removing dead code.
> 
> Test: build

Looks good and trivial.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7497

From joe.darcy at oracle.com  Wed Feb 16 22:20:20 2022
From: joe.darcy at oracle.com (Joseph D. Darcy)
Date: Wed, 16 Feb 2022 14:20:20 -0800
Subject: RFR: 8279508: Auto-vectorize Math.round API [v2]
In-Reply-To: <SKEroM6QsoBV4Btj6kAemSCqqRfHT4mm33Avdy1L8l4=.fcd38193-3821-4573-8bca-300e22f875fe@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <LQMZEAy-QU55kNt5fwFQSI8JPGuYz-nRWhuWVkKMt5c=.e5245c2f-c111-4c3d-829c-db44bca43e47@github.com>
 <2TVKx_BFFyAK2ooOWKpdsEIMFzJngYxlWjbgeZ2y4Mc=.5deb2173-8107-476d-92ca-1835d69ce336@github.com>
 <SKEroM6QsoBV4Btj6kAemSCqqRfHT4mm33Avdy1L8l4=.fcd38193-3821-4573-8bca-300e22f875fe@github.com>
Message-ID: <6e3a21d8-fc16-24b3-ead1-fefb52db9684@oracle.com>


On 2/12/2022 6:55 PM, Jatin Bhateja wrote:
> On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>
>> The JVM currently initializes the x86 mxcsr to round to nearest even, see below in stubGenerator_x86_64.cpp: // Round to nearest (even), 64-bit mode, exceptions masked StubRoutines::x86::_mxcsr_std = 0x1F80; The above works for Math.rint which is specified to be round to nearest even. Please see: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html : section 4.8.4
>>
>> The rounding mode needed for Math.round is round to positive infinity which needs a different x86 mxcsr initialization(0x5F80).
> Hi @sviswa7 ,
> As per JLS 17 section 15.4 Java follows round to nearest rounding policy for all floating point operations except conversion to integer and remainder where it uses round toward zero.


That is a true background condition, but I will note that the Math.round 
method does independently define the semantics of its operation and 
rounding behavior, which has changed (slightly) over the lifetime of the 
platform.

-Joe


From jiefu at openjdk.java.net  Wed Feb 16 23:27:12 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Wed, 16 Feb 2022 23:27:12 GMT
Subject: RFR: 8281467: Allow larger OptoLoopAlignment and
 CodeEntryAlignment
In-Reply-To: <FpLmc5wRRJAVlc-R3nyyx7_ZX6JXTBZ2LYRzkbDza4U=.2e2aa58f-6263-4ef5-a0e5-b617fd78cae3@github.com>
References: <q8nxT7Ey103QPoyyjIhtkBeMG0Hlw4NP9w4DZ1uL5QU=.3737be56-30fd-43d8-9b85-fc7b591cc444@github.com>
 <FpLmc5wRRJAVlc-R3nyyx7_ZX6JXTBZ2LYRzkbDza4U=.2e2aa58f-6263-4ef5-a0e5-b617fd78cae3@github.com>
Message-ID: <9JilFxfj4hSQbkBrNAgpkhjPiGyYFGn0dCR_9SypWKg=.32cee859-7533-4109-871b-44f20d1a89e3@github.com>

On Tue, 15 Feb 2022 06:17:57 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> I am following up on the performance issue where the culprit seems to be the too low `OptoLoopAlignment`. To perform better experiments, I suggest allowing larger alignments.
>> 
>> Note that we cannot make `OptoLoopAlignment` larger than `CodeEntryAlignment`, because nmethod copy would break it, see assert in `MacroAssembler::align`. See [JDK-8273459](https://bugs.openjdk.java.net/browse/JDK-8273459) for latest discussion about it. So `CodeEntryAlignment` needs to be configurable as well.
>> 
>> The default values for options are different per platform, so tests are x86_64 specific.
>> 
>> No default value is changed, this only unblocks experiments.
>> 
>> Additional testing:
>>  - [x] New tests on Linux x86_64 fastdebug
>>  - [x] New tests on Linux x86_64 release
>
> Thank you!

Hi @shipilev ,

compiler/arguments/TestCodeEntryAlignment.java fails on AVX512 machines.
Please have a look: https://github.com/openjdk/jdk/pull/7485
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7388

From jbhateja at openjdk.java.net  Thu Feb 17 03:44:02 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 17 Feb 2022 03:44:02 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v5]
In-Reply-To: <-NfiIwcnrf7TRNxA9x1d9itPvKYgeCYogpjSZgGYtvc=.15346702-2db7-4295-8e5a-a4864f3bbdbd@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <-NfiIwcnrf7TRNxA9x1d9itPvKYgeCYogpjSZgGYtvc=.15346702-2db7-4295-8e5a-a4864f3bbdbd@github.com>
Message-ID: <U34qdWMWGuoYbavOIDlL5aJQhEnWyp8PKTcMPJ-bGmY=.6e12fd50-0b81-4b22-8e65-ccc065bfe6ee@github.com>

On Wed, 16 Feb 2022 12:30:27 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance numbers of a JMH micro included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 584.99 | 1870.70 | 3.20 | 510.35 | 548.60 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 257.17 | 965.33 | 3.75 | 293.60 | 273.15 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.69 | 3592.54 | 4.35 | 825.32 | 1836.42 | 2.23
>> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | 412.31 | 945.82 | 2.29
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
> 
>  - 8279508: Adding few descriptive comments.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Replacing by efficient instruction sequence based on MXCSR.RC mode.
>  - 8279508: Adding vectorized algorithms to match the semantics of rounding operations.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Adding a test for scalar intrinsification.
>  - 8279508: Auto-vectorize Math.round API

> _Mailing list message from [Joseph D. Darcy](mailto:joe.darcy at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_
> 
> On 2/12/2022 6:55 PM, Jatin Bhateja wrote:
> 
> > On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
> > > The JVM currently initializes the x86 mxcsr to round to nearest even, see below in stubGenerator_x86_64.cpp: // Round to nearest (even), 64-bit mode, exceptions masked StubRoutines::x86::_mxcsr_std = 0x1F80; The above works for Math.rint which is specified to be round to nearest even. Please see: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html : section 4.8.4
> > > The rounding mode needed for Math.round is round to positive infinity which needs a different x86 mxcsr initialization(0x5F80).
> > Hi @sviswa7 ,
> > As per JLS 17 section 15.4 Java follows round to nearest rounding policy for all floating point operations except conversion to integer and remainder where it uses round toward zero.
> 
> That is a true background condition, but I will note that the Math.round method does independently define the semantics of its operation and rounding behavior, which has changed (slightly) over the lifetime of the platform.
> 
> -Joe

Hi @jddarcy , thanks for your comments. The patch has been updated to follow the prescribed semantics of the Math.round API.
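
To make the semantic difference discussed above concrete, here is a minimal sketch in plain Java (not taken from the patch). It contrasts `Math.round` (nearest, ties toward positive infinity), `Math.rint` (nearest, ties to even) and a cast (toward zero). The class name `RoundSemantics` and the helper `roundHalfUp` are made up for illustration, and the helper is only a reference model for finite inputs well inside the `long` range.

```java
public class RoundSemantics {
    // Hypothetical reference model: round to nearest, ties toward positive
    // infinity. Valid only for finite inputs far inside the long range, where
    // x - floor(x) is computed exactly.
    static long roundHalfUp(double x) {
        double floor = Math.floor(x);
        double fraction = x - floor;
        return (long) (fraction >= 0.5 ? floor + 1.0 : floor);
    }

    public static void main(String[] args) {
        double[] samples = {2.5, -2.5, 3.5, -0.5, 2.9, -2.9};
        for (double x : samples) {
            System.out.println(x
                + " round=" + Math.round(x)   // ties go toward positive infinity
                + " rint=" + Math.rint(x)     // ties go to the even neighbour
                + " cast=" + (long) x         // truncates toward zero
                + " model=" + roundHalfUp(x));
        }
    }
}
```

The ties are where the three operations disagree: for 2.5, `Math.round` gives 3 while `Math.rint` gives 2.0, and for -2.5 `Math.round` gives -2 while `Math.rint` gives -2.0. A vectorized implementation therefore cannot simply reuse the default round-to-nearest-even mode.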

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Thu Feb 17 08:00:28 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Thu, 17 Feb 2022 08:00:28 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for Tests
 jdk/incubator/vector/
Message-ID: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>

`ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake do support AVX512 BW.

Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available; otherwise it uses `kmovwl`.
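
The dispatch described above can be modelled with a small Java sketch. This is not the HotSpot code: the class `OpmaskMoveSelection`, the enum and both methods are hypothetical names, and the only facts it encodes are that the 64-bit opmask move needs AVX512 BW while the 16-bit one is available with baseline AVX512.

```java
public class OpmaskMoveSelection {
    // Hypothetical labels for the two instructions named in the description.
    enum Instruction { KMOVQL, KMOVWL }

    // Models the choice attributed to MacroAssembler::kmov: use the 64-bit
    // form only when the CPU reports AVX512 BW, else fall back to 16 bits.
    static Instruction selectKmov(boolean supportsAvx512bw) {
        return supportsAvx512bw ? Instruction.KMOVQL : Instruction.KMOVWL;
    }

    // Number of opmask bits each form transfers.
    static int opmaskBitsMoved(Instruction insn) {
        return insn == Instruction.KMOVQL ? 64 : 16;
    }

    public static void main(String[] args) {
        // Knights Landing style CPU: AVX512 without the BW subset.
        System.out.println("no BW  -> " + selectKmov(false));
        // CPU with AVX512 BW.
        System.out.println("has BW -> " + selectKmov(true));
    }
}
```

Emitting the 64-bit form unconditionally corresponds to the `true` branch being taken regardless of the CPU, which is the situation the assert in the bug title reports.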

As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with `sde -knl` (Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it) and `sde -cnl` (Cannon Lake, has AVX512 BW, passes before and after code change). Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.

@neliasso Thanks for the help!

-------------

Commit messages:
 - 8281544: assert(VM_Version::supports_avx512bw()) failed for Tests jdk/incubator/vector/

Changes: https://git.openjdk.java.net/jdk/pull/7510/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7510&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281544
  Stats: 14 lines in 2 files changed: 12 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7510.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7510/head:pull/7510

PR: https://git.openjdk.java.net/jdk/pull/7510

From duke at openjdk.java.net  Thu Feb 17 08:03:36 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Thu, 17 Feb 2022 08:03:36 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v2]
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <qmoxoovG_EkvqJcqcWuYumDABGKjt7toZU6efFb0Ul4=.f62f2920-3721-4749-9b55-5f28264c09f4@github.com>

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort().
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
> because there is no space left on the device.
> Could you please review the fix?

KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:

  8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7227/files
  - new: https://git.openjdk.java.net/jdk/pull/7227/files/3c160ab5..c2ad1c39

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=00-01

  Stats: 127 lines in 1 file changed: 0 ins; 127 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7227.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7227/head:pull/7227

PR: https://git.openjdk.java.net/jdk/pull/7227

From duke at openjdk.java.net  Thu Feb 17 08:03:38 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Thu, 17 Feb 2022 08:03:38 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device.
In-Reply-To: <w5AV_dajiN50m5h9LIen5k7IEqvmC1oLpfbvh72fkGs=.0eba3319-5ce4-4923-905a-cafad1059bb2@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <w5AV_dajiN50m5h9LIen5k7IEqvmC1oLpfbvh72fkGs=.0eba3319-5ce4-4923-905a-cafad1059bb2@github.com>
Message-ID: <-YSoVkfzQH5LQTzJgUqPKIEuicdyUCrugxQfSb6WVL0=.61724f11-e25f-4f49-a6c4-6a76f4f539e4@github.com>

On Wed, 16 Feb 2022 13:04:36 GMT, Markus Grönlund <mgronlun at openjdk.org> wrote:

> Takuya, can I suggest keeping your proposed changes but excluding the test?

OK. This test is certainly risky. I will remove it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From rrich at openjdk.java.net  Thu Feb 17 08:33:05 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 17 Feb 2022 08:33:05 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <-5PaPJvfgixjpHechVa5bra26EWo4n-hvDnYlM4jXO8=.f709cab3-c666-42f2-bfda-c4caa2709755@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

On s390, being CISC too, there are similar issues. We addressed them with `NearLabel`, `branch_optimized`, and `compare_and_branch_optimized`. They provide a higher level of abstraction which helps writing better code without knowing all the details, which at least I instantly forget after looking into the manual.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From shade at openjdk.java.net  Thu Feb 17 08:38:08 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 17 Feb 2022 08:38:08 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <-5PaPJvfgixjpHechVa5bra26EWo4n-hvDnYlM4jXO8=.f709cab3-c666-42f2-bfda-c4caa2709755@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
 <-5PaPJvfgixjpHechVa5bra26EWo4n-hvDnYlM4jXO8=.f709cab3-c666-42f2-bfda-c4caa2709755@github.com>
Message-ID: <OLhx7oi6PXMQDQOqvvnoHWY47118q8pqCEx-7YEjHfg=.6cbd8ebf-1792-4059-897a-6b4daaf108e8@github.com>

On Thu, 17 Feb 2022 08:30:04 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

> On s390, being CISC too, there are similar issues. We addressed them with `NearLabel`, `branch_optimized`, and `compare_and_branch_optimized`. They provide a higher level of abstraction which helps writing better code without knowing all the details, which at least I instantly forget after looking into the manual.

In x86 `MacroAssembler` there are `jcc` and `jccb` for this. When `MacroAssembler` can make `jcc`, it would, but that requires the jump target to be already bound, so that jump offset is already known. For *forward* jumps, though, `MacroAssembler` cannot know this, so in those cases we need to tell it explicitly. `NearLabel` looks like another way of doing so.
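
A toy model of the bound versus forward distinction, written in Java rather than taken from HotSpot: all names (`JumpEncodingSketch`, `sizeForBoundTarget`, `sizeForUnboundTarget`) are hypothetical. The only x86 facts assumed are that a short conditional jump is 2 bytes with an 8-bit displacement, a near one is 6 bytes with a 32-bit displacement, and the displacement is measured from the end of the jump instruction.

```java
public class JumpEncodingSketch {
    static final int SHORT_JCC_SIZE = 2; // opcode + rel8
    static final int NEAR_JCC_SIZE = 6;  // 0x0F prefix + opcode + rel32

    static boolean fitsInRel8(int displacement) {
        return displacement >= -128 && displacement <= 127;
    }

    // Target already bound (a backward jump): the displacement is known, so
    // the assembler can pick the short form on its own whenever it fits.
    static int sizeForBoundTarget(int jumpPos, int targetPos) {
        int displacement = targetPos - (jumpPos + SHORT_JCC_SIZE);
        return fitsInRel8(displacement) ? SHORT_JCC_SIZE : NEAR_JCC_SIZE;
    }

    // Target not yet bound (a forward jump): the displacement is unknown at
    // emission time, so the short form is used only if the caller asks for it.
    static int sizeForUnboundTarget(boolean callerRequestsShort) {
        return callerRequestsShort ? SHORT_JCC_SIZE : NEAR_JCC_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("backward, close:  " + sizeForBoundTarget(100, 90));
        System.out.println("backward, far:    " + sizeForBoundTarget(1000, 0));
        System.out.println("forward, default: " + sizeForUnboundTarget(false));
        System.out.println("forward, short:   " + sizeForUnboundTarget(true));
    }
}
```

In this model each forward jump that is explicitly marked short saves 4 bytes, which matches the "a bit more compact" effect described for the interpreter code.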

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From rrich at openjdk.java.net  Thu Feb 17 08:47:12 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 17 Feb 2022 08:47:12 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <J6JEzdWxYNSj3C2_eKjIi7YJ26YUVFeap1k55O6A3rk=.bf9b92cf-266b-4e95-bdc6-c6c333c54d94@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

> > On s390, being CISC too, there are similar issues. We addressed them with `NearLabel`, `branch_optimized`, and `compare_and_branch_optimized`. They provide a higher level of abstraction which helps writing better code without knowing all the details, which at least I instantly forget after looking into the manual.
> 
> In x86 `MacroAssembler` there are `jcc` and `jccb` for this. When `MacroAssembler` can make `jcc`, it would, but that requires the jump target to be already bound, so that jump offset is already known. For _forward_ jumps, though, `MacroAssembler` cannot know this, so in those cases we need to tell it explicitly. `NearLabel` looks like another way of doing so.

Yes it is another way of doing so. For me the intent is clearer. Also you can pass a `NearLabel` to an assembler method that takes a `Label` parameter and there you can optimize if the passed `Label` is actually a `NearLabel`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From shade at openjdk.java.net  Thu Feb 17 08:47:12 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 17 Feb 2022 08:47:12 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <J6JEzdWxYNSj3C2_eKjIi7YJ26YUVFeap1k55O6A3rk=.bf9b92cf-266b-4e95-bdc6-c6c333c54d94@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
 <J6JEzdWxYNSj3C2_eKjIi7YJ26YUVFeap1k55O6A3rk=.bf9b92cf-266b-4e95-bdc6-c6c333c54d94@github.com>
Message-ID: <4DVKrNLAlMWNwHrtup6OZNim_oH5WjYpJ1GClgFU5w0=.557f929e-10d0-48d7-b3e3-2e5cf2d98225@github.com>

On Thu, 17 Feb 2022 08:43:33 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

> Yes it is another way of doing so. For me the intent is clearer. Also you can pass a `NearLabel` to an assembler method that takes a `Label` parameter and there you can optimize if the passed `Label` is actually a `NearLabel`.

True. I would like to consider that out of scope for this PR, would you agree?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From rrich at openjdk.java.net  Thu Feb 17 08:53:16 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 17 Feb 2022 08:53:16 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <4DVKrNLAlMWNwHrtup6OZNim_oH5WjYpJ1GClgFU5w0=.557f929e-10d0-48d7-b3e3-2e5cf2d98225@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
 <J6JEzdWxYNSj3C2_eKjIi7YJ26YUVFeap1k55O6A3rk=.bf9b92cf-266b-4e95-bdc6-c6c333c54d94@github.com>
 <4DVKrNLAlMWNwHrtup6OZNim_oH5WjYpJ1GClgFU5w0=.557f929e-10d0-48d7-b3e3-2e5cf2d98225@github.com>
Message-ID: <9UaPruI3qc2BWhAtsCFzwqGybDqDWvlBRZfdvn5-K5U=.c6fe43a9-ad61-4fb2-bf32-d78a4d1a02c3@github.com>

On Thu, 17 Feb 2022 08:45:00 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> > Yes it is another way of doing so. For me the intent is clearer. Also you can pass a `NearLabel` to an assembler method that takes a `Label` parameter and there you can optimize if the passed `Label` is actually a `NearLabel`.
> 
> True. I would like to consider that out of scope for this PR, would you agree?

Of course.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From rrich at openjdk.java.net  Thu Feb 17 09:02:03 2022
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Thu, 17 Feb 2022 09:02:03 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <EKMdUTCAy6_HeUsJTo_bRYkk9oqfm7nOY-EZaKNRT2M=.53fdcd15-2cfa-4039-bdb4-8ab14192ddc0@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

Changes seem fine.

Richard.

-------------

Marked as reviewed by rrich (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7475

From mgronlun at openjdk.java.net  Thu Feb 17 10:28:04 2022
From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=)
Date: Thu, 17 Feb 2022 10:28:04 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v2]
In-Reply-To: <qmoxoovG_EkvqJcqcWuYumDABGKjt7toZU6efFb0Ul4=.f62f2920-3721-4749-9b55-5f28264c09f4@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <qmoxoovG_EkvqJcqcWuYumDABGKjt7toZU6efFb0Ul4=.f62f2920-3721-4749-9b55-5f28264c09f4@github.com>
Message-ID: <baiZLbXpFN5geyr1vCd-gyxggf4KgfG_vqWM0cbz-Ow=.f7616cba-ae75-4b04-a6d3-f729ef65456c@github.com>

On Thu, 17 Feb 2022 08:03:36 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing the guarantee.
>> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

Changes requested by mgronlun (Reviewer).

src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp line 90:

> 88:         JfrJavaSupport::abort(JfrJavaSupport::new_string(msg, jt), jt, false);
> 89:     }
> 90:     else {

The else block can be removed. Just put the guarantee inline with the other code.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From ayang at openjdk.java.net  Thu Feb 17 11:46:03 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 17 Feb 2022 11:46:03 GMT
Subject: RFR: 8281971: Remove unimplemented InstanceRefKlass::do_next
In-Reply-To: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
References: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
Message-ID: <NSRuMjoYdkEr5fu9mSha-NdzsqKl1EtNXusRdsYmhxM=.5e9cebeb-fdb5-4f86-ac2f-ddda5ee31ef3@github.com>

On Wed, 16 Feb 2022 15:10:26 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> Trivial change of removing dead code.
> 
> Test: build

Thanks for the review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7497

From ayang at openjdk.java.net  Thu Feb 17 11:46:04 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 17 Feb 2022 11:46:04 GMT
Subject: Integrated: 8281971: Remove unimplemented InstanceRefKlass::do_next
In-Reply-To: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
References: <RCcwmM89koa7GV1BPnfzzxmvOgNdzuh_aKoEFQ2Ez0A=.d0678c37-4742-4ab0-8d25-d04cc6a9dcbf@github.com>
Message-ID: <2xx8D-V_0yZ8M6ych4lMZbbHEr0Lwywyhz7yJW-2_NM=.af5cddf0-452c-4ac0-99df-f2e79902b22f@github.com>

On Wed, 16 Feb 2022 15:10:26 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> Trivial change of removing dead code.
> 
> Test: build

This pull request has now been integrated.

Changeset: 3b7a3cfc
Author:    Albert Mingkun Yang <ayang at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/3b7a3cfce345cc900e042c5378d35d1237bdcd78
Stats:     3 lines in 1 file changed: 0 ins; 3 del; 0 mod

8281971: Remove unimplemented InstanceRefKlass::do_next

Reviewed-by: dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/7497

From duke at openjdk.java.net  Thu Feb 17 12:01:56 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Thu, 17 Feb 2022 12:01:56 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for
 Tests jdk/incubator/vector/ [v2]
In-Reply-To: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
Message-ID: <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>

> `ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
> This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
> For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake have support for AVX512 BW.
> 
> Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available, and `kmovwl` otherwise.
> 
> As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with Intel Software Development Emulator:
> `sde -knl`: Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it.
> `sde -cnl`: Cannon Lake, has AVX512 BW, passes before and after code change.
> 
> Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.
> 
> @neliasso Thanks for the help!
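
The feature-based selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the `MacroAssembler` source: `CpuFeatures` and `select_kmov` are hypothetical names, and the function returns a mnemonic string instead of emitting code.

```cpp
#include <string>

// Hypothetical stand-in for the CPU feature query; in HotSpot that role is
// played by VM_Version::supports_avx512bw().
struct CpuFeatures {
  bool avx512bw;
};

// Mirrors the choice the description attributes to MacroAssembler::kmov:
// use the 64-bit opmask move only when AVX512BW is present, otherwise fall
// back to the 16-bit form.
std::string select_kmov(const CpuFeatures& cpu) {
  return cpu.avx512bw ? "kmovql" : "kmovwl";
}
```

On a Knights Landing-like configuration (`avx512bw == false`) the sketch picks `kmovwl`, so the assert on `supports_avx512bw()` is never reached.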

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  fix indentation

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7510/files
  - new: https://git.openjdk.java.net/jdk/pull/7510/files/9e4169fb..7636119d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7510&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7510&range=00-01

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7510.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7510/head:pull/7510

PR: https://git.openjdk.java.net/jdk/pull/7510

From redestad at openjdk.java.net  Thu Feb 17 15:28:57 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Thu, 17 Feb 2022 15:28:57 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v3]
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <7kC4xxWon70YnYlqH_KJFTa2eEJf-P3VQ1L9ahugJgk=.0943bcaa-b53d-4216-afa1-69496dac248a@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms; those implementations can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).
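
To make the relationship between the two methods concrete, here is a scalar sketch of the semantics described above. It is a plain C++ model, not the Java method or the intrinsic; the function names are chosen for illustration and the real implementation is vectorized and may differ in detail.

```cpp
#include <cstddef>
#include <cstdint>

// Number of leading bytes in [bytes, bytes + len) that are non-negative
// when read as signed 8-bit values, i.e. the length of the ASCII prefix.
std::size_t count_positives(const std::int8_t* bytes, std::size_t len) {
  std::size_t i = 0;
  while (i < len && bytes[i] >= 0) {
    ++i;
  }
  return i;
}

// The old query can be expressed through the new one: some byte is
// negative exactly when the positive prefix is shorter than the range.
bool has_negatives(const std::int8_t* bytes, std::size_t len) {
  return count_positives(bytes, len) != len;
}
```

A caller that gets back a count smaller than `len` can copy the ASCII prefix in bulk and start the slower decoding path at that offset, which is where the gain for inputs with an ASCII prefix would come from.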

Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 25 additional commits since the last revision:

 - Revert micro changes, split out to #7516
 - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives
 - Merge branch 'master' into count_positives
 - Restore partial vector checks in AVX2 and SSE intrinsic variants
 - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
 - Simplify changes to encodeUTF8
 - Fix little-endian error caught by testing
 - Reduce jumps in the ascii path
 - Remove unused tail_mask
 - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use count_positives)
 - ... and 15 more: https://git.openjdk.java.net/jdk/compare/1ca44ef9...531139a1

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7231/files
  - new: https://git.openjdk.java.net/jdk/pull/7231/files/c4bb3612..531139a1

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=01-02

  Stats: 10910 lines in 329 files changed: 7340 ins; 2150 del; 1420 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From zgu at openjdk.java.net  Thu Feb 17 15:52:16 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 17 Feb 2022 15:52:16 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend
In-Reply-To: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
Message-ID: <JN39I3H42NHx43U-YhBWFx7CnX2_h0hQvKLDjc1ZiAM=.6c829ee4-db8e-4bf4-ac23-abda28817d4c@github.com>

On Mon, 31 Jan 2022 08:12:02 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> NMT backend can be further simplified and cleaned out.
> 
> - some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
> - `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
> - when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
> - Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
> - `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
> - In some places we handle inputptr=NULL gracefully where we should assert instead
> - Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
> - MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
> 
> Testing:
> 
> - GHAs
> - manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
> - SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.
> 
> Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.

Overall looks good, a few minor comments.

src/hotspot/share/services/mallocSiteTable.cpp line 161:

> 159: // Access malloc site
> 160: MallocSite* MallocSiteTable::malloc_site(uint32_t marker) {
> 161:   uint16_t bucket_idx = bucket_idx_from_marker(marker);

Please restore assert on bucket_idx.
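
A sketch of what such a marker and the requested assert could look like. The bit layout (bucket index in the high half), the table size, and the helpers `build_marker` and `pos_idx_from_marker` are assumptions for illustration; only `bucket_idx_from_marker` is a name taken from the patch, and its real body may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical table size, for illustration only.
constexpr std::uint16_t kTableSize = 512;

// Packs the bucket index into the high 16 bits and the chain position into
// the low 16 bits of one 32-bit marker.
inline std::uint32_t build_marker(std::uint16_t bucket_idx, std::uint16_t pos_idx) {
  assert(bucket_idx < kTableSize && "bucket index out of range");
  return (static_cast<std::uint32_t>(bucket_idx) << 16) | pos_idx;
}

// Extracts the bucket index and checks it before it is used as an array index.
inline std::uint16_t bucket_idx_from_marker(std::uint32_t marker) {
  const std::uint16_t bucket_idx = static_cast<std::uint16_t>(marker >> 16);
  assert(bucket_idx < kTableSize && "invalid bucket index");
  return bucket_idx;
}

// Extracts the position within the bucket's chain.
inline std::uint16_t pos_idx_from_marker(std::uint32_t marker) {
  return static_cast<std::uint16_t>(marker & 0xFFFF);
}
```

Callers outside the table would only pass the `uint32_t` around; the assert sits in the one place that turns it back into an index.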

src/hotspot/share/services/mallocTracker.hpp line 296:

> 294:   NOT_LP64(uint32_t _alt_canary);
> 295:   const size_t _size;
> 296:   const uint32_t _mst_marker;

make mst_marker a struct? instead of opaque type.

src/hotspot/share/services/memTracker.hpp line 115:

> 113:   static inline void* record_free(void* memblock, NMT_TrackingLevel level) {
> 114:     // Never turned on
> 115:     if (level == NMT_off || memblock == NULL) {

Wanna add assert `memblock != NULL`?

-------------

Marked as reviewed by zgu (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7283

From duke at openjdk.java.net  Thu Feb 17 16:17:09 2022
From: duke at openjdk.java.net (duke)
Date: Thu, 17 Feb 2022 16:17:09 GMT
Subject: Withdrawn: 8277930: Add unsafe allocation event to jfr
In-Reply-To: <hTSvT0d63lXUYdW8Y4Gk_DjO6rX5RsBnTwKQg9-if64=.cac9f512-62d8-4e6d-a25c-33287635b82d@github.com>
References: <hTSvT0d63lXUYdW8Y4Gk_DjO6rX5RsBnTwKQg9-if64=.cac9f512-62d8-4e6d-a25c-33287635b82d@github.com>
Message-ID: <e9VeT14L9sk8FhmjAQEpkbNnabYgissn3nPFc2DMqp4=.e9d8c8cc-397d-47d8-96de-1e89eb6ae93d@github.com>

On Mon, 29 Nov 2021 12:06:02 GMT, xpbob <duke at openjdk.java.net> wrote:

> Unsafe is used in many Java frameworks.
> When the framework has an unsafe memory leak, there is no way to know what code is causing it.
> Add unsafe allocation event to jfr.
> Records the size and stack allocated.
> This event is off by default

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6591

From jbhateja at openjdk.java.net  Thu Feb 17 17:43:43 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 17 Feb 2022 17:43:43 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v6]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <oMnlIO5l_pU71SvWpOFppQ-7882cq32UOjKqWZckxM0=.0efd7853-b30d-488b-92c4-4a8ad0412fda@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> Following are the performance numbers of a JMH micro included with the patch
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | --
> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Fixing for windows failure.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/73674fe4..f35ed9cf

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=04-05

  Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From hseigel at openjdk.java.net  Thu Feb 17 19:17:34 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Thu, 17 Feb 2022 19:17:34 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large illegal
 options values
Message-ID: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>

Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
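
As a rough illustration of the idea (a standalone sketch, not the code in this PR; `parse_int_option` is a hypothetical name and the real parser in arguments.cpp also handles suffixes and other types), the value is parsed into a wider type first and rejected if it does not fit the option's type:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parses `text` as a decimal integer and stores it in *out only if the
// whole string is consumed and the value fits in an int. Returns false
// instead of silently truncating.
bool parse_int_option(const char* text, int* out) {
  if (text == nullptr || *text == '\0') {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  const long long v = std::strtoll(text, &end, 10);
  if (errno == ERANGE) {
    return false;  // does not even fit in long long
  }
  if (*end != '\0') {
    return false;  // no digits at all, or trailing garbage
  }
  if (v < INT_MIN || v > INT_MAX) {
    return false;  // a cast to int would truncate the value
  }
  *out = static_cast<int>(v);
  return true;
}
```

The key point is that the range check happens on the wide value, before any narrowing conversion takes place.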

The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.

Thanks, Harold

-------------

Commit messages:
 - 8281472: JVM options processing silently truncates large illegal options values

Changes: https://git.openjdk.java.net/jdk/pull/7522/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7522&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281472
  Stats: 148 lines in 3 files changed: 145 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7522/head:pull/7522

PR: https://git.openjdk.java.net/jdk/pull/7522

From kvn at openjdk.java.net  Thu Feb 17 20:07:14 2022
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 17 Feb 2022 20:07:14 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for
 Tests jdk/incubator/vector/ [v2]
In-Reply-To: <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
 <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
Message-ID: <Vrh8Sk9NGYP8YnH9Dv4cOwG5qU0u93_A3NK_D4ieK2Q=.28f12483-fb57-4ddc-9342-31abcba4374f@github.com>

On Thu, 17 Feb 2022 12:01:56 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> `ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
>> This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
>> For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake have support for AVX512 BW.
>> 
>> Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available, and `kmovwl` otherwise.
>> 
>> As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with Intel Software Development Emulator:
>> `sde -knl`: Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it.
>> `sde -cnl`: Cannon Lake, has AVX512 BW, passes before and after code change.
>> 
>> Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.
>> 
>> @neliasso Thanks for the help!
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix indentation

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7510

From neliasso at openjdk.java.net  Thu Feb 17 20:07:14 2022
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Thu, 17 Feb 2022 20:07:14 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for
 Tests jdk/incubator/vector/ [v2]
In-Reply-To: <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
 <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
Message-ID: <Q5adA1YTFVDGwhi19AK6KGxB4OW-dqjmIgXRlL9vA2Y=.5a649118-ceab-4f0e-ad4b-a7d051c50e63@github.com>

On Thu, 17 Feb 2022 12:01:56 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> `ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
>> This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
>> For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake have support for AVX512 BW.
>> 
>> Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available, and `kmovwl` otherwise.
>> 
>> As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with Intel Software Development Emulator:
>> `sde -knl`: Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it.
>> `sde -cnl`: Cannon Lake, has AVX512 BW, passes before and after code change.
>> 
>> Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.
>> 
>> @neliasso Thanks for the help!
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix indentation

Looks good!

-------------

Marked as reviewed by neliasso (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7510

From dholmes at openjdk.java.net  Thu Feb 17 22:32:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 17 Feb 2022 22:32:11 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
Message-ID: <KISrp8xGgyrs47F8yAF0YEW-j7pFiYHcdkSHzB4hSKI=.7638c0bd-2a29-44c1-a6a2-5be099e57cbf@github.com>

On Thu, 17 Feb 2022 19:09:26 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
> 
> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
> 
> Thanks, Harold

src/hotspot/share/runtime/arguments.cpp line 874:

> 872:     if (v > (julong)max_juint + 1) {
> 873:       return false;
> 874:     }

This seems very suspicious. Where is the code that depends on this?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From dholmes at openjdk.java.net  Thu Feb 17 22:57:50 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 17 Feb 2022 22:57:50 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
Message-ID: <B_VQIoP164ASEtyx1cVu1JfO3ftpr-hzc02qyy28pLE=.8150cc37-5b14-4a8d-a90f-7e6e6f8cabbd@github.com>

On Thu, 17 Feb 2022 19:09:26 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
> 
> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
> 
> Thanks, Harold

A gtest would seem far simpler to write and allow for easy checking of all the interesting boundary cases:
- 0 +/- 1
- max jint +/- 1
-  min jint +/- 1
- etc
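
A table-driven version of those boundary cases might look like the sketch below. It uses plain functions instead of gtest `TEST` macros to stay self-contained, and `fits_jint` is a hypothetical predicate standing in for whatever range check the patch ends up exposing.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical predicate under test: does `v` fit in a 32-bit signed int?
bool fits_jint(std::int64_t v) {
  return v >= std::numeric_limits<std::int32_t>::min() &&
         v <= std::numeric_limits<std::int32_t>::max();
}

struct Case {
  std::int64_t value;
  bool expected;
};

// Each interesting limit together with its neighbours on either side.
std::vector<Case> jint_boundary_cases() {
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  return {
    {-1, true},     {0, true},  {1, true},
    {hi - 1, true}, {hi, true}, {hi + 1, false},
    {lo - 1, false}, {lo, true}, {lo + 1, true},
  };
}

// Number of cases where the predicate disagrees with the table.
int count_failures() {
  int failures = 0;
  for (const Case& c : jint_boundary_cases()) {
    if (fits_jint(c.value) != c.expected) {
      ++failures;
    }
  }
  return failures;
}
```

In a real gtest each row would become an `EXPECT_EQ`, and the same table shape extends to the unsigned and 64-bit option types.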

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From ccheung at openjdk.java.net  Thu Feb 17 23:24:05 2022
From: ccheung at openjdk.java.net (Calvin Cheung)
Date: Thu, 17 Feb 2022 23:24:05 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v4]
In-Reply-To: <vr6Kx9et3LNNBT76J3vEav7eYlVk9rmdzmd4CPVlzH0=.40bd2ef0-edba-4c4a-a36d-86e72b7a0079@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
 <vr6Kx9et3LNNBT76J3vEav7eYlVk9rmdzmd4CPVlzH0=.40bd2ef0-edba-4c4a-a36d-86e72b7a0079@github.com>
Message-ID: <ZNXfiilUk7SSsjyBRUvvr8sk-MhZyJiq-A_PYe5HiwQ=.5b1327d9-7047-4b64-bc04-ec82e76406fc@github.com>

On Wed, 19 Jan 2022 05:47:57 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> **Background:**
>> 
>> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
>> 
>> 
>> public enum Day {  SUNDAY, MONDAY ... } 
>> 
>> 
>> to
>> 
>> 
>> public class Day extends java.lang.Enum {
>>     public static final Day SUNDAY = new Day("SUNDAY");
>>     public static final Day MONDAY = new Day("MONDAY"); ...
>> }
>> 
>> 
>> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
>> 
>> **Fix:**
>> 
>> During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
>> 
>> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
>> 
>> **Verification:**
>> 
>> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
>> 
>> **Testing:**
>> 
>> Passed Oracle CI tiers 1-4. Will run tier 5 as well.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use InstanceKlass::do_local_static_fields for some field iterations

Looks good. Minor comment below.
Also, several files with copyright year 2021 need updating.

src/hotspot/share/cds/cdsHeapVerifier.cpp line 63:

> 61: // class Bar {
> 62: //     // this field is initialized in both CDS dump time and runtime.
> 63: //     static final Bar bar = new Bar;

`new Bar` should be `new Bar()`?

-------------

Marked as reviewed by ccheung (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6653

From duke at openjdk.java.net  Fri Feb 18 05:44:34 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Fri, 18 Feb 2022 05:44:34 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v3]
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <enpXnRowRytWp1G8YAZ_7PC_o-2jKgFqOCUcm5KReCo=.ef081f84-5468-40ba-8851-9761f03c903e@github.com>

> I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing the guarantee.
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
> because there is no space left on the device.
> Could you please review the fix?

KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:

  8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7227/files
  - new: https://git.openjdk.java.net/jdk/pull/7227/files/c2ad1c39..561cce33

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=01-02

  Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7227.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7227/head:pull/7227

PR: https://git.openjdk.java.net/jdk/pull/7227

From duke at openjdk.java.net  Fri Feb 18 05:44:35 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Fri, 18 Feb 2022 05:44:35 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v2]
In-Reply-To: <baiZLbXpFN5geyr1vCd-gyxggf4KgfG_vqWM0cbz-Ow=.f7616cba-ae75-4b04-a6d3-f729ef65456c@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <qmoxoovG_EkvqJcqcWuYumDABGKjt7toZU6efFb0Ul4=.f62f2920-3721-4749-9b55-5f28264c09f4@github.com>
 <baiZLbXpFN5geyr1vCd-gyxggf4KgfG_vqWM0cbz-Ow=.f7616cba-ae75-4b04-a6d3-f729ef65456c@github.com>
Message-ID: <2b60oveZnuqrsYdJ7JapCles3df7i5PKH3ofXyy77s8=.145a285a-c169-4a55-89de-cb882312efdb@github.com>

On Thu, 17 Feb 2022 10:24:19 GMT, Markus Grönlund <mgronlun at openjdk.org> wrote:

>> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.
>
> src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp line 90:
> 
>> 88:         JfrJavaSupport::abort(JfrJavaSupport::new_string(msg, jt), jt, false);
>> 89:     }
>> 90:     else {
> 
> The else block can be removed. Just put the guarantee inline with the other code.

Thank you so much. You're right. I removed the else block.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From thartmann at openjdk.java.net  Fri Feb 18 08:13:53 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Fri, 18 Feb 2022 08:13:53 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for
 Tests jdk/incubator/vector/ [v2]
In-Reply-To: <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
 <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
Message-ID: <C8qTYm47Gm8Yt3STYwG88cYo9XvfCrH3jSpkiGJGAts=.dd893456-6c60-4405-bc88-690c1fab1209@github.com>

On Thu, 17 Feb 2022 12:01:56 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

>> `ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
>> This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
>> For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake have support for AVX512 BW.
>> 
>> Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available, and `kmovwl` otherwise.
>> 
>> As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with Intel Software Development Emulator:
>> `sde -knl`: Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it.
>> `sde -cnl`: Cannon Lake, has AVX512 BW, passes before and after code change.
>> 
>> Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.
>> 
>> @neliasso Thanks for the help!
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix indentation

Looks good.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7510

From mgronlun at openjdk.java.net  Fri Feb 18 11:23:51 2022
From: mgronlun at openjdk.java.net (Markus Grönlund)
Date: Fri, 18 Feb 2022 11:23:51 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v3]
In-Reply-To: <enpXnRowRytWp1G8YAZ_7PC_o-2jKgFqOCUcm5KReCo=.ef081f84-5468-40ba-8851-9761f03c903e@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <enpXnRowRytWp1G8YAZ_7PC_o-2jKgFqOCUcm5KReCo=.ef081f84-5468-40ba-8851-9761f03c903e@github.com>
Message-ID: <_Ozu_sZZUPH-7vFaMfyzJFBv5WxQicM9asfJ9dK_jzg=.e9c28828-ca74-49e6-b0e8-33e79ea8a086@github.com>

On Fri, 18 Feb 2022 05:44:34 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops the JVM, as shown below,
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
>> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

diff --git a/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp b/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
index 95b96e02c06..015d4ebe065 100644
--- a/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
+++ b/src/hotspot/share/jfr/jni/jfrJavaSupport.cpp
@@ -563,14 +563,16 @@ void JfrJavaSupport::throw_runtime_exception(const char* message, TRAPS) {
 
 void JfrJavaSupport::abort(jstring errorMsg, JavaThread* t) {
   DEBUG_ONLY(check_java_thread_in_vm(t));
-
   ResourceMark rm(t);
-  const char* const error_msg = c_str(errorMsg, t);
-  if (error_msg != NULL) {
-    log_error(jfr, system)("%s",error_msg);
+  abort(c_str(errorMsg, t));
+}
+
+void JfrJavaSupport::abort(const char* error_msg, bool dump_core /* true */) {
+  if (error_msg != nullptr) {
+    log_error(jfr, system)("%s", error_msg);
   }
   log_error(jfr, system)("%s", "An irrecoverable error in Jfr. Shutting down VM...");
-  vm_abort();
+  vm_abort(dump_core);
 }
 
 JfrJavaSupport::CAUSE JfrJavaSupport::_cause = JfrJavaSupport::VM_ERROR;
diff --git a/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp b/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
index 53d6eed68a8..1ec5a884b4b 100644
--- a/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
+++ b/src/hotspot/share/jfr/jni/jfrJavaSupport.hpp
@@ -112,6 +112,7 @@ class JfrJavaSupport : public AllStatic {
 
   // critical
   static void abort(jstring errorMsg, TRAPS);
+  static void abort(const char* error_msg, bool dump_core = true);
   static void uncaught_exception(jthrowable throwable, JavaThread* t);
 
   // asserts
diff --git a/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp b/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
index 3a7ec286381..73404a1aede 100644
--- a/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
+++ b/src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp
@@ -25,8 +25,8 @@
 #ifndef SHARE_JFR_WRITERS_JFRSTREAMWRITERHOST_INLINE_HPP
 #define SHARE_JFR_WRITERS_JFRSTREAMWRITERHOST_INLINE_HPP
 
+#include "jfr/jni/jfrJavaSupport.hpp"
 #include "jfr/writers/jfrStreamWriterHost.hpp"
-
 #include "runtime/os.hpp"
 
 template <typename Adapter, typename AP>
@@ -77,6 +77,9 @@ inline void StreamWriterHost<Adapter, AP>::write_bytes(const u1* buf, intptr_t l
   while (len > 0) {
     const unsigned int nBytes = len > INT_MAX ? INT_MAX : (unsigned int)len;
     const ssize_t num_written = os::write(_fd, buf, nBytes);
+    if (errno == ENOSPC) {
+      JfrJavaSupport::abort("Failed to write to jfr stream because no space left on device", false);
+    }
     guarantee(num_written > 0, "Nothing got written, or os::write() failed");
     _stream_pos += num_written;
     len -= num_written;

src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp line 88:

> 86:         JavaThread* jt = JavaThread::current();
> 87:         ThreadInVMfromNative transition(jt);
> 88:         JfrJavaSupport::abort(JfrJavaSupport::new_string(msg, jt), jt, false);

Hi again Takuya, I'm sorry, but I should have noticed this earlier:

I now see that the code needs to allocate a Java string oop to conform to the existing abort function signature, which caters to invocations from Java. Then abort() immediately strips out the c-string from the oop. To be correct, the headers logging/log.hpp and runtime/thread.inline.hpp would also need to be included.

I believe we can simplify this by updating the abort() signature so that we don't need to drag in those extra dependencies. Please see my following comment where I suggest a way to do this.

Thanks for your patience
Markus
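
For readers following the thread, the general pattern of telling "no space left on device" apart from other write failures can be sketched in plain POSIX C++ as below. This is only an illustration, not the JFR code: `write_fully` and `WriteStatus` are hypothetical names, and the sketch assumes a raw `write(2)` call rather than HotSpot's `os::write()`. It inspects errno only after the call has reported a failure, since errno is not meaningful after a successful call.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstddef>
#include <unistd.h>

// Hypothetical result type for this sketch.
enum class WriteStatus { Ok, NoSpace, Failed };

// Writes all `len` bytes of `buf` to `fd`, in chunks of at most INT_MAX bytes.
// errno is consulted only when write() returned -1.
inline WriteStatus write_fully(int fd, const unsigned char* buf, size_t len) {
  assert(buf != nullptr || len == 0);
  while (len > 0) {
    const size_t chunk = len > INT_MAX ? static_cast<size_t>(INT_MAX) : len;
    const ssize_t num_written = ::write(fd, buf, chunk);
    if (num_written < 0) {
      if (errno == EINTR) {
        continue;  // interrupted before anything was written, retry
      }
      // ENOSPC is the case where dumping core would be pointless.
      return errno == ENOSPC ? WriteStatus::NoSpace : WriteStatus::Failed;
    }
    if (num_written == 0) {
      return WriteStatus::Failed;  // nothing got written, avoid spinning forever
    }
    buf += num_written;
    len -= static_cast<size_t>(num_written);
  }
  return WriteStatus::Ok;
}
```

A caller could map `NoSpace` to an orderly shutdown without a core dump and keep a hard failure for everything else.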

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From hseigel at openjdk.java.net  Fri Feb 18 13:30:52 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 18 Feb 2022 13:30:52 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <KISrp8xGgyrs47F8yAF0YEW-j7pFiYHcdkSHzB4hSKI=.7638c0bd-2a29-44c1-a6a2-5be099e57cbf@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
 <KISrp8xGgyrs47F8yAF0YEW-j7pFiYHcdkSHzB4hSKI=.7638c0bd-2a29-44c1-a6a2-5be099e57cbf@github.com>
Message-ID: <DUpqP1p7ZLgqu-XWeCKF4WeOp5Gtj_VefJe3sPztULA=.86daa5dd-fe4d-4bd5-b6ff-ec42f6e0ffc1@github.com>

On Thu, 17 Feb 2022 22:28:56 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
>> 
>> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
>> 
>> Thanks, Harold
>
> src/hotspot/share/runtime/arguments.cpp line 874:
> 
>> 872:     if (v > (julong)max_juint + 1) {
>> 873:       return false;
>> 874:     }
> 
> This seems very suspicious. Where is the code that depends on this?

Test test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java has the following code:


    // 4294967295 == (unsigned int) -1
    // So setting ParallelGCThreads=4294967295 should give back 4294967295
    // and setting ParallelGCThreads=4294967296 should give back 0. (SerialGC is ok with ParallelGCThreads=0)
    for (long i = 4294967295L; i <= 4294967296L; i++) {
      long count = getParallelGCThreadCount(
          "-XX:+UseSerialGC",
          "-XX:ParallelGCThreads=" + i,
          "-XX:+PrintFlagsFinal",
          "-version");
      Asserts.assertEQ(count, i % 4294967296L, "Specifying ParallelGCThreads=" + i + " does not set the thread count properly!");
    }
  }

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From dholmes at openjdk.java.net  Fri Feb 18 13:50:51 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 18 Feb 2022 13:50:51 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <DUpqP1p7ZLgqu-XWeCKF4WeOp5Gtj_VefJe3sPztULA=.86daa5dd-fe4d-4bd5-b6ff-ec42f6e0ffc1@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
 <KISrp8xGgyrs47F8yAF0YEW-j7pFiYHcdkSHzB4hSKI=.7638c0bd-2a29-44c1-a6a2-5be099e57cbf@github.com>
 <DUpqP1p7ZLgqu-XWeCKF4WeOp5Gtj_VefJe3sPztULA=.86daa5dd-fe4d-4bd5-b6ff-ec42f6e0ffc1@github.com>
Message-ID: <SABbCYGSCdhaQ-Rtnr7W1NNprWB0toHIps7pX_Vl78w=.0888af98-abe7-4570-9e66-e2bf0e0ddd51@github.com>

On Fri, 18 Feb 2022 13:27:40 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

>> src/hotspot/share/runtime/arguments.cpp line 874:
>> 
>>> 872:     if (v > (julong)max_juint + 1) {
>>> 873:       return false;
>>> 874:     }
>> 
>> This seems very suspicious. Where is the code that depends on this?
>
> Test test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java has the following code:
> 
> 
>     // 4294967295 == (unsigned int) -1
>     // So setting ParallelGCThreads=4294967295 should give back 4294967295
>     // and setting ParallelGCThreads=4294967296 should give back 0. (SerialGC is ok with ParallelGCThreads=0)
>     for (long i = 4294967295L; i <= 4294967296L; i++) {
>       long count = getParallelGCThreadCount(
>           "-XX:+UseSerialGC",
>           "-XX:ParallelGCThreads=" + i,
>           "-XX:+PrintFlagsFinal",
>           "-version");
>       Asserts.assertEQ(count, i % 4294967296L, "Specifying ParallelGCThreads=" + i + " does not set the thread count properly!");
>     }
>   }

That test seems bizarre to me - perhaps someone from GC team can comment on why it expects to see what it sees. But I would not suggest we cripple the argument processing logic just because one test expects it to behave in a strange way.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From redestad at openjdk.java.net  Fri Feb 18 15:57:40 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Fri, 18 Feb 2022 15:57:40 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v4]
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <GcSawWkc8_paTY2GS3mVA6ZKWgqdEOVFErLVR1YATOw=.cc22549d-713f-4379-9340-6ebe33afe804@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range that are all positive. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to structure `StringCoding.hasNegatives` and `countPositives` so that the non-intrinsified method calls into the intrinsified one. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs; for example, `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner-case regressions with little real-world impact (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:

  Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step.
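
The relationship between the two methods described above can be sketched in scalar C++ as follows. This is a minimal illustration of the semantics as stated in the description, not the intrinsic or the JDK's Java code; `count_positives` and `has_negatives` are stand-in names, and "positive" is assumed to mean "sign bit clear" (so zero counts too).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the number of leading bytes in bytes[off, off + len) whose sign bit
// is clear, i.e. the length of the ASCII prefix of that range.
inline size_t count_positives(const int8_t* bytes, size_t off, size_t len) {
  assert(bytes != nullptr || len == 0);
  for (size_t i = 0; i < len; i++) {
    if (bytes[off + i] < 0) {
      return i;  // first byte with the sign bit set ends the prefix
    }
  }
  return len;
}

// The old query can be answered from the new one: a range contains a
// negative byte exactly when the positive prefix is shorter than the range.
inline bool has_negatives(const int8_t* bytes, size_t off, size_t len) {
  return count_positives(bytes, off, len) != len;
}
```

A caller that gets a prefix length shorter than `len` can copy the prefix as-is and start the slower decoding path at the first non-ASCII byte.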

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7231/files
  - new: https://git.openjdk.java.net/jdk/pull/7231/files/531139a1..a5e28b32

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=02-03

  Stats: 59 lines in 8 files changed: 7 ins; 14 del; 38 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From iklam at openjdk.java.net  Fri Feb 18 18:50:54 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 18 Feb 2022 18:50:54 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
Message-ID: <jSF0QONR5MXdP-tt5K50jR1M5saYf17vKkVrQWE0rM4=.5e2dca4c-7181-42a9-8dae-64c4d576d06c@github.com>

On Thu, 17 Feb 2022 19:09:26 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
> 
> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
> 
> Thanks, Harold

src/hotspot/share/runtime/arguments.cpp line 889:

> 887:     // -9223372036854775808.  Negating intx_v for such values will erroneously
> 888:     // make them positive.
> 889:     if (is_neg && intx_v > 0) {

I found it hard to reason about the casts such as `(uintx)(min_intx)`, even though they appear to be correct. I think this will be simpler and more readable:


intx_v = (intx) v;
if (is_neg) {
  intx_v = - intx_v;
  if (intx_v > 0) {
    return false; // underflow
  }
} else {
  if (intx_v < 0) {
    return false; // overflow
  }
}

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From iklam at openjdk.java.net  Fri Feb 18 18:56:47 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 18 Feb 2022 18:56:47 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <SABbCYGSCdhaQ-Rtnr7W1NNprWB0toHIps7pX_Vl78w=.0888af98-abe7-4570-9e66-e2bf0e0ddd51@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
 <KISrp8xGgyrs47F8yAF0YEW-j7pFiYHcdkSHzB4hSKI=.7638c0bd-2a29-44c1-a6a2-5be099e57cbf@github.com>
 <DUpqP1p7ZLgqu-XWeCKF4WeOp5Gtj_VefJe3sPztULA=.86daa5dd-fe4d-4bd5-b6ff-ec42f6e0ffc1@github.com>
 <SABbCYGSCdhaQ-Rtnr7W1NNprWB0toHIps7pX_Vl78w=.0888af98-abe7-4570-9e66-e2bf0e0ddd51@github.com>
Message-ID: <-6SnWhIU0I645Nd2EZZtOhJ4nbCx0VOwPjt23mhlco4=.8b1ac953-6f96-48b7-9a87-2b16e2f203f6@github.com>

On Fri, 18 Feb 2022 13:47:19 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Test test/hotspot/jtreg/gc/arguments/TestParallelGCThreads.java has the following code:
>> 
>> 
>>     // 4294967295 == (unsigned int) -1
>>     // So setting ParallelGCThreads=4294967295 should give back 4294967295
>>     // and setting ParallelGCThreads=4294967296 should give back 0. (SerialGC is ok with ParallelGCThreads=0)
>>     for (long i = 4294967295L; i <= 4294967296L; i++) {
>>       long count = getParallelGCThreadCount(
>>           "-XX:+UseSerialGC",
>>           "-XX:ParallelGCThreads=" + i,
>>           "-XX:+PrintFlagsFinal",
>>           "-version");
>>       Asserts.assertEQ(count, i % 4294967296L, "Specifying ParallelGCThreads=" + i + " does not set the thread count properly!");
>>     }
>>   }
>
> That test seems bizarre to me - perhaps someone from GC team can comment on why it expects to see what it sees. But I would not suggest we cripple the argument processing logic just because one test expects it to behave in a strange way.

Setting ParallelGCThreads=4294967296 isn't a reasonable real-life use case. There's no need to check for the JVM's behavior under this situation (which now becomes illegal).

There's already a test for  "SerialGC is ok with ParallelGCThreads=0" on line 100 of this file. So the block quoted by Harold should be removed.
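
To make the behaviour under discussion concrete, here is a small standalone sketch of why 4294967296 "gives back 0" when a 64-bit value is narrowed silently, next to a strict range check. The function names are hypothetical and this is not the code in arguments.cpp; it only assumes a 32-bit unsigned option type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Silent narrowing: the value is reduced modulo 2^32.
inline uint32_t truncate_to_uint32(uint64_t v) {
  return static_cast<uint32_t>(v);
}

// Strict variant: reject anything that does not fit in 32 unsigned bits.
inline bool checked_to_uint32(uint64_t v, uint32_t* result) {
  assert(result != nullptr);
  if (v > std::numeric_limits<uint32_t>::max()) {
    return false;  // out of range for the option's type
  }
  *result = static_cast<uint32_t>(v);
  return true;
}
```

With the strict check, 4294967295 is still accepted, while 4294967296 is rejected instead of being turned into 0.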

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From hseigel at openjdk.java.net  Fri Feb 18 19:29:53 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 18 Feb 2022 19:29:53 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <jSF0QONR5MXdP-tt5K50jR1M5saYf17vKkVrQWE0rM4=.5e2dca4c-7181-42a9-8dae-64c4d576d06c@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
 <jSF0QONR5MXdP-tt5K50jR1M5saYf17vKkVrQWE0rM4=.5e2dca4c-7181-42a9-8dae-64c4d576d06c@github.com>
Message-ID: <QtGdz9h4LPbhkR0-I-aM7LYO6npYUdwj0hzmAiqOZxs=.49351ff3-4666-4a0d-8f5c-4764f81e3c32@github.com>

On Fri, 18 Feb 2022 18:47:30 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
>> 
>> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
>> 
>> Thanks, Harold
>
> src/hotspot/share/runtime/arguments.cpp line 889:
> 
>> 887:     // -9223372036854775808.  Negating intx_v for such values will erroneously
>> 888:     // make them positive.
>> 889:     if (is_neg && intx_v > 0) {
> 
> I found it hard to reason about the casts such as `(uintx)(min_intx)`, even though they appear to be correct. I think this will be simpler and more readable:
> 
> 
> intx_v = (intx) v;
> if (is_neg) {
>   intx_v = - intx_v;
>   if (intx_v > 0) {
>     return false; // underflow
>   }
> } else {
>   if (intx_v < 0) {
>     return false; // overflow
>   }
> }

That doesn't work for intx options set to min_intx, such as MaxJNILocalCapacity=-9223372036854775808.  Perhaps min_intx should be special cased?
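
One way to special-case the minimum value, sketched here under stated assumptions and not taken from the patch: do the range check on the unsigned magnitude before any negation, so that the most negative value never has to be negated as a signed number. `magnitude_to_int64` is a hypothetical name, and the sketch assumes a 64-bit `intx`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Converts a parsed magnitude plus sign flag into a signed 64-bit value.
// Returns false when the value does not fit; *result is left untouched then.
inline bool magnitude_to_int64(uint64_t magnitude, bool is_neg, int64_t* result) {
  assert(result != nullptr);
  const uint64_t max_pos =
      static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
  if (!is_neg) {
    if (magnitude > max_pos) {
      return false;  // overflow
    }
    *result = static_cast<int64_t>(magnitude);
    return true;
  }
  if (magnitude > max_pos + 1) {
    return false;  // underflow
  }
  if (magnitude == max_pos + 1) {
    // Exactly 2^63: representable only as the minimum value, and negating
    // it as a signed number would overflow, so handle it separately.
    *result = std::numeric_limits<int64_t>::min();
    return true;
  }
  *result = -static_cast<int64_t>(magnitude);
  return true;
}
```

With this shape, -9223372036854775808 is accepted, while 9223372036854775808 and -9223372036854775809 are both rejected.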

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From ioi.lam at oracle.com  Fri Feb 18 19:31:20 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 18 Feb 2022 11:31:20 -0800
Subject: [RFC containers] 8281571 Do not use CPU Shares to compute active
 processor count
Message-ID: <554c30f8-5d5d-8d98-4e1a-2883cf833f94@oracle.com>

Hi Folks,

I have filed the CSR https://bugs.openjdk.java.net/browse/JDK-8281571

Summary

Modify HotSpot's Linux-only container detection code to not use CPU Shares
(the "cpu.shares" file with cgroupv1 or "cpu.weight" file with cgroupv2,
exposed through the CgroupSubsystem::cpu_shares() API) to limit the number of
active processors that can be used by the JVM. Add a new flag (immediately
deprecated), UseContainerCpuShares, to restore the old behaviour; and
deprecate the existing PreferContainerQuotaForCPUCount flag.

Please refer to the CSR for the reasons for making this change, as well as
ways to address compatibility risks.

If you have any concerns, please let me know. Otherwise I plan to move the
CSR to "finalized" state and start RFR in two weeks.

Thanks to Severin Gehwolf and David Holmes for contributing to the CSR.

Best Regards
- Ioi


From duke at openjdk.java.net  Sat Feb 19 08:04:52 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Sat, 19 Feb 2022 08:04:52 GMT
Subject: RFR: 8281544: assert(VM_Version::supports_avx512bw()) failed for
 Tests jdk/incubator/vector/ [v2]
In-Reply-To: <Q5adA1YTFVDGwhi19AK6KGxB4OW-dqjmIgXRlL9vA2Y=.5a649118-ceab-4f0e-ad4b-a7d051c50e63@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
 <MkCnJurj0-UxLlqU8dgicJKsvE39hvJl63VHvi_wm8g=.97e9ed92-16a0-4980-948a-524fd8bb05fd@github.com>
 <Q5adA1YTFVDGwhi19AK6KGxB4OW-dqjmIgXRlL9vA2Y=.5a649118-ceab-4f0e-ad4b-a7d051c50e63@github.com>
Message-ID: <lmoGVDdFE-oSMaycj1cZ85s7jP82Z9MDOm56JfzTjDE=.813b01b8-121a-484b-963d-75db4506c57f@github.com>

On Thu, 17 Feb 2022 20:03:38 GMT, Nils Eliasson <neliasso at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix indentation
>
> Looks good!

Thanks @neliasso @TobiHartmann @vnkozlov for the reviews!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7510

From shade at openjdk.java.net  Mon Feb 21 06:03:50 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 21 Feb 2022 06:03:50 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <GLVUYQGrDmGTRLj7Mo9qNjayKRBMbc6jG0zYcjnENL8=.26f94daf-90eb-4b83-9085-363b53c9d517@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

Anyone else? :)
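
For readers less familiar with x86 encodings, the size saving behind short jumps can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical and nothing here is HotSpot's `MacroAssembler` API. It relies on the x86 encoding rule that a short jump carries a signed 8-bit displacement (2 bytes in total), while the near forms carry a 32-bit displacement (5 bytes for `jmp`, 6 bytes for `jcc`).

```cpp
#include <cassert>
#include <cstdint>

// True when the displacement (relative to the end of the jump instruction)
// fits the signed 8-bit field of a short jump.
inline bool fits_short_jump(int64_t disp) {
  return disp >= INT8_MIN && disp <= INT8_MAX;
}

// Encoded size in bytes of an unconditional jump: EB cb vs. E9 cd.
inline int jmp_size(int64_t disp) {
  assert(disp >= INT32_MIN && disp <= INT32_MAX);
  return fits_short_jump(disp) ? 2 : 5;
}

// Encoded size in bytes of a conditional jump: 7x cb vs. 0F 8x cd.
inline int jcc_size(int64_t disp) {
  assert(disp >= INT32_MIN && disp <= INT32_MAX);
  return fits_short_jump(disp) ? 2 : 6;
}
```

When only a few moves sit between a jump and its target, the displacement is far below 127 bytes, so each conditional jump can shrink from 6 bytes to 2.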

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From dholmes at openjdk.java.net  Mon Feb 21 06:17:49 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 21 Feb 2022 06:17:49 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <eDBD63MmL067rIqyJoQFm3NO3Py9r06bxnNjrlMv8sc=.73e494b8-1f9c-420d-a36e-5a98c3155d71@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

Seems quite reasonable based on the description.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7475

From jiefu at openjdk.java.net  Mon Feb 21 06:17:49 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 21 Feb 2022 06:17:49 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <BWgWfDIb0v_cMQ8-Qe3sKmZE6QwSKy2b7Ev9k-Efkbs=.90b83cb8-d3e8-4dc9-9976-796ebd7c9844@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

Please also update the copyright year.

-------------

Marked as reviewed by jiefu (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7475

From shade at openjdk.java.net  Mon Feb 21 06:17:49 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 21 Feb 2022 06:17:49 GMT
Subject: RFR: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <-P4AB1UaiQODwFcuprTb3Mh2ChfJn6fZ6sct1x2aE58=.a12703e8-f46e-4e35-8ce2-ab8d08082380@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

Thank you!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From shade at openjdk.java.net  Mon Feb 21 06:17:50 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 21 Feb 2022 06:17:50 GMT
Subject: Integrated: 8281815: x86: Use short jumps in
 TIG::generate_slow_signature_handler
In-Reply-To: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
References: <gAs8uLzOKE1kigXwpBvJjEYlQXMH8fSArB6FRnpdi0I=.a522a4a9-5027-4dab-afa3-2434f8dc54ca@github.com>
Message-ID: <HwwKUcQKMRh58t1XMgdYgdZXCm4vgEo6FU1rNyYLt2U=.9d1a1ea3-50c1-409c-a941-9118b75cc121@github.com>

On Tue, 15 Feb 2022 09:40:28 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Similar to [JDK-8281744](https://bugs.openjdk.java.net/browse/JDK-8281744), this change improves `TemplateInterpreterGenerator::generate_slow_signature_handler`: there are only a few moves between the jumps, and we can tell `MacroAssembler` those can be short. This code is used to process arguments after the slow call to VM, so the performance improvement is drowned by the call itself. This makes interpreter code a bit more compact, though.
> 
> Additional testing:
>  - [x] Linux x86_64 fastdebug `hotspot:tier1`
>  - [x] Linux x86_32 fastdebug `hotspot:tier1`

This pull request has now been integrated.

Changeset: d28b048f
Author:    Aleksey Shipilev <shade at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/d28b048f35d5893187076e853a4a898d5ca8b220
Stats:     6 lines in 1 file changed: 0 ins; 0 del; 6 mod

8281815: x86: Use short jumps in TIG::generate_slow_signature_handler

Reviewed-by: rrich, dholmes, jiefu

-------------

PR: https://git.openjdk.java.net/jdk/pull/7475

From pli at openjdk.java.net  Mon Feb 21 06:19:26 2022
From: pli at openjdk.java.net (Pengfei Li)
Date: Mon, 21 Feb 2022 06:19:26 GMT
Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v4]
In-Reply-To: <M1ojlBqpMKuAC-oAJdp1JC5vgvzhdkxjnJmTaX0v4sI=.932c9d8d-9663-4e7b-9deb-a88d54cc72bc@github.com>
References: <M1ojlBqpMKuAC-oAJdp1JC5vgvzhdkxjnJmTaX0v4sI=.932c9d8d-9663-4e7b-9deb-a88d54cc72bc@github.com>
Message-ID: <DPPICoKR2tIZ8nbOaBDV2vFi7dp-lCjSnPbOc4ExeJc=.861804f6-b8fa-4d0d-9504-238ea0a89ccc@github.com>

> ### Background
> 
> Post loop vectorization is a C2 compiler optimization in an experimental
> VM feature called PostLoopMultiversioning. It transforms the range-check
> eliminated post loop to a 1-iteration vectorized loop with vector mask.
> This optimization was contributed by Intel in 2016 to support x86 AVX512
> masked vector instructions. However, it was disabled soon after an issue
> was found. Due to insufficient maintenance since then, multiple bugs
> have accumulated in it. But we (Arm) still think this is a useful
> framework for vector mask support in C2 auto-vectorized loops, for both
> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable
> post loop vectorization.
> 
> ### Changes in this patch
> 
> This patch reworks post loop vectorization. The most significant change
> is removing vector mask support in C2 x86 backend and re-implementing
> it in the mid-end. With this, we can re-enable post loop vectorization
> for platforms other than x86.
> 
> The previous implementation hard-codes the x86 k1 register as a reserved
> AVX512 opmask register and defines two routines
> (setvectmask/restorevectmask) to set and restore the value of k1. But
> after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251), which encodes
> AVX512 instructions as unmasked by default, the generated vector masks
> are no longer used in AVX512 vector instructions. To fix the incorrect
> codegen and add vector mask support for more platforms, we instead add a
> vector mask input to C2 mid-end IRs. Specifically, we use a
> VectorMaskGenNode to generate a mask and replace all Load/Store nodes in
> the post loop with LoadVectorMasked/StoreVectorMasked nodes that take
> that mask as input. This IR form is exactly the same as the one used for
> VectorAPI mask support. For now, we only add mask inputs for Load/Store
> nodes because we don't support reduction operations in post loop
> vectorization.
> After this change, the x86 k1 register is no longer reserved and can be
> allocated when PostLoopMultiversioning is enabled.
> 
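To make the masked post loop described above concrete, here is a minimal plain-Java simulation. It is not C2 code. It assumes a hypothetical vector length of 8 int lanes, and the class and the helpers `maskGen`, `loadMasked`, `storeMasked` and `step` are illustrative stand-ins for what VectorMaskGenNode, LoadVectorMasked and StoreVectorMasked express in the IR. The main loop here is a plain full-width loop, which simplifies what unrolling plus superword actually produce.

```java
import java.util.Arrays;

class MaskedPostLoopSketch {
    static final int LANES = 8; // hypothetical vector length, in int elements

    // Stand-in for VectorMaskGenNode: activate the first `active` lanes.
    static boolean[] maskGen(int active) {
        boolean[] mask = new boolean[LANES];
        Arrays.fill(mask, 0, Math.max(0, Math.min(active, LANES)), true);
        return mask;
    }

    // Stand-in for LoadVectorMasked: inactive lanes are never read and stay 0.
    static int[] loadMasked(int[] src, int base, boolean[] mask) {
        int[] vec = new int[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            if (mask[lane]) {
                vec[lane] = src[base + lane];
            }
        }
        return vec;
    }

    // Stand-in for StoreVectorMasked: inactive lanes are never written.
    static void storeMasked(int[] dst, int base, int[] vec, boolean[] mask) {
        for (int lane = 0; lane < LANES; lane++) {
            if (mask[lane]) {
                dst[base + lane] = vec[lane];
            }
        }
    }

    // One vector-wide step of res[i] = a[i] * b[i] under the given mask.
    static void step(int[] a, int[] b, int[] res, int base, boolean[] mask) {
        int[] va = loadMasked(a, base, mask);
        int[] vb = loadMasked(b, base, mask);
        int[] vr = new int[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            vr[lane] = va[lane] * vb[lane];
        }
        storeMasked(res, base, vr, mask);
    }

    // Assumes b is at least as long as a.
    static int[] multiply(int[] a, int[] b) {
        int n = a.length;
        int[] res = new int[n];
        boolean[] allLanes = maskGen(LANES);
        int i = 0;
        for (; i + LANES <= n; i += LANES) { // main loop: full vectors
            step(a, b, res, i, allLanes);
        }
        if (i < n) {                         // post loop: one masked iteration
            step(a, b, res, i, maskGen(n - i));
        }
        return res;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
        System.out.println(Arrays.toString(multiply(a, a)));
    }
}
```

With 11 elements and 8 lanes, the main loop runs once and the post loop handles the remaining 3 elements in a single step with only the first 3 lanes active.
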
> Besides this change, we have fixed a compiler crash and five incorrect
> result issues with post loop vectorization.
> 
> **I) C2 crashes with segmentation fault in strip-mined loops**
> 
> The previous implementation predates the merge of C2 loop strip-mining
> into JDK master, so it didn't take strip-mined loops into consideration.
> In C2's strip-mined loops, the post loop is not the sibling of the main
> loop in the ideal loop tree. Instead, it's the sibling of the main loop's
> parent. This patch fixes a SIGSEGV caused by a NULL pointer when locating
> the post loop from a strip-mined main loop.
> 
> **II) Incorrect result issues with post loop vectorization**
> 
> We have also fixed five incorrect vectorization issues. Some of them are
> hidden deep and can only be reproduced with corner cases. These issues
> share a common cause: the code assumes that the post loop can be
> vectorized whenever vectorization of the corresponding main loop
> succeeds. But in many cases this assumption is wrong. Details are below.
> 
> - **[Issue-1] Incorrect vectorization for partially vectorizable loops**
> 
> This issue can be reproduced by the loop below, where only some
> operations in the loop body are vectorizable.
> 
>   for (int i = 0; i < 10000; i++) {
>     res[i] = a[i] * b[i];
>     k = 3 * k + 1;
>   }
> 
> In the main loop, superword works well even if some operations in the
> loop body are not vectorizable, since those operations can simply be
> unrolled without being vectorized. But for post loops, we don't create
> vectors by combining the scalar IRs generated from loop unrolling.
> Instead, we replace scalars with vectors for all operations in the loop
> body. Hence, all operations must be either vectorized together or not
> vectorized at all. To handle such cases, we add an extra field
> "_slp_vector_pack_count" to CountedLoopNode to record the final count of
> vector packs in the main loop. This value is then passed to the post
> loop and compared with the post loop pack count. Post loop vectorization
> bails out if it creates more vector packs than the main loop.
> 
> - **[Issue-2] Incorrect result in loops with growing-down vectors**
> 
> This issue appears with growing-down vectors, that is, vectors that grow
> toward lower memory addresses as the loop iterates. It can be reproduced
> by the counting-up loop below, which has a negative scale value in the
> array index.
> 
>   for (int i = 0; i < 10000; i++) {
>     a[MAX - i] = b[MAX - i];
>   }
> 
> The cause of this issue is that, for a growing-down vector, the generated
> vector mask value has reversed vector-lane order, so it masks the wrong
> vector lanes. Note that if a negative scale value appears in a
> counting-down loop, the vector grows up. With this rule, we fix the
> issue by only allowing positive array index scales in counting-up loops
> and negative array index scales in counting-down loops. This check is
> done with the help of SWPointer by comparing the scale value of each
> memory access in the loop with the loop stride value.
> 
> - **[Issue-3] Incorrect result in manually unrolled loops**
> 
> This issue can be reproduced by the manually unrolled loop below.
> 
>   for (int i = 0; i < 10000; i += 2) {
>     c[i] = a[i] + b[i];
>     c[i + 1] = a[i + 1] * b[i + 1];
>   }
> 
> In this loop, operations in the 2nd statement duplicate those in the 1st
> statement with a small memory address offset. Vectorization in the main
> loop works well in this case because C2 does further unrolling and pack
> combination. But we cannot vectorize the post loop by replacing scalars
> with vectors because that creates duplicated vector operations. To fix
> this, we restrict post loop vectorization to loops with stride values of
> 1 or -1.
> 
> - **[Issue-4] Incorrect result in loops with mixed vector element sizes**
> 
> This issue was found after we enabled post loop vectorization for
> AArch64. It's reproducible by multiple array operations with different
> element sizes inside a loop. On x86, there is no issue because the values
> of x86 AVX512 opmasks only depend on which vector lanes are active. But
> AArch64 is different: the values of SVE predicates also depend on the
> lane size of the vector. Hence, on AArch64 SVE, a loop with mixed vector
> element sizes would need different vector masks. For now, we only
> support loops with a single vector element size, i.e., "int + float"
> vectors in a single loop are OK but "int + double" vectors in a single
> loop are not vectorizable. This fix also enables subword vector support
> to make all primitive type array operations vectorizable.
> 
> - **[Issue-5] Incorrect result in loops with potential data dependence**
> 
> This issue can be reproduced by the corner case below, on AArch64 only.
> 
>   for (int i = 0; i < 10000; i++) {
>     a[i] = x;
>     a[i + OFFSET] = y;
>   }
> 
> In this case, the two stores in the loop have a data dependence if the
> OFFSET value is smaller than the vector length. So we cannot vectorize
> by replacing scalars with vectors. But the main loop vectorization in
> this case succeeds on AArch64 because AArch64 has partial vector
> load/store support. It splits a vector fill with different values in its
> lanes into several smaller-sized fills. In this patch, we add an
> additional data dependence check for such cases. The check is also done
> with the help of the SWPointer class. In this check, we require that
> every two memory accesses (with at least one store) of the same element
> type (or subword size) in the loop have the same array index expression.
> 
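The data dependence in Issue-5 can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: the class and method names are made up, the vector length is taken to be 4 elements, `n` must be a multiple of that length, and each vector store is modelled with `Arrays.fill`. It compares the scalar loop with a naive scalar-to-vector replacement in which each statement becomes one vector-wide store per step.

```java
import java.util.Arrays;

class StoreDependenceSketch {
    static final int LANES = 4; // hypothetical vector length, in elements

    // The scalar loop from the example above.
    static int[] scalar(int n, int offset, int x, int y) {
        int[] a = new int[n + offset];
        for (int i = 0; i < n; i++) {
            a[i] = x;
            a[i + offset] = y;
        }
        return a;
    }

    // Naive replacement: each statement becomes one vector store per step.
    // Assumes n is a multiple of LANES.
    static int[] naiveVector(int n, int offset, int x, int y) {
        int[] a = new int[n + offset];
        for (int base = 0; base < n; base += LANES) {
            Arrays.fill(a, base, base + LANES, x);                   // a[i] = x
            Arrays.fill(a, base + offset, base + offset + LANES, y); // a[i + OFFSET] = y
        }
        return a;
    }

    public static void main(String[] args) {
        // OFFSET smaller than the vector length: the results differ.
        System.out.println(Arrays.toString(scalar(8, 2, 1, 2)));
        System.out.println(Arrays.toString(naiveVector(8, 2, 1, 2)));
        // OFFSET equal to the vector length: the results agree.
        System.out.println(Arrays.equals(scalar(8, 4, 1, 2), naiveVector(8, 4, 1, 2)));
    }
}
```

When OFFSET is smaller than the vector length, the second vector store overwrites elements that the scalar loop would have set to `x` in a later iteration, which is why the patch requires matching index expressions before replacing scalars with vectors.
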
> ### Tests
> 
> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with
> the experimental VM option "PostLoopMultiversioning" turned on. We found
> no issues in any test. However, the existing test cases are not enough,
> because they do not catch some of the issues above. We would like to add
> some new cases, but we found the existing vectorization tests a bit
> cumbersome: golden results must be pre-calculated and hard-coded in the
> test code for correctness verification. Thus, in this patch, we propose
> a new vectorization testing framework.
> 
> Our new framework brings a simpler way to add new cases. For a new test
> case, we only need to create a new method annotated with "@Test". The
> test runner invokes each annotated method twice automatically. The first
> time it runs in the interpreter, and the second time it is forced to be
> compiled by C2. Then the two results are compared. So in this framework
> each test method should return a primitive value or an array of
> primitives. In this way, no extra verification code for vectorization
> correctness is required. This test runner is still jtreg-based and takes
> advantage of the jtreg WhiteBox API, which enables test methods to run at
> specific compilation levels. Each test class inside is also jtreg-based.
> It just needs to inherit from the test runner class and run with two
> additional options, "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI".
> 
> ### Summary & Future work
> 
> In this patch, we reworked post loop vectorization. We made it platform
> independent and fixed several issues inside. We also implemented a new
> vectorization testing framework with many test cases inside. Meanwhile,
> we did some code cleanups.
> 
> This patch only touches C2 code guarded with PostLoopMultiversioning,
> except a few data structure changes. So, there's no behavior change when
> experimental VM option PostLoopMultiversioning is off. Also, to reduce
> risks, we still propose to keep post loop vectorization experimental for
> now. But if it receives positive feedback, we would like to change it to
> non-experimental in the future.
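
The run-twice-and-compare idea from the Tests section can be sketched in plain Java as follows. This is not the framework from the patch: the nested `@Test` annotation, the class name, and the helpers `sameResult` and `runAll` are hypothetical, and the sketch simply invokes each method twice in the same mode. The real runner is described as using the WhiteBox API to run the first invocation in the interpreter and the second after forced C2 compilation, which this sketch does not attempt.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;

class TwoRunComparisonSketch {
    // Hypothetical marker annotation; the real framework defines its own.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Test {}

    // A test method only returns its result; it contains no verification code.
    @Test
    public int[] multiplyArrays() {
        int[] a = {1, 2, 3, 4};
        int[] b = {5, 6, 7, 8};
        int[] res = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            res[i] = a[i] * b[i];
        }
        return res;
    }

    // Compares boxed primitives and primitive arrays by value.
    static boolean sameResult(Object first, Object second) {
        return Arrays.deepEquals(new Object[] {first}, new Object[] {second});
    }

    // Invokes every @Test method twice and compares the two results.
    static int runAll(Object tests) {
        int checked = 0;
        for (Method m : tests.getClass().getDeclaredMethods()) {
            if (!m.isAnnotationPresent(Test.class)) {
                continue;
            }
            try {
                m.setAccessible(true);
                Object first = m.invoke(tests);  // real runner: interpreted run
                Object second = m.invoke(tests); // real runner: C2-compiled run
                if (!sameResult(first, second)) {
                    throw new AssertionError("Result mismatch in " + m.getName());
                }
                checked++;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println("methods checked: " + runAll(new TwoRunComparisonSketch()));
    }
}
```

Because correctness is established by comparing the two runs, adding a case means adding one annotated method, with no golden values to maintain.
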

Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:

 - Merge branch 'master' into postloop
   
   Change-Id: I503edb75f0f626569c776416bfef09651935979c
 - Update copyright year and rename a function
   
   Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb
 - Merge branch 'master' into postloop
   
   Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4
 - Fix issues in newly added test framework
   
   Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1
 - Merge branch 'master' into postloop
   
   Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6
 - 8183390: Fix and re-enable post loop vectorization
   

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6828/files
  - new: https://git.openjdk.java.net/jdk/pull/6828/files/56575886..ea0598ad

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=02-03

  Stats: 57104 lines in 1757 files changed: 38847 ins; 10161 del; 8096 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6828.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6828/head:pull/6828

PR: https://git.openjdk.java.net/jdk/pull/6828

From duke at openjdk.java.net  Mon Feb 21 07:08:50 2022
From: duke at openjdk.java.net (Emanuel Peter)
Date: Mon, 21 Feb 2022 07:08:50 GMT
Subject: Integrated: 8281544: assert(VM_Version::supports_avx512bw()) failed
 for Tests jdk/incubator/vector/
In-Reply-To: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
References: <NuMg_3wZ7GTF1QIuTBDb0t0NJ1YlzWn795AtnDEdNbs=.20bb8fe7-b5ec-4246-a236-5653e0d2d546@github.com>
Message-ID: <8CY3PKUGwuGTB6D-mzhR6218exbBBgA_4xR8n0joyx4=.461f0fd9-a069-4616-9a7e-25d716831d1b@github.com>

On Thu, 17 Feb 2022 07:51:42 GMT, Emanuel Peter <duke at openjdk.java.net> wrote:

> `ZSaveLiveRegisters::ZSaveLiveRegisters` stores live registers, and later they are loaded again.
> This includes opmask registers, which are part of AVX512. However, not all platforms have all of the AVX512 instructions.
> For example, Knights Landing has general AVX512 support and makes use of opmask registers, but does not support the AVX512 BW subset of instructions; specifically, it does not support the `kmovql` instruction. Platforms like Cannon Lake have support for AVX512 BW.
> 
> Solution: in analogy to `RegisterSaver::save_live_registers`, which seems to perform a very similar task, use `MacroAssembler::kmov` instead of `kmovql` directly. Internally, `kmov` chooses `kmovql` if avx512bw is available and `kmovwl` otherwise.
> 
> As a regression test, I took one of the tests that failed with `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`, and added an additional `@run` statement with those flags. I simulated this test locally with Intel Software Development Emulator:
> `sde -knl`: Knights Landing, AVX512 but not BW, fails without change to `kmov`, passes with it.
> `sde -cnl`: Cannon Lake, has AVX512 BW, passes before and after code change.
> 
> Ran additional tests to verify that the test triggers before code change, and that with the code change nothing broke.
> 
> @neliasso Thanks for the help!

This pull request has now been integrated.

Changeset: 4e0b81c5
Author:    Emanuel Peter <emanuel.peter at oracle.com>
Committer: Tobias Hartmann <thartmann at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/4e0b81c596f2a2eae49127b9ee98c80500b4e319
Stats:     14 lines in 2 files changed: 12 ins; 0 del; 2 mod

8281544: assert(VM_Version::supports_avx512bw()) failed for Tests jdk/incubator/vector/

Reviewed-by: kvn, neliasso, thartmann

-------------

PR: https://git.openjdk.java.net/jdk/pull/7510

From thartmann at openjdk.java.net  Mon Feb 21 09:20:53 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Mon, 21 Feb 2022 09:20:53 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom
In-Reply-To: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
Message-ID: <vr1-Ii9mo71Oxtn--tBLFEO8FCymzWwoO3ArfHNSrJ8=.459f1080-c246-4620-b956-e4bfbb01bda9@github.com>

On Thu, 17 Feb 2022 13:34:39 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
> Tested with tier1-4.

Looks good to me. Thanks for fixing this.

src/hotspot/share/code/codeCache.cpp line 897:

> 895: // Since code heap for each type of code blobs falls forward to the next
> 896: // type of code heap, return the reverse free ratio for the entire
> 897: // code heap.

Suggestion:

// Returns the reverse free ratio. E.g., if 25% (1/4) of the code cache
// is free, reverse_free_ratio() returns 4.
// Since code heap for each type of code blobs falls forward to the next
// type of code heap, return the reverse free ratio for the entire
// code cache.
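
As a worked illustration of the comment above, here is a minimal sketch of a reverse free ratio computed over the whole code cache rather than a single code heap. It is not HotSpot's implementation: the class name, the array-based model of code heaps, the byte counts, and the clamp that avoids division by zero are all assumptions made for the example.

```java
class ReverseFreeRatioSketch {
    // capacity[i] and free[i] describe one code heap, in bytes (made-up model).
    static double reverseFreeRatio(long[] capacity, long[] free) {
        long totalCapacity = 0;
        long totalFree = 0;
        for (int i = 0; i < capacity.length; i++) {
            totalCapacity += capacity[i];
            totalFree += free[i];
        }
        // Clamp to 1 to avoid dividing by zero; an assumption of this sketch.
        return (double) totalCapacity / Math.max(totalFree, 1L);
    }

    public static void main(String[] args) {
        long[] capacity = {100, 200, 100};
        long[] free = {0, 50, 50};
        // Whole cache: 400 bytes of capacity, 100 bytes free (25%).
        System.out.println(reverseFreeRatio(capacity, free));
        // First heap alone: full, so the per-heap ratio is far larger.
        System.out.println(reverseFreeRatio(new long[] {100}, new long[] {0}));
    }
}
```

In this model the first heap is full, so a per-heap ratio would signal heavy pressure even though a quarter of the whole cache is still free and allocations can fall forward to the next heap.
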

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7514

From duke at openjdk.java.net  Mon Feb 21 12:25:46 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 21 Feb 2022 12:25:46 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v22]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <i7E_Cqk8kOTOTUCUa3xVrU-uExS432J-F726-ejQenI=.971577c8-bf85-4cad-be11-6ae24e1a0b38@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference beyond noise, which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs
> and entry points are stored in memory as signed pointers. These changes
> are simple to make (they can be reduced to a type change in common code
> and a few additional sign/auth calls in the backend), but there are a
> lot of them and the total code change is fairly large. I'm happy to
> provide a few work-in-progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, all the changes mentioned above would be required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/2062cce7..7f80f289

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=21
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=20-21

  Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Mon Feb 21 14:51:11 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 21 Feb 2022 14:51:11 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace after
 JDK-8280422
Message-ID: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>

Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

-------------

Commit messages:
 - Fix AsyncGetCallTrace bug

Changes: https://git.openjdk.java.net/jdk/pull/7559/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282200
  Stats: 25 lines in 2 files changed: 19 ins; 5 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7559/head:pull/7559

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Mon Feb 21 15:02:53 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 21 Feb 2022 15:02:53 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <sCr-MwRD8EqtXO58bb02tvof1DwgZcdc4mfYZCURh-M=.20aa8ada-2fba-46d8-a8d9-e9d80f728041@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Related to https://github.com/openjdk/jdk/pull/7193

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From coleenp at openjdk.java.net  Mon Feb 21 15:11:30 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 21 Feb 2022 15:11:30 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom [v2]
In-Reply-To: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
Message-ID: <vLmt4y1zLr2bn-qJ_sp_Z5K10gyiFttAnQxbbx1Jpf8=.27b89f38-3379-42f2-8d05-691b7d53f7f1@github.com>

> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
> Tested with tier1-4.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Fixed comment

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7514/files
  - new: https://git.openjdk.java.net/jdk/pull/7514/files/7a69dc43..97b7a59c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7514&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7514&range=00-01

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7514.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7514/head:pull/7514

PR: https://git.openjdk.java.net/jdk/pull/7514

From coleenp at openjdk.java.net  Mon Feb 21 15:11:31 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 21 Feb 2022 15:11:31 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom
In-Reply-To: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
Message-ID: <W3aja1YEwR-KGO-oblEfLZ60rzJbRQQ5KmHN7r6llko=.a454bc0c-3711-4b07-ac31-3511d7d661f4@github.com>

On Thu, 17 Feb 2022 13:34:39 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
> Tested with tier1-4.

Thanks, Tobias.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7514

From coleenp at openjdk.java.net  Mon Feb 21 15:11:33 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 21 Feb 2022 15:11:33 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom [v2]
In-Reply-To: <vr1-Ii9mo71Oxtn--tBLFEO8FCymzWwoO3ArfHNSrJ8=.459f1080-c246-4620-b956-e4bfbb01bda9@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
 <vr1-Ii9mo71Oxtn--tBLFEO8FCymzWwoO3ArfHNSrJ8=.459f1080-c246-4620-b956-e4bfbb01bda9@github.com>
Message-ID: <avW4MdbIYBRj1pQuYZWUWQsaWKXqYSNRlXevZhxRxQQ=.525896a6-688c-48a4-b558-4340b52abfcb@github.com>

On Mon, 21 Feb 2022 09:15:05 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fixed comment
>
> src/hotspot/share/code/codeCache.cpp line 897:
> 
>> 895: // Since code heap for each type of code blobs falls forward to the next
>> 896: // type of code heap, return the reverse free ratio for the entire
>> 897: // code heap.
> 
> Suggestion:
> 
> // Returns the reverse free ratio. E.g., if 25% (1/4) of the code cache
> // is free, reverse_free_ratio() returns 4.
> // Since code heap for each type of code blobs falls forward to the next
> // type of code heap, return the reverse free ratio for the entire
> // code cache.

Fixed.
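
As a rough illustration of the comment suggested above, here is a minimal sketch of a reverse free ratio computed over the whole code cache rather than one segment. The `Segment` struct and the free function `reverse_free_ratio` are hypothetical stand-ins, not the HotSpot implementation, and the clamp to one byte is an assumption made to avoid division by zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one code heap segment (not a HotSpot type).
struct Segment {
  std::size_t max_capacity;          // bytes reserved for this segment
  std::size_t unallocated_capacity;  // bytes still free in this segment
};

// Reverse free ratio over the whole cache: total capacity / total free.
// Example: if 25% (1/4) of the cache is free, the result is 4.0.
double reverse_free_ratio(const std::vector<Segment>& segments) {
  double max = 0.0;
  double unallocated = 0.0;
  for (const Segment& s : segments) {
    max += static_cast<double>(s.max_capacity);
    unallocated += static_cast<double>(s.unallocated_capacity);
  }
  // Assumption: clamp free space to 1 byte to avoid dividing by zero.
  unallocated = std::max(unallocated, 1.0);
  return max / unallocated;
}
```

Summing over all segments means one full segment no longer drives the ratio up while the other segments still have room.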

-------------

PR: https://git.openjdk.java.net/jdk/pull/7514

From eastig at amazon.co.uk  Mon Feb 21 16:49:32 2022
From: eastig at amazon.co.uk (Astigeevich, Evgeny)
Date: Mon, 21 Feb 2022 16:49:32 +0000
Subject: RFC: AArch64: Set Segmented CodeCache default size to 127M
Message-ID: <614FA734-BE19-4F53-B7D4-AFC78A9F1DEE@amazon.com>

Hi Andrew,

Sorry for the late reply. It was half term time.

Thank you for your feedback.

> I have seen bug reports from customers mystified
> at poor OpenJDK performance which have turned out
> to be code cache thrashing.

I think we have a case of code cache thrashing. An application consumes
~90% of the code cache whatever size the code cache is given. If a big code
cache is given, the set of hot methods becomes sparse. The size of the set
is less than 32M. We have a few ideas to solve the thrashing.

> I'd like to see more information. What was the *average performance
> gain* of all your benchmarks?

Full dacapo results ('-' means benchmark's time decreased, '+' means increased):
+------------+-------------+------------------+-----------------+
|   Bench    | New vs Base | COV base results | COV new results |
+------------+-------------+------------------+-----------------+
| tradebeans | -9.10%      | 11.99%           | 3.41%           |
| eclipse    | -3.57%      | 1.04%            | 0.91%           |
| tradesoap  | -3.03%      | 0.86%            | 0.46%           |
| tomcat     | -1.45%      | 0.99%            | 0.86%           |
| pmd        | -1.05%      | 0.62%            | 0.87%           |
| lusearch   | -0.81%      | 0.29%            | 0.39%           |
| zxing      | -0.46%      | 1.28%            | 0.82%           |
| biojava    | -0.04%      | 0.18%            | 0.19%           |
| jme        | 0.01%       | 0.01%            | 0.01%           |
| batik      | 0.08%       | 0.40%            | 0.45%           |
| luindex    | 0.42%       | 0.56%            | 0.70%           |
| fop        | 0.58%       | 1.18%            | 1.09%           |
| avrora     | 0.72%       | 2.05%            | 1.45%           |
| xalan      | 0.82%       | 2.82%            | 3.63%           |
| sunflow    | 4.57%       | 10.86%           | 10.84%          |
+------------+-------------+------------------+-----------------+
Each benchmark was run 10 times, 10 iterations per run. The result of the 10th iteration was used.
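
The COV columns presumably report the coefficient of variation of the per-run results. A minimal sketch of that statistic follows, assuming the sample standard deviation (n - 1 denominator); the post does not say which estimator was used, and the function name is illustrative.

```cpp
#include <cmath>
#include <vector>

// Coefficient of variation in percent: 100 * stddev / mean.
// Assumption: sample standard deviation (n - 1 denominator).
// Requires at least two samples and a non-zero mean.
double cov_percent(const std::vector<double>& times) {
  const double n = static_cast<double>(times.size());
  double sum = 0.0;
  for (double t : times) sum += t;
  const double mean = sum / n;
  double sq = 0.0;
  for (double t : times) sq += (t - mean) * (t - mean);
  const double stddev = std::sqrt(sq / (n - 1.0));
  return 100.0 * stddev / mean;
}
```

A delta smaller than the COV of either configuration is hard to distinguish from run-to-run noise.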


Renaissance results ('-' means benchmark's time decreased, '+' means increased):
+------------------+--------------+-------------------+-----------------+
|      Bench       | New vs Base  | COV base results  | COV new results |
+------------------+--------------+-------------------+-----------------+
| scrabble         | -13.47%      | 7.01%             | 7.43%           |
| dotty            | -9.03%       | 1.77%             | 1.82%           |
| naive-bayes      | -4.14%       | 9.72%             | 8.94%           |
| finagle-http     | -3.93%       | 0.95%             | 0.83%           |
| finagle-chirper  | -2.75%       | 2.45%             | 3.09%           |
| movie-lens       | -1.79%       | 1.39%             | 1.12%           |
| scala-doku       | -1.72%       | 27.20%            | 29.54%          |
| als              | -1.64%       | 0.64%             | 1.24%           |
| par-mnemonics    | -1.09%       | 11.69%            | 11.39%          |
| rx-scrabble      | -0.98%       | 1.36%             | 0.36%           |
| future-genetic   | -0.95%       | 1.14%             | 2.06%           |
| log-regression   | -0.86%       | 0.99%             | 1.62%           |
| dec-tree         | -0.74%       | 1.52%             | 1.69%           |
| chi-square       | -0.51%       | 1.20%             | 0.85%           |
| mnemonics        | -0.05%       | 0.74%             | 0.75%           |
| fj-kmeans        | 0.01%        | 0.95%             | 0.90%           |
| page-rank        | 0.06%        | 1.02%             | 0.80%           |
| scala-stm-bench7 | 0.16%        | 6.90%             | 7.43%           |
| reactors         | 0.97%        | 28.07%            | 12.42%          |
| scala-kmeans     | 1.22%        | 0.88%             | 0.39%           |
| gauss-mix        | 1.70%        | 1.83%             | 1.42%           |
| akka-uct         | 4.30%        | 5.20%             | 9.94%           |
| philosophers     | 12.64%       | 18.43%            | 17.64%          |
+------------------+--------------+-------------------+-----------------+
Each benchmark was run 10 times, 180 seconds per run. The second half of run's results was used.

I created https://bugs.openjdk.java.net/browse/JDK-8280872 "AArch64: Position non-nmethod segment in between profiled and non-profiled segments for 128M+ CodeCache".
It should reduce the number of trampolines.
There are also:
https://bugs.openjdk.java.net/browse/JDK-8280152 "AArch64: Duplicated trampolines in C2 NMethod Stub Code section"
https://bugs.openjdk.java.net/browse/JDK-8280481 "Duplicated static stubs in NMethod Stub Code section"

Implementing them will improve code cache usage, but they won't fix the code cache thrashing.

Thanks,
Evgeny


On 11/02/2022, 16:30, "hotspot-dev on behalf of Andrew Haley" <hotspot-dev-retn at openjdk.java.net on behalf of aph-open at littlepinkcloud.com> wrote:

    On 2/10/22 23:02, Astigeevich, Evgeny wrote:
    > We'd like to discuss a proposal for setting TieredCompilation Segmented CodeCache default size to 127M on AArch64 (https://bugs.openjdk.java.net/browse/JDK-8280150).

    I don't think so, at least not without a lot more information.

    This would halve the size of the code cache, potentially causing
    severe regressions in production. I have seen bug reports from
    customers mystified at poor OpenJDK performance which have turned out
    to be code cache thrashing. This is very hard to diagnose without
    making some inspired guesses at what the root cause may be. We'd be
    moving the threshold for cache exhaustion much closer to our default
    configuration.

    So, this is a trade off between a small expected gain and a much
    larger (but hopefully rare) loss.

    I'd like to see more information. What was the *average performance
    gain* of all your benchmarks? I don't think anyone is interested in
    cherry-picked best cases.

    A quick back-of-the-envelope calculation tells me that about 3.5% of
    the code cache is occupied by trampolines and the extra bytes used by
    far calls. However, many of the far calls are never needed; I don't
    have stats for that, but I'd guess about half of them. But given the
    (plausible ?)  assumption that the dynamic frequency of calls is the
    same as the static frequency, I wouldn't be surprised if the cost of
    trampoline calls is about 2% of the total instruction count, so it'd
    be nice to be rid of them if there were no cost; but there is a cost.

    --
    Andrew Haley  (he/him)
    Java Platform Lead Engineer
    Red Hat UK Ltd. <https://www.redhat.com>
    https://keybase.io/andrewhaley
    EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Amazon Development Centre (London) Ltd. Registered in England and Wales with registration number 04543232 with its registered office at 1 Principal Place, Worship Street, London EC2A 2FA, United Kingdom.


From duke at openjdk.java.net  Mon Feb 21 17:20:57 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 21 Feb 2022 17:20:57 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v23]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <SufqvjvqcDFFVxVGTHmlj9g0iuTfiJWewbNak_RlHf0=.e10b2ed4-4d67-4dc3-aa04-ca2b55dbdf25@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Merge master

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/7f80f289..f9882ff1

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=22
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=21-22

  Stats: 39689 lines in 1308 files changed: 27145 ins; 6944 del; 5600 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From aph-open at littlepinkcloud.com  Mon Feb 21 17:38:46 2022
From: aph-open at littlepinkcloud.com (Andrew Haley)
Date: Mon, 21 Feb 2022 17:38:46 +0000
Subject: RFC: AArch64: Set Segmented CodeCache default size to 127M
In-Reply-To: <614FA734-BE19-4F53-B7D4-AFC78A9F1DEE@amazon.com>
References: <614FA734-BE19-4F53-B7D4-AFC78A9F1DEE@amazon.com>
Message-ID: <663add00-20b5-814f-ed12-91f1079dc98a@littlepinkcloud.com>

On 2/21/22 16:49, Astigeevich, Evgeny wrote:
> Hi Andrew,
> 
> Sorry for the late reply. It was half term time.
> 
> Thank you for your feedback.
> 
>> I have seen bug reports from customers mystified
>> at poor OpenJDK performance which have turned out
>> to be code cache thrashing.
> 
> I think we have a case of code cache thrashing. An application consumes
> ~90% of the code cache whatever size the code cache is given. If a big code
> cache is given, the set of hot methods becomes sparse. The size of the set
> is less than 32M. We have a few ideas to solve the thrashing.

Please forgive me, but this paragraph makes no sense to me. I have
seen actual thrashing, where hot methods were being evicted and repeatedly
recompiled. This thrashing was fixed by increasing the size of the code
cache. I take your point about fragmentation, but it happens.

>> I'd like to see more information. What was the *average performance
>> gain* of all your benchmarks?
> 
> Full dacapo results ('-' means benchmark's time decreased, '+' means increased):

I worked it out myself. 0.8% gain, on a bunch of smallish benchmarks.
Unknown loss on large programs.

The results that went the other way were curious. That suggests to me
that there may be some other factors in play. I wonder what they are.

> I created https://bugs.openjdk.java.net/browse/JDK-8280872 "AArch64: Position non-nmethod segment in between profiled and non-profiled segments for 128M+ CodeCache".

> There are also:
> https://bugs.openjdk.java.net/browse/JDK-8280152 "AArch64: Duplicated trampolines in C2 NMethod Stub Code section"
> https://bugs.openjdk.java.net/browse/JDK-8280481 "Duplicated static stubs in NMethod Stub Code section"
Those look pretty uncontroversial: they won't help anything much if
at all, but at least we know they won't regress anything.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From dholmes at openjdk.java.net  Mon Feb 21 20:40:56 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 21 Feb 2022 20:40:56 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <i6C6tquaC2gYoP3Rl4xRiCfgCWpGwRg1YT7J-PTXVUc=.ea509a67-59ef-4a2c-ad56-e100ce527258@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

I'm marking this as changes requested because I need to investigate further. A `ShouldNotReachHere()` should never be reached; if it can be reached, then the circumstances need to be investigated to see where the true problem lies.

Thanks,
David

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Mon Feb 21 21:17:46 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 21 Feb 2022 21:17:46 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <i5DlfpPQmHJZYd6h0ENO-bFY60CMV4099c7F21WwblA=.8b7aa5b5-859a-4c20-be45-0fc66127e705@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

I'm willing to help...

The described error is not dependent on the JVM being a debug build, I can also reproduce it with a release build by decreasing the sampling interval.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From dholmes at openjdk.java.net  Tue Feb 22 01:00:45 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Feb 2022 01:00:45 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <n6Lzy3XzdeYYrlOql1Dut2HyE-lEDuF1nG5F318yOhY=.5bf91b54-821c-49c6-b924-f9c315d76703@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Please see updates to JBS issue and the draft PR here:
https://github.com/openjdk/jdk/pull/7566

You can either take my changes, or hand over to me and I will use my PR.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Tue Feb 22 05:53:31 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Tue, 22 Feb 2022 05:53:31 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v4]
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <jQUZoxEkixgJcta_LlTPV1C02t8mZL5qVwBoVxPjB3g=.b3928584-31c6-4291-99d5-4171495b2d80@github.com>

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort().
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to dump core
> because there is no space on device.
> Could you please review the fix?

KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:

  8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7227/files
  - new: https://git.openjdk.java.net/jdk/pull/7227/files/561cce33..a6958ad6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=02-03

  Stats: 20 lines in 3 files changed: 4 ins; 9 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7227.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7227/head:pull/7227

PR: https://git.openjdk.java.net/jdk/pull/7227

From duke at openjdk.java.net  Tue Feb 22 05:53:34 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Tue, 22 Feb 2022 05:53:34 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v3]
In-Reply-To: <_Ozu_sZZUPH-7vFaMfyzJFBv5WxQicM9asfJ9dK_jzg=.e9c28828-ca74-49e6-b0e8-33e79ea8a086@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <enpXnRowRytWp1G8YAZ_7PC_o-2jKgFqOCUcm5KReCo=.ef081f84-5468-40ba-8851-9761f03c903e@github.com>
 <_Ozu_sZZUPH-7vFaMfyzJFBv5WxQicM9asfJ9dK_jzg=.e9c28828-ca74-49e6-b0e8-33e79ea8a086@github.com>
Message-ID: <1zhYytRXsDS8Ph4K0yM6Xx9Z8vfLAX6daqg3XL8fOU4=.f904c8e0-e083-4c61-a84d-780b01fc9ab7@github.com>

On Fri, 18 Feb 2022 11:17:40 GMT, Markus Grönlund <mgronlun at openjdk.org> wrote:

>> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.
>
> src/hotspot/share/jfr/writers/jfrStreamWriterHost.inline.hpp line 88:
> 
>> 86:         JavaThread* jt = JavaThread::current();
>> 87:         ThreadInVMfromNative transition(jt);
>> 88:         JfrJavaSupport::abort(JfrJavaSupport::new_string(msg, jt), jt, false);
> 
> Hi again Takuya, I'm sorry, but I should have noticed this earlier:
> 
> I now see that the code needs to allocate a Java string oop to conform to the existing abort function signature, which caters to invocations from Java. Then abort() immediately strips out the c-string from the oop. To be correct, the headers logging/log.hpp and runtime/thread.inline.hpp would also need to be included.
> 
> I believe we can simplify this by updating the abort() signature so that we don't need to drag in those extra dependencies. Please see my following comment where I suggest a way to do this.
> 
> Thanks for your patience
> Markus

Thank you for your valuable comments. I agree with you. I corrected this fix in accordance with your suggestions.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From shade at openjdk.java.net  Tue Feb 22 07:02:11 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 22 Feb 2022 07:02:11 GMT
Subject: RFR: 8282224: Correct TIG::bang_stack_shadow_pages comments
Message-ID: <RW8GVSG-Jczv_PjLq7UkWe3nGue098bPhbIP0Bst27I=.87345077-4ab3-4d51-b707-5b4ec2e078c5@github.com>

When reviewing the RISC-V port of the change, I noticed the comment in the x86 code is worded incorrectly:


  // Record a new watermark, unless the update is above the safe limit.
  __ cmpptr(rsp, Address(thread, JavaThread::shadow_zone_safe_limit()));
  __ jccb(Assembler::belowEqual, L_done);


Stacks grow downwards, so we are recording a new watermark *when* update is above the safe limit.
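
The branch can be restated in plain C++ to make the direction of the comparison explicit. This is only a sketch of the quoted compare-and-branch; the function name is hypothetical, and `sp` and `safe_limit` stand for the value of `rsp` and the thread's shadow zone safe limit.

```cpp
#include <cstdint>

// Mirrors: cmpptr(rsp, safe_limit); jccb(belowEqual, L_done).
// The jump skips the update when sp <= safe_limit (unsigned compare),
// so the watermark is recorded only when sp is above the safe limit.
bool records_new_watermark(std::uintptr_t sp, std::uintptr_t safe_limit) {
  return sp > safe_limit;
}
```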

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7569/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7569&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282224
  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7569.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7569/head:pull/7569

PR: https://git.openjdk.java.net/jdk/pull/7569

From duke at openjdk.java.net  Tue Feb 22 08:43:47 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Tue, 22 Feb 2022 08:43:47 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <rUchDyX8FNeKOnRzThC9J_SJu55mD5RygjN5mPZXzYs=.0738db96-fbb5-4705-9b36-13e712f3b871@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

To be frank, I would like to integrate your changes into mine, because I need a second PR for JDK to be able to write such issues in JBS on my own.

To the PR itself: The main difference between both is that with my PR we say "this should not happen please check before if you really want this" and with your PR we don't. I liked your initial PR that threw an error for the normal case that we cannot call this method for a thread in an inconsistent state. As you stated in the comment in the method of your PR, it is only a special case for AsyncGetCallTrace.

What is the downside of having to explicitly check for this special case when you need it and otherwise throw an error?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From mgronlun at openjdk.java.net  Tue Feb 22 11:31:53 2022
From: mgronlun at openjdk.java.net (Markus Grönlund)
Date: Tue, 22 Feb 2022 11:31:53 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v4]
In-Reply-To: <jQUZoxEkixgJcta_LlTPV1C02t8mZL5qVwBoVxPjB3g=.b3928584-31c6-4291-99d5-4171495b2d80@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <jQUZoxEkixgJcta_LlTPV1C02t8mZL5qVwBoVxPjB3g=.b3928584-31c6-4291-99d5-4171495b2d80@github.com>
Message-ID: <9c_NBxB2A64uPN0YgJVjN8L9WqPPVWk1xZeSt4XH8Lc=.28209567-dba1-4623-86e1-314dd506ada7@github.com>

On Tue, 22 Feb 2022 05:53:31 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort().
>> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to dump core
>> because there is no space on device.
>> Could you please review the fix?
>
> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

Looks good, thank you.

-------------

Marked as reviewed by mgronlun (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7227

From tschatzl at openjdk.java.net  Tue Feb 22 11:42:55 2022
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 22 Feb 2022 11:42:55 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v4]
In-Reply-To: <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
 <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>
Message-ID: <ZTggm8ZVeipDBUrgoc_wPQNBHKTkO9HmIIQ1-6mFZDY=.d076d6b2-c6a8-423c-8f84-83a4c43f9a3f@github.com>

On Tue, 8 Feb 2022 08:17:17 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:

>> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
>> 
>> Stack: [0x00007f6e01739000,0x00007f6e0183a000],  sp=0x00007f6e01838110,  free space=1020k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64
>> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
>> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
>> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df
>> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
>> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d
>> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167
>> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180
>> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f
>> 
>> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
>> 
>> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
>> 
>> Stack: [0x00007f34fca18000,0x00007f34fcb19000],  sp=0x00007f34fcb17110,  free space=1020k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)
>> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec  (c1_Compiler.cpp:250)
>> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899  (compileBroker.cpp:2291)
>> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df  (compileBroker.cpp:1966)
>> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69  (compilerThread.cpp:59)
>> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d  (thread.cpp:1297)
>> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167  (thread.cpp:1280)
>> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180  (thread.cpp:358)
>> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f  (os_linux.cpp:705)
>> 
>> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC versions may soon generate DWARF 5 by default, in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
>> 
>> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
>> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This makes it possible to follow the code without having to dive deep into the spec.
>> 
>> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.
>> 
>> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. 
>> 
>> **Testing:**
>> Apart from manual testing, I've added two kinds of tests:
>> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.
>> 
>> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.
>> 
>> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional  `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.
>> 
>> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
>> 
>> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`!
>>  
>> Thanks,
>> Christian
>
> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Make dwarf tag NOT_PRODUCT

First pass, did not dive into details of the state machine yet.

src/hotspot/share/utilities/elfFile.cpp line 319:

> 317:     }
> 318:     log_develop_info(dwarf)("No separate .debuginfo file for library %s. It already contains the required DWARF sections.", _filepath);
> 319:     _dwarf_file = new (std::nothrow) DwarfFile(_filepath);

Would it be useful to explicitly bail out on a `nullptr` value here to avoid crashes below?

src/hotspot/share/utilities/elfFile.cpp line 357:

> 355:   }
> 356: 
> 357:   strcpy(debug_pathname, _filepath);

I'm always a bit uneasy using "raw" `strcpy` instead of `strncpy` and friends. The code seems to be correct though.

src/hotspot/share/utilities/elfFile.cpp line 358:

> 356: 
> 357:   strcpy(debug_pathname, _filepath);
> 358:   char* last_slash = strrchr(debug_pathname, '/');

It's probably no big issue hardcoding the forward slash here instead of using `os::file_separator()` in this method.

src/hotspot/share/utilities/elfFile.cpp line 407:

> 405: bool ElfFile::load_dwarf_file_from_env_path_folder(const char* env_path, const char* folder, const char* debug_filename, const uint32_t crc) {
> 406:   char* debug_pathname = NEW_RESOURCE_ARRAY(char, strlen(env_path) + strlen(folder) + strlen(debug_filename) + 2);
> 407:   strcpy(debug_pathname, env_path);

Similar to other resource allocations, this should bail out if the result is `nullptr`.

src/hotspot/share/utilities/elfFile.cpp line 566:

> 564: // http://sourceware.org/gdb/current/onlinedocs/gdb/Separate-Debug-Files.html#Separate-Debug-Files.
> 565: uint32_t ElfFile::gnu_debuglink_crc32(uint32_t crc, uint8_t* buf, const size_t len) {
> 566:   crc = ~crc & 0xffffffff;

The masks are unnecessary here but don't hurt. Feel free to keep.
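
To spell out why: with a `uint32_t` accumulator (and a 32-bit `unsigned int`, as on the platforms discussed here), `~crc` is already a 32-bit value, so `& 0xffffffff` changes nothing. The mask only matters when the accumulator is wider than 32 bits. The sketch below is a bitwise variant without masks; `crc32_bitwise` is a hypothetical name, and it assumes the table used in the patch is the standard reflected CRC-32 table (polynomial `0xEDB88320`).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bitwise equivalent of a table-driven CRC-32 (assumption: the
// table is the standard reflected CRC-32 table). No masks are needed because
// all arithmetic happens on a 32-bit unsigned type.
static uint32_t crc32_bitwise(uint32_t crc, const uint8_t* buf, size_t len) {
  crc = ~crc;
  for (size_t i = 0; i < len; i++) {
    crc ^= buf[i];
    for (int bit = 0; bit < 8; bit++) {
      // Shift out the lowest bit; apply the polynomial if that bit was set.
      uint32_t poly = (crc & 1u) ? 0xEDB88320u : 0u;
      crc = (crc >> 1) ^ poly;
    }
  }
  return ~crc;
}
```

The table-driven form trades 1 KB of table for processing one byte per step instead of one bit; for a one-off checksum during error reporting either would do.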

src/hotspot/share/utilities/elfFile.cpp line 576:

> 574:   log_develop_info(dwarf)("Open DWARF file: %s", filepath);
> 575:   _dwarf_file = new (std::nothrow) DwarfFile(filepath);
> 576:   if (!_dwarf_file->is_valid_dwarf_file()) {

This should bail out if the `new` returned a `nullptr`.
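
A minimal sketch of the pattern I have in mind follows. The `DwarfFile` here is only a stand-in with the one method the sketch needs, and `open_dwarf_file` is a hypothetical helper, so treat the names as assumptions rather than the patch's actual code.

```cpp
#include <new>

// Stand-in for the patch's DwarfFile; reduced to what the sketch needs.
class DwarfFile {
 public:
  explicit DwarfFile(const char* filepath) : _valid(filepath != nullptr) {}
  bool is_valid_dwarf_file() const { return _valid; }
 private:
  bool _valid;
};

// Hypothetical helper: returns nullptr when the allocation fails or the file
// is not a valid DWARF file, so callers have a single condition to check.
static DwarfFile* open_dwarf_file(const char* filepath) {
  DwarfFile* file = new (std::nothrow) DwarfFile(filepath);
  if (file == nullptr) {
    return nullptr;  // Allocation failed: bail out instead of dereferencing.
  }
  if (!file->is_valid_dwarf_file()) {
    delete file;
    return nullptr;
  }
  return file;
}
```

Since `new (std::nothrow)` reports failure by returning `nullptr` rather than throwing, the check has to come before the first member call.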

src/hotspot/share/utilities/elfFile.cpp line 686:

> 684:   }
> 685: 
> 686:   // We must align to twice the address size.

Since the alignment is based on the address size: define the address size above, at the check whether the address size is valid, and then multiply it by 2 here.
This would also make the condition above look nicer, i.e. move the `[NOT_]LP64` outside of the condition.

src/hotspot/share/utilities/elfFile.cpp line 784:

> 782:   }
> 783: 
> 784:   if (!_reader.read_byte(&_header._address_size) || NOT_LP64(_header._address_size != 4)  LP64_ONLY( _header._address_size != 8)) {

Since this is the second time for the clause `|| NOT_LP64(_header._address_size != 4) LP64_ONLY( _header._address_size != 8)` maybe it is useful to make a constant out of the accepted address size somewhere instead of repeating this over and over.
Its value could even be something like `sizeof(intptr_t)` or so.
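
Roughly what I mean, as a sketch only: `DwarfAddressSize`, `DwarfAddressAlignment` and `is_supported_address_size` are made-up names, not identifiers from the patch.

```cpp
#include <cstdint>

// Hypothetical constant: the only address size the parser accepts, which is
// the pointer width of the running VM (4 on 32-bit, 8 on 64-bit builds).
static constexpr uint8_t DwarfAddressSize = sizeof(intptr_t);

// Hypothetical constant for places that align to twice the address size.
static constexpr uint8_t DwarfAddressAlignment = 2 * DwarfAddressSize;

// Replaces the repeated NOT_LP64(... != 4) LP64_ONLY(... != 8) clause.
static bool is_supported_address_size(uint8_t address_size) {
  return address_size == DwarfAddressSize;
}
```

With this, the header checks reduce to `!is_supported_address_size(_header._address_size)` and no longer need the `LP64` macros inside the condition.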

src/hotspot/share/utilities/elfFile.cpp line 814:

> 812:   log_develop_trace(dwarf)("Series of declarations [code, tag]:");
> 813:   AbbreviationDeclaration declaration;
> 814:   bool found_matching_declaration = false;

This variable is never used. Remove.

src/hotspot/share/utilities/elfFile.cpp line 944:

> 942: #else
> 943:       _reader.move_position(8);
> 944: #endif

Use `AddressSize` or similar here instead of the `#ifdef`.

src/hotspot/share/utilities/elfFile.cpp line 1026:

> 1024:         break;
> 1025:       } else {
> 1026:         if (!_reader.move_position(4)) {

Instead of hardcoding the `4` for lineptr/loclistptr/macptr/rangelistptr it would be nice to have a `DwarfOffset` constant of that value, since we only support 32 bit DWARF.

src/hotspot/share/utilities/elfFile.cpp line 1070:

> 1068:     // reason, GCC is currently using version 3 as specified in the DWARF 3 spec for the line number program even though GCC should
> 1069:     // be using version 4 for DWARF 4 as it emits DWARF 4 by default.
> 1070:     return false;

According to the specification (pg112):

> `version (uhalf)`
> A version number (see Appendix F). This number is specific to the line number information
> and is independent of the DWARF version number.

So this is just fine - actually things may break if the code accepted version 4 here assuming that there are breaking differences.
On the other hand Appendix F mentions that DWARF4 contains .debug_line information in version 4.

src/hotspot/share/utilities/elfFile.cpp line 1121:

> 1119:   // _debug_line_offset + 10 (=sizeof(_unit_length) + sizeof(_version) + sizeof(_header_length)) + _header_length.
> 1120:   _header._file_names_offset = _reader.get_position();
> 1121:   if (!_reader.set_position(shdr.sh_offset + _debug_line_offset + 10 + _header._header_length)) {

I would prefer a constant for this magic `10`. Thank you for the documentation.
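
Something along these lines, as a sketch: the constant and function names are hypothetical, and the field widths (4 + 2 + 4 bytes) are taken from the comment above for 32-bit DWARF.

```cpp
#include <cstdint>

// Sizes of the leading fields of the 32-bit DWARF line number program header,
// as listed in the code comment: unit_length, version, header_length.
static constexpr uint32_t UnitLengthSize   = sizeof(uint32_t);
static constexpr uint32_t VersionSize      = sizeof(uint16_t);
static constexpr uint32_t HeaderLengthSize = sizeof(uint32_t);

// Hypothetical named replacement for the magic 10.
static constexpr uint32_t HeaderPrefixSize = UnitLengthSize + VersionSize + HeaderLengthSize;
static_assert(HeaderPrefixSize == 10, "must match the documented offset");

// Position of the first byte after the header: header_length counts the
// bytes that follow the header_length field itself.
static uint64_t line_program_start(uint64_t section_offset, uint64_t debug_line_offset,
                                   uint32_t header_length) {
  return section_offset + debug_line_offset + HeaderPrefixSize + header_length;
}
```

The `static_assert` keeps the constant and the existing documentation in sync if someone changes a field type later.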

src/hotspot/share/utilities/elfFile.hpp line 211:

> 209: 
> 210:   // Load the DWARF file (.debuginfo) that belongs to this file.
> 211:   bool load_dwarf_file();

It would be nice to summarize from which places this method tries to load the debug info, so that nobody has to dig for it in the method implementation.

src/hotspot/share/utilities/elfFile.hpp line 300:

> 298:  *  - debug: Prints the results of the steps (1) - (4) together with the generated line information matrix.
> 299:  *  - trace: Full logging information for intermediate states/results when parsing the DWARF file.
> 300:  */

Maybe add a comment that log output is only supported in non-product builds and the reason.

-------------

Changes requested by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7126

From vlivanov at openjdk.java.net  Tue Feb 22 11:49:26 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Tue, 22 Feb 2022 11:49:26 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
 [v2]
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <9wlliZMzFqrTmAOktwaMPDw95W_rVjZRTW0_RAmAhjo=.e8d7ed64-33f6-4295-ae2c-ee7fd8822319@github.com>

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> Proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether it is an interpreter-only mode or not.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

Vladimir Ivanov has updated the pull request incrementally with one additional commit since the last revision:

  Regression test

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7459/files
  - new: https://git.openjdk.java.net/jdk/pull/7459/files/50f68960..17df1875

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7459&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7459&range=00-01

  Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7459.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7459/head:pull/7459

PR: https://git.openjdk.java.net/jdk/pull/7459

From vlivanov at openjdk.java.net  Tue Feb 22 11:49:27 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Tue, 22 Feb 2022 11:49:27 GMT
Subject: RFR: 8280901: MethodHandle::linkToNative stub is missing w/ -Xint
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <UsojRTJMxaXnl50j9zGBKRkAXrmqGxZWKYf-hKPcFvo=.d4c08a1e-ec02-420f-932c-6668fa6089a1@github.com>

On Mon, 14 Feb 2022 13:40:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> Proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether it is an interpreter-only mode or not.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

Thanks for the reviews, Maurizio, Aleksey, and Vladimir.

> maybe consider adding some extra test combinations in TestMatrix

I decided to extend `test/jdk/java/foreign/TestDowncall.java` to run a single test in `-Xint` mode:  


----------messages:(5/559)----------
command: testng -Xint ... -Dgenerator.sample.factor=100000 TestDowncall
...
elapsed time (seconds): 1.031
...
----------System.out:(7/249)----------
test TestDowncall.testDowncall(0, "f0_V__", VOID, [], []): success

===============================================
java/foreign/TestDowncall.java
Total tests run: 1, Passes: 1, Failures: 0, Skips: 0
===============================================

-------------

PR: https://git.openjdk.java.net/jdk/pull/7459

From dholmes at openjdk.java.net  Tue Feb 22 12:05:50 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Feb 2022 12:05:50 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <y-t4ZktssqRt5Yjcs8eAwubunegOZibzh2MDhDOgbHE=.fb77df7d-baed-447b-b219-bf2b950e703f@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

I don't like unnecessary special-cases. I added the `ShouldNotReachHere()` due to flawed reasoning, so would like to remove it again and make the code look the way it would have if I had realized about AGCT at the time. Creating a new API just for AGCT to use is not necessary IMO.

Cheers.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From chagedorn at openjdk.java.net  Tue Feb 22 12:22:52 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Tue, 22 Feb 2022 12:22:52 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v4]
In-Reply-To: <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
 <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>
Message-ID: <m56oWeFW871GjB8qxlFUNJRiVR7uQnpkW4AnzjMGSsA=.782e16f6-6d91-4d62-accb-65c7878801ad@github.com>

Thank you Thomas for your first pass! I will probably get back to your comments on Monday as I'm taking the rest of the week off.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From duke at openjdk.java.net  Tue Feb 22 12:43:46 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Tue, 22 Feb 2022 12:43:46 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <s38sx-TqK8VUoYt2Oin_AvwzFgjlZUsWayO23LZIQJg=.1db506fd-302f-41de-8cc0-73782ebbb64e@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Good to know. I will change my PR accordingly (if this is OK for you) :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From eosterlund at openjdk.java.net  Tue Feb 22 13:45:50 2022
From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Tue, 22 Feb 2022 13:45:50 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom [v2]
In-Reply-To: <vLmt4y1zLr2bn-qJ_sp_Z5K10gyiFttAnQxbbx1Jpf8=.27b89f38-3379-42f2-8d05-691b7d53f7f1@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
 <vLmt4y1zLr2bn-qJ_sp_Z5K10gyiFttAnQxbbx1Jpf8=.27b89f38-3379-42f2-8d05-691b7d53f7f1@github.com>
Message-ID: <yrcxTSIBiGDz8Gk2HM3I6P7TAZ5f9dCekO0VI9ElwF4=.a62bd6e7-c06d-4800-b4af-369d35774cc4@github.com>

On Mon, 21 Feb 2022 15:11:30 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
>> Tested with tier1-4.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed comment

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7514

From coleenp at openjdk.java.net  Tue Feb 22 13:45:50 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 22 Feb 2022 13:45:50 GMT
Subject: RFR: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom [v2]
In-Reply-To: <vLmt4y1zLr2bn-qJ_sp_Z5K10gyiFttAnQxbbx1Jpf8=.27b89f38-3379-42f2-8d05-691b7d53f7f1@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
 <vLmt4y1zLr2bn-qJ_sp_Z5K10gyiFttAnQxbbx1Jpf8=.27b89f38-3379-42f2-8d05-691b7d53f7f1@github.com>
Message-ID: <2guikodmogzI8BO3HPTZnJ7_RcqG3iYcErVVY_zpH-Y=.218ee8eb-dccc-4f9d-b95b-b9a70474a723@github.com>

On Mon, 21 Feb 2022 15:11:30 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
>> Tested with tier1-4.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed comment

Thanks Erik!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7514

From coleenp at openjdk.java.net  Tue Feb 22 13:45:51 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 22 Feb 2022 13:45:51 GMT
Subject: Integrated: 8271008: appcds/*/MethodHandlesAsCollectorTest.java tests
 time out because of excessive GC (CodeCache GC Threshold) in loom
In-Reply-To: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
References: <kYn3797y65G_uT3f16-_Nfg0HNEHOIEKOs91EUzkOzc=.b3155dcd-d36f-4203-963a-c4ddb2eac9eb@github.com>
Message-ID: <0I7-ShhwDMgvG1YqlMOpi4YsyAlRpUo5dAOyuZxCocY=.33864e88-338a-4215-adb8-e520cdb85110@github.com>

On Thu, 17 Feb 2022 13:34:39 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> In Loom there's a full heap walk when the sweeper is triggered.  Many of the triggers in this test case are for the adapters created by the test, which are not deallocated.  Since there is a fall back to other code cache heap areas for NonNMethod and for NMethodProfiled, made the function CodeCache::reverse_free_ratio() examine the total code cache available rather than the specific area that it is allocating into.  The compilation policy also uses this to increase the C1 compile threshold so also uses the entire free code cache size to calculate new threshold (ask @TobiHartmann about this).  Thanks to Tobias for the discussion for this fix.
> Tested with tier1-4.

This pull request has now been integrated.

Changeset: 022d8070
Author:    Coleen Phillimore <coleenp at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/022d80707c346f4b82ac1eb53e77c634769631e9
Stats:     36 lines in 5 files changed: 6 ins; 10 del; 20 mod

8271008: appcds/*/MethodHandlesAsCollectorTest.java tests time out because of excessive GC (CodeCache GC Threshold) in loom

Reviewed-by: thartmann, eosterlund

-------------

PR: https://git.openjdk.java.net/jdk/pull/7514

From duke at openjdk.java.net  Tue Feb 22 14:05:22 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Tue, 22 Feb 2022 14:05:22 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v24]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <3qZYWOySJyCqIHpvp9zg-Co3mBJt19hO5HwTK8NJjIE=.5d70fbfe-16db-45dc-92c0-058b50cb2955@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Merge master

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6334/files
  - new: https://git.openjdk.java.net/jdk/pull/6334/files/f9882ff1..97ae934b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=23
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=22-23

  Stats: 274 lines in 17 files changed: 176 ins; 69 del; 29 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Tue Feb 22 14:35:19 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Tue, 22 Feb 2022 14:35:19 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:

 - Merge master
 - Merge master
 - Merge master
 - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
 - Add comments to enter calls
 - Set PreserveFramePointer if use_rop_protection is set
 - Merge enter_subframe into enter
 - Review fixups
 - Documentation updates
 - Update copyrights to 2022
 - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31

-------------

Changes: https://git.openjdk.java.net/jdk/pull/6334/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=24
  Stats: 1481 lines in 35 files changed: 574 ins; 32 del; 875 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Tue Feb 22 18:29:21 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Tue, 22 Feb 2022 18:29:21 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned
 methods in java.lang.Integer and java.lang.Long
Message-ID: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>

Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows a 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization, which fuses division and modulus operations if needed. The DivMod optimization shows a 3x improvement for Integer and ~65% improvement for Long.
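
As a rough illustration of the pattern the DivMod fusion targets, here is a minimal Java sketch. The class and method names are hypothetical and not taken from the patch. It asks for both the unsigned quotient and the unsigned remainder of the same operands; whether the JIT actually fuses the two calls depends on the platform and compiler state.

```java
// Hypothetical demo class; not part of the patch under review.
class UnsignedDivModDemo {
    // Returns {quotient, remainder} of dividend / divisor,
    // treating both arguments as unsigned 32-bit values.
    static int[] divMod(int dividend, int divisor) {
        int q = Integer.divideUnsigned(dividend, divisor);
        int r = Integer.remainderUnsigned(dividend, divisor);
        return new int[] { q, r };
    }

    public static void main(String[] args) {
        int dividend = -7;   // 4294967289 when read as an unsigned value
        int divisor = 10;
        int[] qr = divMod(dividend, divisor);
        System.out.println(Integer.toUnsignedString(dividend) + " = "
                + Integer.toUnsignedString(qr[0]) + " * " + divisor
                + " + " + Integer.toUnsignedString(qr[1]));
    }
}
```

Both results come from a single hardware division on x86, which is presumably why computing them together can pay off.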

-------------

Commit messages:
 - fix trailing white space errors
 - fix whitespaces
 - revert comment to original for divmodI
 - Update rax and rdx register usage in x86_64.ad
 - 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

Changes: https://git.openjdk.java.net/jdk/pull/7572/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282221
  Stats: 741 lines in 16 files changed: 738 ins; 1 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572

From dholmes at openjdk.java.net  Tue Feb 22 21:15:46 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Feb 2022 21:15:46 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <qv3REoZXx1SpWZc4RdNW0IeMttbeLRhT7EeNJL5IFww=.5e05caff-7736-4c72-8e15-39de29af7fe6@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Please do update. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From sviswanathan at openjdk.java.net  Wed Feb 23 01:34:53 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Wed, 23 Feb 2022 01:34:53 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v6]
In-Reply-To: <oMnlIO5l_pU71SvWpOFppQ-7882cq32UOjKqWZckxM0=.0efd7853-b30d-488b-92c4-4a8ad0412fda@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <oMnlIO5l_pU71SvWpOFppQ-7882cq32UOjKqWZckxM0=.0efd7853-b30d-488b-92c4-4a8ad0412fda@github.com>
Message-ID: <U8OHdsrpVYQuacfR2pi_xqdFSb0RxGAszHwa2aJCXho=.323829f3-308c-4a18-ac31-6d03e2a8983f@github.com>

On Thu, 17 Feb 2022 17:43:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance numbers of a JMH micro included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | --
>> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
>> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
>> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
>> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Fixing for windows failure.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4146:

> 4144:   vaddpd(xtmp1, src , xtmp1, vec_enc);
> 4145:   vrndscalepd(dst, xtmp1, 0x4, vec_enc);
> 4146:   evcvtpd2qq(dst, dst, vec_enc);

Why do we need vrndscalepd in between? Could we not use cvtpd2qq directly after vaddpd?
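
One way to see why the rounding step before the integer conversion can matter is the plain Java sketch below. It assumes the conversion instruction rounds to nearest-even under the default rounding control, and it does not model the instruction sequence in the patch; the class and method names are hypothetical. For ordinary inputs `Math.round(x)` agrees with floor(x + 0.5), whereas rounding `x + 0.5` to nearest does not.

```java
// Hypothetical demo class; it emulates rounding modes with library calls only.
class RoundingModeDemo {
    // "Add 0.5, then round toward negative infinity."
    static long addHalfThenFloor(double x) {
        return (long) Math.floor(x + 0.5);
    }

    // "Add 0.5, then round to nearest, ties to even."
    static long addHalfThenNearest(double x) {
        return (long) Math.rint(x + 0.5);
    }

    public static void main(String[] args) {
        double[] samples = { 2.2, 2.5, -2.5, 7.9 };
        for (double x : samples) {
            System.out.println("x=" + x
                    + " Math.round=" + Math.round(x)
                    + " floor(x+0.5)=" + addHalfThenFloor(x)
                    + " nearest(x+0.5)=" + addHalfThenNearest(x));
        }
    }
}
```

For x = 2.2 the round-to-nearest variant yields 3 while `Math.round` yields 2, so some form of explicit round-down appears necessary unless the rounding control is changed around the conversion.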

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From iklam at openjdk.java.net  Wed Feb 23 03:57:16 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Wed, 23 Feb 2022 03:57:16 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v5]
In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
Message-ID: <YmU2a_UFBTMHy1fJ49RMMKQlFdINt1Gm6v9QoXqqHIQ=.9e13321e-b07b-4628-ad81-15784ac562d1@github.com>

> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
>     public static final Day SUNDAY = new Day("SUNDAY");
>     public static final Day MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731).
> 
> **Fix:**
> 
> During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
> 
> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. Will run tier 5 as well.
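
The uniqueness requirement described under Background can be illustrated with a small Java sketch (the class and enum below are hypothetical, not from the patch or its tests). Application code routinely compares enum constants with `==`, which is only correct if each constant exists exactly once in the running JVM.

```java
// Hypothetical demo; it only shows why enum constants must be unique.
class EnumIdentityDemo {
    enum Day { SUNDAY, MONDAY }

    // valueOf must hand back the one canonical constant,
    // so an identity comparison is expected to be safe.
    static boolean isSunday(String name) {
        return Day.valueOf(name) == Day.SUNDAY;
    }

    public static void main(String[] args) {
        System.out.println(isSunday("SUNDAY"));
        System.out.println(isSunday("MONDAY"));
        // A second copy of SUNDAY, e.g. one restored from an archive,
        // would make the identity comparison above return false.
    }
}
```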

Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:

 - Fixed comments per @calvinccheung review
 - Merge branch 'master' into 8275731-heapshared-enum
 - Use InstanceKlass::do_local_static_fields for some field iterations
 - Merge branch 'master' into 8275731-heapshared-enum
 - added exclusions needed by "java -Xshare:dump -ea -esa"
 - Comments from @calvinccheung off-line
 - 8275731: CDS archived enums objects are recreated at runtime

-------------

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=04
  Stats: 850 lines in 16 files changed: 807 ins; 2 del; 41 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653

From iklam at openjdk.java.net  Wed Feb 23 04:15:28 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Wed, 23 Feb 2022 04:15:28 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v6]
In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
Message-ID: <oUFhtmXkwBixwlYK8bcsnQclOIO7GUi4LMMGwAXw7Pw=.b9a83d0d-f57b-44d7-9428-a7e35b9ce6ae@github.com>

> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
>     public static final Day SUNDAY = new Day("SUNDAY");
>     public static final Day MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731).
> 
> **Fix:**
> 
> During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
> 
> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. Will run tier 5 as well.

Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  fixed whitespace

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6653/files
  - new: https://git.openjdk.java.net/jdk/pull/6653/files/4764075e..c6e9be1d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=04-05

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653

From dholmes at openjdk.java.net  Wed Feb 23 04:33:06 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 04:33:06 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic
Message-ID: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>

Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.

See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.

Platforms affected:
 - all x86
 - Zero
 - Windows Aarch64
 - PPC

Testing: tiers 1-3
Additional builds: tiers 4 and 5
 - builds covered: x86 and Zero

GHA
- builds covered:  Windows-Aarch64

The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.

For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.

Thanks,
David

-------------

Commit messages:
 - 8227369: pd_disjoint_words_atomic() needs to be atomic

Changes: https://git.openjdk.java.net/jdk/pull/7567/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7567&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8227369
  Stats: 88 lines in 5 files changed: 24 ins; 58 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7567.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7567/head:pull/7567

PR: https://git.openjdk.java.net/jdk/pull/7567

From eosterlund at openjdk.java.net  Wed Feb 23 04:33:06 2022
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Wed, 23 Feb 2022 04:33:06 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic
In-Reply-To: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
Message-ID: <M7ZYDC3mgGZtpuURJaRk5ktwH-ZFZZYF8c0bIqYCwBI=.36901b1f-7a52-4c22-8099-a2961711884a@github.com>

On Tue, 22 Feb 2022 05:45:12 GMT, David Holmes <dholmes at openjdk.org> wrote:

> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
> 
> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
> 
> Platforms affected:
>  - all x86
>  - Zero
>  - Windows Aarch64
>  - PPC
> 
> Testing: tiers 1-3
> Additional builds: tiers 4 and 5
>  - builds covered: x86 and Zero
> 
> GHA
> - builds covered:  Windows-Aarch64
> 
> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
> 
> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
> 
> Thanks,
> David

Looks good, thanks for fixing this.

-------------

Marked as reviewed by eosterlund (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7567

From mikael at openjdk.java.net  Wed Feb 23 04:49:44 2022
From: mikael at openjdk.java.net (Mikael Vidstedt)
Date: Wed, 23 Feb 2022 04:49:44 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic
In-Reply-To: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
Message-ID: <fJ7aT3lcLmBNDrT3jMX1rhUuKsToVZ5xOSfRpPcVKB8=.e2d83721-d875-40eb-9c8a-b4a2f1b24eb7@github.com>

On Tue, 22 Feb 2022 05:45:12 GMT, David Holmes <dholmes at openjdk.org> wrote:

> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
> 
> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
> 
> Platforms affected:
>  - all x86
>  - Zero
>  - Windows Aarch64
>  - PPC
> 
> Testing: tiers 1-3
> Additional builds: tiers 4 and 5
>  - builds covered: x86 and Zero
> 
> GHA
> - builds covered:  Windows-Aarch64
> 
> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
> 
> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
> 
> Thanks,
> David

Nice!

(Unrelated to/separate from your change I do wonder if the specialized assembly copy code on the "other" platforms actually is warranted. My memory from doing the conjoint copy (with optional swap) is that gcc generates really good code, but maybe there are platforms/toolchains/cases where that's not sufficient.)

src/hotspot/share/utilities/copy.hpp line 302:

> 300: 
> 301:  protected:
> 302:   inline static void _shared_disjoint_words_atomic(const HeapWord* from,

How about dropping the leading underscore prefix?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From dholmes at openjdk.java.net  Wed Feb 23 05:13:51 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 05:13:51 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic
In-Reply-To: <fJ7aT3lcLmBNDrT3jMX1rhUuKsToVZ5xOSfRpPcVKB8=.e2d83721-d875-40eb-9c8a-b4a2f1b24eb7@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <fJ7aT3lcLmBNDrT3jMX1rhUuKsToVZ5xOSfRpPcVKB8=.e2d83721-d875-40eb-9c8a-b4a2f1b24eb7@github.com>
Message-ID: <s7lUhb2OMbsf2B8rl7inB3a5n_dZCZ3i9VGb_1OfKyY=.fab2d68c-6449-4203-9001-706d34cc3c24@github.com>

On Wed, 23 Feb 2022 04:42:33 GMT, Mikael Vidstedt <mikael at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> src/hotspot/share/utilities/copy.hpp line 302:
> 
>> 300: 
>> 301:  protected:
>> 302:   inline static void _shared_disjoint_words_atomic(const HeapWord* from,
> 
> How about dropping the leading underscore prefix?

Yep will do. Was originally intended (similar to other pd code) to indicate this was a private/internal API, but the protected status achieves the same thing. Thanks for looking at it and the help with the disassembly analysis. :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From dholmes at openjdk.java.net  Wed Feb 23 05:38:34 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 05:38:34 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
Message-ID: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>

> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
> 
> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
> 
> Platforms affected:
>  - all x86
>  - Zero
>  - Windows Aarch64
>  - PPC
> 
> Testing: tiers 1-3
> Additional builds: tiers 4 and 5
>  - builds covered: x86 and Zero
> 
> GHA
> - builds covered:  Windows-Aarch64
> 
> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
> 
> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
> 
> Thanks,
> David

David Holmes has updated the pull request incrementally with one additional commit since the last revision:

  Remove underscore from name

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7567/files
  - new: https://git.openjdk.java.net/jdk/pull/7567/files/a34aee31..46ecdd29

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7567&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7567&range=00-01

  Stats: 6 lines in 5 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7567.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7567/head:pull/7567

PR: https://git.openjdk.java.net/jdk/pull/7567

From jbhateja at openjdk.java.net  Wed Feb 23 05:56:52 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 23 Feb 2022 05:56:52 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long
In-Reply-To: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
Message-ID: <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>

On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:

> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows a 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization, which fuses division and modulus operations if needed. The DivMod optimization shows a 3x improvement for Integer and ~65% improvement for Long.

src/hotspot/cpu/x86/x86_64.ad line 8602:

> 8600:     __ jmp(done);
> 8601:     __ bind(neg_divisor_fastpath); 
> 8602:     // Fastpath for divisor < 0: 

Move into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8633:

> 8631:     __ jmp(done);
> 8632:     __ bind(neg_divisor_fastpath);
> 8633:     // Fastpath for divisor < 0: 

Move into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8722:

> 8720:     __ shrl(rax, 31); // quotient
> 8721:     __ sarl(tmp, 31);
> 8722:     __ andl(tmp, divisor);

Move into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8763:

> 8761:     __ andnq(rax, rax, rdx);
> 8762:     __ movq(tmp, rax);
> 8763:     __ shrq(rax, 63); // quotient

Move into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8902:

> 8900:     __ subl(tmp_rax, divisor);
> 8901:     __ andnl(tmp_rax, tmp_rax, rdx);
> 8902:     __ sarl(tmp_rax, 31);

Please move this into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8932:

> 8930:     // Fastpath when divisor < 0: 
> 8931:     // remainder = dividend - (((dividend & ~(dividend - divisor)) >> (Long.SIZE - 1)) & divisor)
> 8932:     // See Hacker's Delight (2nd ed), section 9.3 which is implemented in java.lang.Long.remainderUnsigned()

Please move it into a macro assembly routine.
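
The fast path quoted above can be checked in plain Java. The sketch below is only an illustration under assumptions: the class name is hypothetical, the remainder expression is copied from the quoted comment, and the quotient expression is the analogous form as I recall it from `java.lang.Long.divideUnsigned()`. It applies only when the divisor is negative as a signed value, in which case the unsigned quotient can only be 0 or 1.

```java
// Hypothetical demo class; both formulas are valid only for divisor < 0 (signed view).
class NegDivisorFastPath {
    // Unsigned quotient: 1 if dividend >= divisor (unsigned), else 0.
    static long quotient(long dividend, long divisor) {
        return (dividend & ~(dividend - divisor)) >>> (Long.SIZE - 1);
    }

    // Unsigned remainder: subtract the divisor once if the quotient is 1.
    static long remainder(long dividend, long divisor) {
        return dividend - (((dividend & ~(dividend - divisor)) >> (Long.SIZE - 1)) & divisor);
    }

    public static void main(String[] args) {
        long[] dividends = { 0L, 1L, Long.MAX_VALUE, Long.MIN_VALUE, -2L, -1L };
        long[] divisors = { Long.MIN_VALUE, -2L, -1L };
        for (long d : dividends) {
            for (long v : divisors) {
                boolean ok = quotient(d, v) == Long.divideUnsigned(d, v)
                        && remainder(d, v) == Long.remainderUnsigned(d, v);
                System.out.println(Long.toUnsignedString(d) + " / "
                        + Long.toUnsignedString(v) + " matches library: " + ok);
            }
        }
    }
}
```

The arithmetic shift turns the top bit into an all-ones or all-zeros mask, so the subtraction needs no branch.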

src/hotspot/share/opto/compile.cpp line 3499:

> 3497:       Node* d = n->find_similar(Op_UDivI);
> 3498:       if (d) {
> 3499:         // Replace them with a fused unsigned divmod if supported

Can you explain a bit here why this transformation can't be handled earlier?

src/hotspot/share/opto/divnode.cpp line 1350:

> 1348:     return NULL;
> 1349:   }
> 1350: 

Please remove Value and Ideal routines if no explicit transforms are being done.

src/hotspot/share/opto/divnode.cpp line 1362:

> 1360:   }
> 1361: 
> 1362: //=============================================================================

You can remove the Ideal routine if no transformation is being done.

test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76:

> 74:         return quotients;
> 75:     }
> 76: 

Return seems redundant here.

test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83:

> 81:         }
> 82:         return remainders;
> 83:     }

Return seems redundant here.

test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75:

> 73:         }
> 74:         return quotients;
> 75:     }

Do we need to return quotients, since it's a field being explicitly modified?

test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82:

> 80:             remainders[i] = Long.remainderUnsigned(dividends[i], divisors[i]);
> 81:         }
> 82:         return remainders;

Same as above

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From mikael at openjdk.java.net  Wed Feb 23 06:04:46 2022
From: mikael at openjdk.java.net (Mikael Vidstedt)
Date: Wed, 23 Feb 2022 06:04:46 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <7JjwWxlH6KOZ0L76afxGufR2ZxCWbAwQXtYOSI-A1zE=.f21853f3-46da-423d-8358-0765e4617100@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

Marked as reviewed by mikael (Reviewer).
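
For readers following the thread, here is a minimal sketch of what a word-by-word atomic copy looks like. It is not the code in the PR: the function name is hypothetical, and the GCC/Clang `__atomic_load_n` / `__atomic_store_n` builtins are used only as a stand-in for HotSpot's `Atomic::load` / `Atomic::store`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a shared word-atomic copy; not HotSpot code.
// Each word is loaded once and stored once with a relaxed atomic access,
// so a concurrent reader never observes a partially written (torn) word.
inline void copy_disjoint_words_atomic(const uintptr_t* from,
                                       uintptr_t* to,
                                       size_t count) {
  assert(from != nullptr && to != nullptr);
  for (size_t i = 0; i < count; i++) {
    uintptr_t w = __atomic_load_n(&from[i], __ATOMIC_RELAXED);
    __atomic_store_n(&to[i], w, __ATOMIC_RELAXED);
  }
}
```

Because every element access is explicitly atomic, a compiler should not be free to turn the loop into a `memcpy` call, which makes no per-word atomicity promise. That is the property the description above is after.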

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From stuefe at openjdk.java.net  Wed Feb 23 07:22:07 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 23 Feb 2022 07:22:07 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend [v2]
In-Reply-To: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
Message-ID: <3YEOwRmxyanbFix2WLt8DMgR6cweiAJlR8SbL6AUFE8=.d44da1b1-fba6-423e-97d8-9106a52a4aa7@github.com>

> NMT backend can be further simplified and cleaned out.
> 
> - some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
> - `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
> - when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
> - Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
> - `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
> - In some places we handle inputptr=NULL gracefully where we should assert instead
> - Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
> - MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
> 
> Testing:
> 
> - GHAs
> - manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
> - SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.
> 
> Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.

Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:

 - Zhengyus proposals
 - fix build error after merge (need const variant of malloc_header())
 - merge master
 - pp should handle NULL correctly
 - remove mostly unused MallocTracker accessors for header members
 - Remove use of NMT level and simplify malloc+realloc+free
 - dumb down malloc header
 - mst bucket+pos=marker
 - remove malloc_base

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7283/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7283&range=01
  Stats: 273 lines in 10 files changed: 56 ins; 147 del; 70 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7283.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7283/head:pull/7283

PR: https://git.openjdk.java.net/jdk/pull/7283

From stuefe at openjdk.java.net  Wed Feb 23 07:22:10 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 23 Feb 2022 07:22:10 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend [v2]
In-Reply-To: <JN39I3H42NHx43U-YhBWFx7CnX2_h0hQvKLDjc1ZiAM=.6c829ee4-db8e-4bf4-ac23-abda28817d4c@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
 <JN39I3H42NHx43U-YhBWFx7CnX2_h0hQvKLDjc1ZiAM=.6c829ee4-db8e-4bf4-ac23-abda28817d4c@github.com>
Message-ID: <9vTdCLUESqkKUPI4T2Zes4Hk2dU9IbGtJ8Q0I-ugAe4=.60a9181f-d3e7-4195-bfef-198bd1791552@github.com>

On Thu, 17 Feb 2022 15:49:09 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
>> 
>>  - Zhengyus proposals
>>  - fix build error after merge (need const variant of malloc_header())
>>  - merge master
>>  - pp should handle NULL correctly
>>  - remove mostly unused MallocTracker accessors for header members
>>  - Remove use of NMT level and simplify malloc+realloc+free
>>  - dumb down malloc header
>>  - mst bucket+pos=marker
>>  - remove malloc_base
>
> Overall is good, a few minor comments.

Thanks a lot, @zhengyu123, for your review. Sorry for the delay, I had vacation. I'll implement all your proposals excluding the last one (mst_marker as structure); see comment there.

> src/hotspot/share/services/mallocTracker.hpp line 296:
> 
>> 294:   NOT_LP64(uint32_t _alt_canary);
>> 295:   const size_t _size;
>> 296:   const uint32_t _mst_marker;
> 
> make mst_marker a struct? instead of opaque type.

I played around a lot with different forms (struct, union) and in the end settled on an opaque uint32 since
- it would be passed by value and I would have to provide that structure in all kinds of places, and I ran into include circularities
- I have a vague improvement in my head where we store the malloc site table entries not as individually malloced elements but in a (dynamically growing) array; that would mean we could address them by index without having to walk the bucket chains; and that index would be a simple number.
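
To illustrate the opaque-marker idea, here is a rough sketch of packing the two 16-bit values (bucket index and chain position) into one `uint32_t`. The helper names and the bit layout are assumptions made for this example; they are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; names and bit layout are illustrative only.
// High 16 bits: bucket index. Low 16 bits: position in the bucket chain.
inline uint32_t make_marker(uint16_t bucket_idx, uint16_t pos_idx) {
  return (static_cast<uint32_t>(bucket_idx) << 16) | pos_idx;
}

inline uint16_t bucket_of(uint32_t marker) {
  return static_cast<uint16_t>(marker >> 16);
}

inline uint16_t pos_of(uint32_t marker) {
  return static_cast<uint16_t>(marker & 0xFFFFu);
}

// Round-trip check: unpacking a marker yields the values it was built from.
inline void check_round_trip(uint16_t bucket_idx, uint16_t pos_idx) {
  uint32_t m = make_marker(bucket_idx, pos_idx);
  assert(bucket_of(m) == bucket_idx);
  assert(pos_of(m) == pos_idx);
}
```

Only the site table would call the unpacking helpers; everything else just stores and passes the plain `uint32_t`, which is also what would let the marker become a simple array index later.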

-------------

PR: https://git.openjdk.java.net/jdk/pull/7283

From shade at openjdk.java.net  Wed Feb 23 07:51:48 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Feb 2022 07:51:48 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

Looks fine. There might be some performance implications to this, as IIRC this code gets called from GC copying, so some light benchmarking might be in order.

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7567

From kbarrett at openjdk.java.net  Wed Feb 23 08:16:52 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 23 Feb 2022 08:16:52 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <EvJPZmq-gN67uqyYQI_M1GyO0TrJdCQTrePW5mMp9hs=.9fd77445-1e41-47d4-8b1c-0d60707eea1e@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

Looks good.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7567

From jbhateja at openjdk.java.net  Wed Feb 23 09:03:37 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 23 Feb 2022 09:03:37 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> Following are the performance numbers of a JMH microbenchmark included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | --
> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Review comments resolved.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/f35ed9cf..6c869c76

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=05-06

  Stats: 7 lines in 2 files changed: 0 ins; 3 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Wed Feb 23 09:03:39 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 23 Feb 2022 09:03:39 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v6]
In-Reply-To: <U8OHdsrpVYQuacfR2pi_xqdFSb0RxGAszHwa2aJCXho=.323829f3-308c-4a18-ac31-6d03e2a8983f@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <oMnlIO5l_pU71SvWpOFppQ-7882cq32UOjKqWZckxM0=.0efd7853-b30d-488b-92c4-4a8ad0412fda@github.com>
 <U8OHdsrpVYQuacfR2pi_xqdFSb0RxGAszHwa2aJCXho=.323829f3-308c-4a18-ac31-6d03e2a8983f@github.com>
Message-ID: <Wg3yyVHD32XFG0yJ_RviZkVClkkthf69po-Qe6yiDxg=.3a508365-c27c-451e-9811-c1f91d1b3162@github.com>

On Wed, 23 Feb 2022 01:31:24 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8279508: Fixing for windows failure.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4146:
> 
>> 4144:   vaddpd(xtmp1, src , xtmp1, vec_enc);
>> 4145:   vrndscalepd(dst, xtmp1, 0x4, vec_enc);
>> 4146:   evcvtpd2qq(dst, dst, vec_enc);
> 
> Why do we need vrndscalepd in between, could we not directly use cvtpd2qq after vaddpd?

Thanks @sviswa7 , when a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. DONE.
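
A scalar sketch of the point being made, under stated assumptions: the explicit rounding step is modelled here with `std::floor` (round toward negative infinity, which is an assumption about the step's direction), and the "convert only" variant with `std::llrint`, which rounds according to the current floating-point rounding mode (round-to-nearest by default). The function names are hypothetical and edge cases such as NaN, infinities and overflow are deliberately left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Add 0.5, then convert directly. The conversion rounds using the current
// rounding mode, which is round-to-nearest-even unless changed.
inline int64_t add_then_convert(double x) {
  return static_cast<int64_t>(std::llrint(x + 0.5));
}

// Add 0.5, round toward negative infinity, then convert. After the explicit
// rounding step the value is integral, so the conversion is exact.
inline int64_t add_floor_convert(double x) {
  double r = std::floor(x + 0.5);
  assert(std::isfinite(r));  // sketch only: NaN/infinity handling omitted
  return static_cast<int64_t>(r);
}
```

For `x = 1.2` the sum is about 1.7: converting directly rounds it up to 2, while rounding down first gives 1, which is what `Math.round(1.2)` returns in Java.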

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From mdoerr at openjdk.java.net  Wed Feb 23 09:30:53 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Wed, 23 Feb 2022 09:30:53 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <X2GMzeFyslJXItInRbygUh-m2Hyq-CQ0yaWc2HVVFNI=.da526b83-c96e-44cc-9d5c-cd47ca780008@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

Works on PPC64.
Note: This change may disturb loop optimizations which don't violate atomicity. Performance impact is possible.

-------------

Marked as reviewed by mdoerr (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7567

From dholmes at openjdk.java.net  Wed Feb 23 11:09:51 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 11:09:51 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
 <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>
Message-ID: <xkRTxNEnsg8oOQZLBXztZTtqKJiG-zfPmGWVPGjZRBY=.13fde08d-d0eb-41b5-b260-676e0160a2fa@github.com>

On Wed, 23 Feb 2022 07:48:42 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Remove underscore from name
>
> Looks fine. There might be some performance implications to this, as IIRC this code gets called from GC copying, so some light benchmarking might be in order.

@shipilev any suggestions as to which benchmarks to try to run for this? Otherwise I'll just try our usual internal ones.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From shade at openjdk.java.net  Wed Feb 23 11:23:52 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Feb 2022 11:23:52 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
 <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>
Message-ID: <AknquHaCXyRqeY3mHW46RTc_fyzF7-0Ht9t3QdNPWNA=.5437dea3-e6e0-4db8-bda2-a48a8fd64210@github.com>

On Wed, 23 Feb 2022 07:48:42 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Remove underscore from name
>
> Looks fine. There might be some performance implications to this, as IIRC this code gets called from GC copying, so some light benchmarking might be in order.

> @shipilev any suggestions as to which benchmarks to try to run for this? Otherwise I'll just try our usual internal ones.

Just the usual sanity check of benchmarks is fine. If there are regressions on some other benchmarks, we can take care of them after integration.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From redestad at openjdk.java.net  Wed Feb 23 12:59:31 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 23 Feb 2022 12:59:31 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v5]
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <iboUFKnizUJJqKKF5r316fQmgc6cdwkTGJ4xlfpiTZE=.d7488f69-2e18-4062-bd69-c655fa0a0e11@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

Claes Redestad has updated the pull request incrementally with two additional commits since the last revision:

 - Fix TestCountPositives to correctly allow 0 return when expected != len (for now)
 - aarch64: fix issue with short inputs divisible by wordSize
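
As a reading aid, the semantics described in the PR text can be modelled with a plain scalar loop. This is a hypothetical C++ model, not the JDK's Java code or any of the intrinsics; the function names and signatures are chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: returns the number of leading bytes in
// ba[off, off + len) that are non-negative, i.e. the ASCII prefix length.
inline int count_positives(const int8_t* ba, int off, int len) {
  assert(ba != nullptr && off >= 0 && len >= 0);
  for (int i = 0; i < len; i++) {
    if (ba[off + i] < 0) {
      return i;  // first negative byte ends the prefix
    }
  }
  return len;
}

// The old predicate expressed through the new one: the range contains a
// negative byte exactly when the prefix is shorter than the range.
inline bool has_negatives(const int8_t* ba, int off, int len) {
  return count_positives(ba, off, len) != len;
}
```

A caller that gets back a count smaller than `len` can handle the first `count` bytes on the fast ASCII path and fall back to the slower path only for the remainder.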

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7231/files
  - new: https://git.openjdk.java.net/jdk/pull/7231/files/a5e28b32..a95680cb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=03-04

  Stats: 23 lines in 3 files changed: 3 ins; 4 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From duke at openjdk.java.net  Wed Feb 23 13:45:00 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Wed, 23 Feb 2022 13:45:00 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
Message-ID: <8puP07DM-ldrOlaYyU7ex_gpFMjSbRWjfuObimo-XPQ=.fa443055-bf97-4602-a177-4d62682f1f95@github.com>

On Tue, 22 Feb 2022 14:35:19 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
> 
>  - Merge master
>  - Merge master
>  - Merge master
>  - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set
>  - Merge enter_subframe into enter
>  - Review fixups
>  - Documentation updates
>  - Update copyrights to 2022
>  - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31

I did another full jtreg run, and everything looks fine.
Think that's all the review comments resolved now too.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From redestad at openjdk.java.net  Wed Feb 23 14:19:20 2022
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 23 Feb 2022 14:19:20 GMT
Subject: RFR: 8281146: Replace StringCoding.hasNegatives with
 countPositives [v6]
In-Reply-To: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
References: <DzglpI1oYUyB2IYco3SVg1rzyKTUSUbejzLAl_SmCJI=.3ddbe1a8-6827-406e-9588-e1f5f31e21c7@github.com>
Message-ID: <GGlhZ0PA4ylRTq3mOm5QaAKjc38LKaVNfRZd-m9LqwY=.c36c7db0-c00c-4e1b-bf6d-8f48edd8b6b5@github.com>

> I'm requesting comments and, hopefully, some help with this patch to replace `StringCoding.hasNegatives` with `countPositives`. The new method does a very similar pass, but alters the intrinsic to return the number of leading bytes in the `byte[]` range which only has positive bytes. This allows for dealing much more efficiently with those `byte[]`s that have an ASCII prefix, with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
> 
> Microbenchmark results: https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
> 
> - Only implemented on x86 for now, but I want to verify that implementations of `countPositives` can be implemented with similar efficiency on all platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) before moving ahead. This pretty much means holding this up until it's implemented on all platforms, which can either be contributed to this PR or as dependent follow-ups.
> 
> - An alternative to holding up until all platforms are on board is to allow the implementation of `StringCoding.hasNegatives` and `countPositives` to be implemented so that the non-intrinsified method calls into the intrinsified. This requires structuring the implementations differently based on which intrinsic - if any - is actually implemented. One way to do this could be to mimic how `java.nio` handles unaligned accesses and expose which intrinsic is available via `Unsafe` into a `static final` field.
> 
> - There are a few minor regressions (~5%) in the x86 implementation on `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, for example `encode-/decodeShortMixed` even see a minor improvement, which makes me consider those corner case regressions with little real world implications (if you have latin1 Strings, you're likely to also have ASCII-only strings in your mix).

Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 29 commits:

 - Resolve merge conflict
 - Fix TestCountPositives to correctly allow 0 return when expected != len (for now)
 - aarch64: fix issue with short inputs divisible by wordSize
 - Switch aarch64 intrinsic to a variant of countPositives returning len or zero as a first step.
 - Revert micro changes, split out to #7516
 - Merge branch 'master' of https://github.com/cl4es/jdk into count_positives
 - Merge branch 'master' into count_positives
 - Restore partial vector checks in AVX2 and SSE intrinsic variants
 - Let countPositives use hasNegatives to allow ports not implementing the countPositives intrinsic to stay neutral
 - Simplify changes to encodeUTF8
 - ... and 19 more: https://git.openjdk.java.net/jdk/compare/5035bf5e...685795ce

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7231/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7231&range=05
  Stats: 532 lines in 29 files changed: 308 ins; 53 del; 171 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7231.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7231/head:pull/7231

PR: https://git.openjdk.java.net/jdk/pull/7231

From coleenp at openjdk.java.net  Wed Feb 23 14:27:50 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Feb 2022 14:27:50 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <PON8NggnSiJNnLlDOz9p5mo1cMjUi6U47keEXZvI9Ws=.acc6ed14-d79d-48a1-9ab3-f9ad730f49fa@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

Why does this go to pd_disjoint_words_atomic when it goes forward to the shared code?  I suspect the performance implications of the '#else' for x86 are minimal, so it's not worth keeping; i.e., it's not really platform dependent anymore.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From duke at openjdk.java.net  Wed Feb 23 15:07:05 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 15:07:05 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access
Message-ID: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>

This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

-------------

Commit messages:
 - Improve os::is_first_C_frame(...)
 - Add frame::can_access_link(Thread *t) and use it

Changes: https://git.openjdk.java.net/jdk/pull/7591/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282306
  Stats: 35 lines in 10 files changed: 32 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 15:31:37 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 15:31:37 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <tNKoeZCeOn4UC0MfUR8l8aKyJFs2FggHjsy1N3nYzFQ=.d6a11843-a04f-4e19-be3c-c663a0543674@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

I've updated it. Thanks again.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Wed Feb 23 15:31:36 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 15:31:36 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v2]
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <2an2VYo7_ZVP5AuI8q0thE4undL8ejH-EBwf8x9flbc=.6e0418c6-89ab-4b2e-a511-33ac68a44a47@github.com>

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Add changes by David Holmes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7559/files
  - new: https://git.openjdk.java.net/jdk/pull/7559/files/8364d4b0..9f701eb0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=00-01

  Stats: 10 lines in 2 files changed: 6 ins; 1 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7559/head:pull/7559

PR: https://git.openjdk.java.net/jdk/pull/7559

From clanger at openjdk.java.net  Wed Feb 23 15:42:53 2022
From: clanger at openjdk.java.net (Christoph Langer)
Date: Wed, 23 Feb 2022 15:42:53 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <9suLeDH8iwypmCWZlLeWhGNXHrddUmWDHxsZTqJi0JY=.e83fe160-8f4b-47d9-a4b7-ed8371d258f0@github.com>

On Wed, 23 Feb 2022 14:59:49 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Changes requested by clanger (Reviewer).

src/hotspot/share/runtime/os.cpp line 1227:

> 1225:          !t->is_in_full_stack((address)fr->fp()) ||
> 1226:          !t->is_in_full_stack((address)fr->sender_sp()) ||
> 1227:          !t->is_in_full_stack((address)fr->link());

Should probably use
#ifdef _WINDOWS
...
#else
...
#endif

here

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 15:52:31 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 15:52:31 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v3]
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Johannes Bechberger has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Add changes by David Holmes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7559/files
  - new: https://git.openjdk.java.net/jdk/pull/7559/files/9f701eb0..ca295d34

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=01-02

  Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7559/head:pull/7559

PR: https://git.openjdk.java.net/jdk/pull/7559

From mbaesken at openjdk.java.net  Wed Feb 23 15:53:54 2022
From: mbaesken at openjdk.java.net (Matthias Baesken)
Date: Wed, 23 Feb 2022 15:53:54 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend [v2]
In-Reply-To: <3YEOwRmxyanbFix2WLt8DMgR6cweiAJlR8SbL6AUFE8=.d44da1b1-fba6-423e-97d8-9106a52a4aa7@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
 <3YEOwRmxyanbFix2WLt8DMgR6cweiAJlR8SbL6AUFE8=.d44da1b1-fba6-423e-97d8-9106a52a4aa7@github.com>
Message-ID: <wAnfJEiBYs7_O9MkDPRG3FZu2pSE7Rjf2B-gbXKXQHw=.968ca506-0f19-4032-96eb-8487c8d71aa6@github.com>

On Wed, 23 Feb 2022 07:22:07 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> NMT backend can be further simplified and cleaned out.
>> 
>> - some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
>> - `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
>> - when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
>> - Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
>> - `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
>> - In some places we handle inputptr=NULL gracefully where we should assert instead
>> - Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
>> - MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
>> 
>> Testing:
>> 
>> - GHAs
>> - manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
>> - SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.
>> 
>> Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
> 
>  - Zhengyus proposals
>  - fix build error after merge (need const variant of malloc_header())
>  - merge master
>  - pp should handle NULL correctly
>  - remove mostly unused MallocTracker accessors for header members
>  - Remove use of NMT level and simplify malloc+realloc+free
>  - dumb down malloc header
>  - mst bucket+pos=marker
>  - remove malloc_base

Please check the copyright years, e.g. src/hotspot/share/services/memTracker.cpp (still 2021).
Otherwise looks okay to me.
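
For readers following along, the fused marker mentioned in the quoted description (two 16-bit values packed into one opaque 32-bit value) could look roughly like the sketch below. This is standalone C++, not the code from the PR; the function names are illustrative assumptions, and placing the bucket index in the upper half is also just an assumption.

```cpp
#include <cstdint>

// Pack a 16-bit bucket index and a 16-bit chain position into one
// 32-bit marker. Which half holds which value is an assumption here.
inline uint32_t build_marker(uint16_t bucket_idx, uint16_t pos_idx) {
  return (static_cast<uint32_t>(bucket_idx) << 16) | pos_idx;
}

// Recover the bucket index from the upper 16 bits.
inline uint16_t bucket_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker >> 16);
}

// Recover the chain position from the lower 16 bits.
inline uint16_t pos_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker & 0xFFFF);
}
```

Only the table itself would need the two decoding helpers; callers would just store and pass the 32-bit value around.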

-------------

Marked as reviewed by mbaesken (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7283

From duke at openjdk.java.net  Wed Feb 23 15:56:52 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 15:56:52 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v3]
In-Reply-To: <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>
Message-ID: <JDSlW3Fgjw9rC7FAPkFv1xiVamLODkDWY5A_viycDto=.2ab7b174-903a-460b-904e-ad5a42265681@github.com>

On Wed, 23 Feb 2022 15:52:31 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.
>
> Johannes Bechberger has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Add changes by David Holmes

I ran my original tests and found no crashes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Wed Feb 23 16:10:25 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 16:10:25 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Improve use of C macros

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/9aa9cb6a..4aad3ad2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=00-01

  Stats: 6 lines in 1 file changed: 3 ins; 1 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 16:10:29 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 16:10:29 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <9suLeDH8iwypmCWZlLeWhGNXHrddUmWDHxsZTqJi0JY=.e83fe160-8f4b-47d9-a4b7-ed8371d258f0@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <9suLeDH8iwypmCWZlLeWhGNXHrddUmWDHxsZTqJi0JY=.e83fe160-8f4b-47d9-a4b7-ed8371d258f0@github.com>
Message-ID: <ddKOX7Xolu1J8Wi1_IgzdEEMUCwKdMnGjHmvKUmr2rA=.0eb1aa15-6a6b-4b13-9c30-0a86a60ddafb@github.com>

On Wed, 23 Feb 2022 15:39:42 GMT, Christoph Langer <clanger at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/share/runtime/os.cpp line 1227:
> 
>> 1225:          !t->is_in_full_stack((address)fr->fp()) ||
>> 1226:          !t->is_in_full_stack((address)fr->sender_sp()) ||
>> 1227:          !t->is_in_full_stack((address)fr->link());
> 
> Should probably use
> #ifdef _WINDOWS
> ...
> #else
> ...
> #endif
> 
> here

And also in the original method

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From stuefe at openjdk.java.net  Wed Feb 23 18:03:47 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 23 Feb 2022 18:03:47 GMT
Subject: RFR: JDK-8281015: Further simplify NMT backend [v2]
In-Reply-To: <wAnfJEiBYs7_O9MkDPRG3FZu2pSE7Rjf2B-gbXKXQHw=.968ca506-0f19-4032-96eb-8487c8d71aa6@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
 <3YEOwRmxyanbFix2WLt8DMgR6cweiAJlR8SbL6AUFE8=.d44da1b1-fba6-423e-97d8-9106a52a4aa7@github.com>
 <wAnfJEiBYs7_O9MkDPRG3FZu2pSE7Rjf2B-gbXKXQHw=.968ca506-0f19-4032-96eb-8487c8d71aa6@github.com>
Message-ID: <XVO4b8cw-2XrK5En1gC7TQF2f11HTWBfBwZeiSgvNlo=.6c97fc81-ee1b-49a3-8afe-5dfd62210fe7@github.com>

On Wed, 23 Feb 2022 15:50:21 GMT, Matthias Baesken <mbaesken at openjdk.org> wrote:

> please check copyright years, e.g. src/hotspot/share/services/memTracker.cpp (still 2021). Otherwise looks okay to me.

Thank you @MBaesken!

I will fix copyrights before pushing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7283

From stuefe at openjdk.java.net  Wed Feb 23 19:36:53 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 23 Feb 2022 19:36:53 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
Message-ID: <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>

On Wed, 23 Feb 2022 16:10:25 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve use of C macros

Hi Johannes,

Thanks for doing this, solving this makes sense.

But I'm not sure yours is the right approach. I think it would be better to use SafeFetch to check the addresses in the relevant registers.

Using SafeFetch would mean that we don't depend on the existence of Thread (which may be NULL, especially in signal contexts). It would also work if the registers erroneously point into unmapped or guarded portions of the stack, or if Thread is corrupted or outdated. And it would be way simpler, since it would not require a new version of is_first_C_frame.

I also find the interface - passing Thread* to the function just for it to then do error checking - slightly off. Without any comment on the prototype explaining what this argument is for, this causes head scratching. And semantically, there is only one instance of Thread this can ever be called for.

A function like this:

// check if frame is valid within the Thread's stack
bool Thread::is_valid_frame(const frame*)

would actually be clearer.
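
As a rough illustration of what such a check might do, here is a minimal standalone sketch. It is not HotSpot code: `StackBounds` and `FrameRegs` are hypothetical stand-ins for Thread and frame, the stack is assumed to grow downwards with a half-open range [base - size, base), and the `fp >= sp` condition is an extra assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a thread's stack: grows downwards from 'base'.
struct StackBounds {
  uintptr_t base;  // highest address, exclusive
  size_t size;     // size of the whole stack in bytes

  uintptr_t end() const { return base - size; }

  // True if addr lies in [end(), base).
  bool contains(uintptr_t addr) const {
    return addr >= end() && addr < base;
  }
};

// Hypothetical stand-in for the register values a frame carries.
struct FrameRegs {
  uintptr_t sp;
  uintptr_t fp;
};

// Check if the frame is valid within the stack: both pointers must lie
// inside the stack, and the frame pointer must not be below the stack pointer.
bool is_valid_frame(const StackBounds& stack, const FrameRegs& fr) {
  return stack.contains(fr.sp) && stack.contains(fr.fp) && fr.fp >= fr.sp;
}
```

Note that a range check like this only says the addresses are plausible; it does not prove they are readable, which is where SafeFetch would still help.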

And if this error check is necessary, why do we then need two variants of is_first_C_frame? Should the error check not always happen?

But bottom line, I think SafeFetch would be a simpler and more robust approach.

Cheers, Thomas

src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 155:

> 153: inline intptr_t* frame::link() const              { return (intptr_t*) *(intptr_t **)addr_at(link_offset); }
> 154: 
> 155: inline bool frame::can_access_link(Thread *thread) const { return thread->is_in_full_stack((address)addr_at(link_offset)); }

Is there a reason Thread* is non-const in all your variants of can_access_link and is_first_C_frame?

src/hotspot/cpu/ppc/frame_ppc.inline.hpp line 120:

> 118: }
> 119: 
> 120: inline bool frame::can_access_link(Thread *thread) const { return true; }

Why are ppc and s390 different from other platforms? If there is a valid reason, could you please add a short comment?

src/hotspot/cpu/zero/frame_zero.inline.hpp line 85:

> 83: }
> 84: 
> 85: inline bool frame::can_access_link(Thread *t) const {

Did you test zero? Would this not just crash it?

src/hotspot/share/runtime/os.cpp line 1223:

> 1221:   return true; // native stack isn't walkable on windows this way.
> 1222: #else
> 1223:   return !fr->can_access_link(t) || os::is_first_C_frame(fr) ||

Check t for NULL.

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7591

From bulasevich at openjdk.java.net  Wed Feb 23 20:04:20 2022
From: bulasevich at openjdk.java.net (Boris Ulasevich)
Date: Wed, 23 Feb 2022 20:04:20 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density
Message-ID: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>

Currently the code cache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between the two code segments. It changes nothing for any platform besides AArch64.

On AArch64 the offset limit for a branch instruction is 128MB. Longer jumps are encoded with three instructions. Most far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between the code segments, the jump distance from a method to a stub becomes shorter. The result is a 4% reduction in generated code size for CodeCache sizes in the range from 128MB to 240MB.
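
To make the arithmetic concrete, here is a small standalone sketch (not HotSpot code). The segment sizes of 8MB, 116MB and 116MB are made-up values that add up to a 240MB code cache, and `Segment`, `worst_case_distance` and `needs_far_branch` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint64_t MB = 1024 * 1024;
// Reach of a direct branch, per the 128MB limit stated above.
constexpr uint64_t branch_range = 128 * MB;

// Hypothetical model of one contiguous segment: [start, start + size).
struct Segment {
  uint64_t start;
  uint64_t size;
  uint64_t end() const { return start + size; }
};

// Upper bound on the distance between any address in 'from'
// and any address in 'to'.
uint64_t worst_case_distance(const Segment& from, const Segment& to) {
  uint64_t lo = std::min(from.start, to.start);
  uint64_t hi = std::max(from.end(), to.end());
  return hi - lo;
}

// A jump from 'from' into 'to' needs the long encoding if the
// worst case does not fit into the branch range.
bool needs_far_branch(const Segment& from, const Segment& to) {
  return worst_case_distance(from, to) > branch_range;
}
```

With the old order (non-nmethod at 0MB, non-profiled at 8MB, profiled at 124MB) the worst case from the profiled segment into non-nmethod is 240MB, so far branches are needed. With the new order (non-profiled at 0MB, non-nmethod at 116MB, profiled at 124MB) both method segments are at most 124MB away from any non-nmethod address.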

As a side effect, the performance of some tests is slightly improved:
``ArraysFill.testCharFill      10  thrpt   15  170235.720 -> 178477.212  ops/ms``

Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AArch64

-------------

Commit messages:
 - fix name: is_non_nmethod, adding target_needs_far_branch func
 - change codecache segments order: nonprofiled-nonmethod-profiled

Changes: https://git.openjdk.java.net/jdk/pull/7517/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280872
  Stats: 108 lines in 7 files changed: 47 ins; 38 del; 23 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7517.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517

PR: https://git.openjdk.java.net/jdk/pull/7517

From duke at openjdk.java.net  Wed Feb 23 20:04:22 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Wed, 23 Feb 2022 20:04:22 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density
In-Reply-To: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
References: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
Message-ID: <mUn5qrvf9ChlnQyFmgm2giKmGScdzQSyz_W-nreT-ww=.45ec874f-57d6-4e55-8771-2e2aeeba218d@github.com>

On Thu, 17 Feb 2022 15:40:07 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH.
> 
> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB.
> 
> As a side effect, the performance of some tests is slightly improved:
> ``ArraysFill.testCharFill      10  thrpt   15  170235.720 -> 178477.212  ops/ms``
> 
> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 411:

> 409:   assert(CodeCache::find_blob(entry.target()) != NULL,
> 410:          "destination of far call not found in code cache");
> 411:   if (far_branches()) {

Can we write something like this:

if (is_target_far_from_heap(entry, cbuf->target_code_heap())) {
...
}


And the implementation:

static inline bool is_target_far_from_heap(Address addr, CodeHeap* heap = nullptr) {
  if (!SegmentedCodeCache || heap == nullptr) {
    return ReservedCodeCacheSize > branch_range;
  }

  return max_dist_to_heap(addr, heap) > branch_range;
}
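
The helper `max_dist_to_heap` is not spelled out above, so here is one way it could be read, as a standalone sketch rather than HotSpot code. `CodeHeapRange` is a hypothetical stand-in for the bounds of a CodeHeap, and the `SegmentedCodeCache` and `ReservedCodeCacheSize` flags are passed in as plain parameters so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint64_t MB = 1024 * 1024;
constexpr uint64_t branch_range = 128 * MB;  // direct branch reach

// Hypothetical stand-in for the address range covered by a code heap.
struct CodeHeapRange {
  uint64_t low;
  uint64_t high;
};

// Largest distance from 'target' to either boundary of the heap, i.e. the
// worst case for a branch emitted somewhere inside that heap.
uint64_t max_dist_to_heap(uint64_t target, const CodeHeapRange& heap) {
  uint64_t to_low  = target > heap.low  ? target - heap.low  : heap.low  - target;
  uint64_t to_high = target > heap.high ? target - heap.high : heap.high - target;
  return std::max(to_low, to_high);
}

// Without a known heap, fall back to the conservative whole-cache check.
bool is_target_far_from_heap(uint64_t target, const CodeHeapRange* heap,
                             bool segmented_code_cache,
                             uint64_t reserved_code_cache_size) {
  if (!segmented_code_cache || heap == nullptr) {
    return reserved_code_cache_size > branch_range;
  }
  return max_dist_to_heap(target, *heap) > branch_range;
}
```

The point of the per-heap variant is that a large code cache no longer forces far branches everywhere; only targets that can actually be out of reach from the emitting heap pay for the longer sequence.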

src/hotspot/share/code/codeCache.cpp line 893:

> 891: }
> 892: 
> 893: bool CodeCache::is_codestub(address addr) {

Should it be named `is_non_nmethod`? According to the comments, there can be buffers, adapters and stubs.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From bulasevich at openjdk.java.net  Wed Feb 23 20:04:23 2022
From: bulasevich at openjdk.java.net (Boris Ulasevich)
Date: Wed, 23 Feb 2022 20:04:23 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density
In-Reply-To: <mUn5qrvf9ChlnQyFmgm2giKmGScdzQSyz_W-nreT-ww=.45ec874f-57d6-4e55-8771-2e2aeeba218d@github.com>
References: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
 <mUn5qrvf9ChlnQyFmgm2giKmGScdzQSyz_W-nreT-ww=.45ec874f-57d6-4e55-8771-2e2aeeba218d@github.com>
Message-ID: <53JfCz6gB1YlkMd2KLpU_J0oHsW98-8RTE7jXthUBPw=.9684208c-01a8-48c9-9c52-a48b45493ad4@github.com>

On Tue, 22 Feb 2022 23:53:19 GMT, Evgeny Astigeevich <duke at openjdk.java.net> wrote:

>> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH.
>> 
>> In AARCH the offset limit for a branch instruction is 128MB. The bigger jumps are encoded with three instructions. Most of far branches are jumps into the non-nmethod blobs. With the non-nmethod segment in between code segments the jump distance from method to the stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB.
>> 
>> As a side effect, the performance of some tests is slightly improved:
>> ``ArraysFill.testCharFill      10  thrpt   15  170235.720 -> 178477.212  ops/ms``
>> 
>> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 411:
> 
>> 409:   assert(CodeCache::find_blob(entry.target()) != NULL,
>> 410:          "destination of far call not found in code cache");
>> 411:   if (far_branches()) {
> 
> Can we write something like this:
> 
> if (is_target_far_from_heap(entry, cbuf->target_code_heap())) {
> ...
> }
> 
> 
> And the implementation:
> 
> static inline bool is_target_far_from_heap(Address addr, CodeHeap* heap = nullptr) {
>   if (!SegmentedCodeCache || heap == nullptr) {
>     return ReservedCodeCacheSize > branch_range;
>   }
> 
>   return max_dist_to_heap(addr, heap) > branch_range;
> }

Yes, the inline expression is difficult to read. I added target_needs_far_branch; I hope it is better now.

> src/hotspot/share/code/codeCache.cpp line 893:
> 
>> 891: }
>> 892: 
>> 893: bool CodeCache::is_codestub(address addr) {
> 
> Should it be named `is_non_nmethod`? According to the comments, there can be buffers, adapters and stubs.

Ok. Thanks

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From dholmes at openjdk.java.net  Wed Feb 23 20:23:49 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 20:23:49 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v3]
In-Reply-To: <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>
Message-ID: <WndpRTUcBMQzPLOHkTMqxdNjnLsCXgQOR1f7863DXP4=.2e228a5d-e75d-41e7-98a5-2e79c293a2dd@github.com>

On Wed, 23 Feb 2022 15:52:31 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.
>
> Johannes Bechberger has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Add changes by David Holmes

Hi Johannes,

Your original changes need removing again.

Thanks,
David

src/hotspot/share/runtime/thread.hpp line 1325:

> 1323:   // external JNI entry points where the JNIEnv is passed into the VM.
> 1324:   // Does not return null, check is_thread_from_jni_environment_termminated()
> 1325:   // if you are not sure that it is not.

Needs deleting.

src/hotspot/share/runtime/thread.hpp line 1354:

> 1352:     return current->is_terminated();
> 1353:   }
> 1354: 

Needs deleting.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7559

From vlivanov at openjdk.java.net  Wed Feb 23 20:32:54 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Wed, 23 Feb 2022 20:32:54 GMT
Subject: Integrated: 8280901: MethodHandle::linkToNative stub is missing w/
 -Xint
In-Reply-To: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
References: <RHDt1jsXbYttfM5JJAdadmdnmPD2JJ9wWeNAV2m6ZsA=.f809db92-fe81-4b12-abe3-fcbaea1df918@github.com>
Message-ID: <W3wrOEVs87LmhhuJIVJ3bEJRXPlBSAS-8PJ_E2_rav4=.cbe0913e-5bfe-440e-bace-e527756ddc12@github.com>

On Mon, 14 Feb 2022 13:40:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> MethodHandle::linkToNative linker doesn't have a dedicated stub for interpreter. A stub for compiled code is shared and it is invoked through i2c stub when accessed from interpreter. In interpreter-only mode, stubs for compiled code are not generated and linkToNative ends up in a broken state where `Method::_from_interpreted_entry` points to `i2c` stub while `Method::_from_compiled_entry` points to `c2i` stub.
> 
> The proposed fix unconditionally generates a stub for the `MethodHandle::linkToNative` case, irrespective of whether it is interpreter-only mode or not.
> 
> Testing: test/jdk/java/foreign/ w/ -Xint

This pull request has now been integrated.

Changeset: f86f38a8
Author:    Vladimir Ivanov <vlivanov at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/f86f38a8afd31c76039206f8f1f33371ad814396
Stats:     6 lines in 2 files changed: 5 ins; 0 del; 1 mod

8280901: MethodHandle::linkToNative stub is missing w/ -Xint

Reviewed-by: shade, kvn

-------------

PR: https://git.openjdk.java.net/jdk/pull/7459

From dholmes at openjdk.java.net  Wed Feb 23 20:36:58 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 20:36:58 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
Message-ID: <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>

On Wed, 23 Feb 2022 16:10:25 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve use of C macros

I'm struggling to understand the motivation for this change and what problem is being solved.

Do all these extra checks need to be done in product bits or would debug-only work? What kind of errors are we trying to guard against by doing this?

Thanks,
David

src/hotspot/share/utilities/vmError.cpp line 338:

> 336:       // is_first_C_frame() does only simple checks for frame pointer,
> 337:       // it will pass if java compiled code has a pointer in EBP.
> 338:       if (os::is_first_C_frame(&fr, t)) return invalid;

Is the comment still accurate?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 21:32:53 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:32:53 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
Message-ID: <PjzTGicdcJS5UCC04nAvFTB7lOoUB1y5VkJXvyeFbMw=.117286d8-fba5-4377-aa43-77d6419800d7@github.com>

On Wed, 23 Feb 2022 19:06:05 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/cpu/ppc/frame_ppc.inline.hpp line 120:
> 
>> 118: }
>> 119: 
>> 120: inline bool frame::can_access_link(Thread *thread) const { return true; }
> 
> Why are ppc and s390 different from other platforms? If there is a valid reason, could you please add a short comment?

Because they do not (as I see it) directly dereference a location on the stack to get to this value.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 21:39:44 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:39:44 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
Message-ID: <gh0xX2mpf5CpJ8FlYbvld8800BzoBbJt593b5hr2ZZQ=.24a05212-16e8-4dd4-98bf-227970cf4dd9@github.com>

On Wed, 23 Feb 2022 19:31:03 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 155:
> 
>> 153: inline intptr_t* frame::link() const              { return (intptr_t*) *(intptr_t **)addr_at(link_offset); }
>> 154: 
>> 155: inline bool frame::can_access_link(Thread *thread) const { return thread->is_in_full_stack((address)addr_at(link_offset)); }
> 
> Is there a reason Thread* is non-const in all your variants of can_access_link and is_first_C_frame?

No, there is none.

> src/hotspot/cpu/zero/frame_zero.inline.hpp line 85:
> 
>> 83: }
>> 84: 
>> 85: inline bool frame::can_access_link(Thread *t) const {
> 
> Did you test zero? Would this not just crash it?

You're correct, I'll look into this.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 21:39:45 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:39:45 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
Message-ID: <I2TCtfMk3ftIKBvHNx7dbTs1KpRQ5dSaQweayimfsdQ=.7c215110-517b-4903-9508-4e1d1155cc5c@github.com>

On Wed, 23 Feb 2022 20:29:43 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/share/utilities/vmError.cpp line 338:
> 
>> 336:       // is_first_C_frame() does only simple checks for frame pointer,
>> 337:       // it will pass if java compiled code has a pointer in EBP.
>> 338:       if (os::is_first_C_frame(&fr, t)) return invalid;
> 
> Is the comment still accurate?

I think so? But maybe removing the second line would be helpful.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 21:51:46 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:51:46 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v3]
In-Reply-To: <WndpRTUcBMQzPLOHkTMqxdNjnLsCXgQOR1f7863DXP4=.2e228a5d-e75d-41e7-98a5-2e79c293a2dd@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <dlx5OS_YADjC072mHWKFlLcAF6477L5b_FZeniLGduU=.5c33bbdf-d96c-4713-a21d-cbabeb4e34c9@github.com>
 <WndpRTUcBMQzPLOHkTMqxdNjnLsCXgQOR1f7863DXP4=.2e228a5d-e75d-41e7-98a5-2e79c293a2dd@github.com>
Message-ID: <APC2YARaS7r4DIB8SjXsncPLWzKbzYQBtA6bYvkkrN4=.fbef716c-ed80-45d0-b8fd-ed22448df80e@github.com>

On Wed, 23 Feb 2022 20:18:45 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Johannes Bechberger has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>> 
>>   Add changes by David Holmes
>
> src/hotspot/share/runtime/thread.hpp line 1354:
> 
>> 1352:     return current->is_terminated();
>> 1353:   }
>> 1354: 
> 
> Needs deleting.

Of course.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From bulasevich at openjdk.java.net  Wed Feb 23 21:52:11 2022
From: bulasevich at openjdk.java.net (Boris Ulasevich)
Date: Wed, 23 Feb 2022 21:52:11 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density
 [v2]
In-Reply-To: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
References: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
Message-ID: <LgdXzi8u2jSr15R9eo3H2u6GVC32HE1SoBEqmwRGpf8=.04e681a6-eb61-4222-b6e6-193ccd80eefe@github.com>

> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH.
> 
> In AARCH the offset limit for a branch instruction is 128MB. Bigger jumps are encoded with three instructions. Most far branches are jumps into the non-nmethod blobs. With the non-nmethod segment placed between the two code segments, the jump distance from a method to a stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB.
> 
> As a side effect, the performance of some tests is slightly improved:
> ``ArraysFill.testCharFill      10  thrpt   15  170235.720 -> 178477.212  ops/ms``
> 
> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH
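
To make the layout argument above concrete, here is a minimal sketch of the worst-case distance between a code segment and the non-nmethod segment under both orderings. It is not taken from the patch: the `Segment` type, the helper names and the segment sizes (8MB non-nmethod, 116MB each for non-profiled and profiled, 240MB total) are assumptions chosen purely for illustration, and the 128MB reach is the figure quoted in the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr int64_t MB = 1024 * 1024;
// Direct-branch reach as quoted in the PR description (assumed symmetric here).
constexpr int64_t kBranchRange = 128 * MB;

// Hypothetical model of a code cache segment as a byte range [start, start + size).
struct Segment {
    int64_t start;
    int64_t size;
    int64_t end() const { return start + size; }
};

// Upper bound on the distance between any address in `a` and any address in `b`.
inline int64_t max_distance(const Segment& a, const Segment& b) {
    return std::max(a.end(), b.end()) - std::min(a.start, b.start);
}

// True if some branch from `code` into `stubs` could exceed the direct-branch reach.
inline bool may_need_far_branch(const Segment& code, const Segment& stubs) {
    return max_distance(code, stubs) > kBranchRange;
}
```

With the old order [non-nmethod, non-profiled, profiled], the profiled segment can be up to 240MB away from the stubs in this model; with the non-nmethod segment in the middle, both code segments stay within 124MB of it.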

Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:

 - fix name: is_non_nmethod, adding target_needs_far_branch func
 - change codecache segments order: nonprofiled-nonmethod-profiled
   increase far jump threshold: sizeof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7517/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7517&range=01
  Stats: 107 lines in 7 files changed: 46 ins; 38 del; 23 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7517.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7517/head:pull/7517

PR: https://git.openjdk.java.net/jdk/pull/7517

From duke at openjdk.java.net  Wed Feb 23 21:58:56 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:58:56 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
Message-ID: <4D3MbB3BO800obAYOqficpSlewTQdQW_y7oP78NQoGg=.9d3b6f0b-1b1e-405f-9bbf-64cb5a46976b@github.com>

On Wed, 23 Feb 2022 20:33:28 GMT, David Holmes <dholmes at openjdk.org> wrote:

> Do all these extra checks need to be done in product bits or would debug-only work? What kind of errors are we trying to guard against by doing this?

They currently do not affect production code, but I forgot that the `NativeCallStack` class exists, which could make use of them (especially when using the simpler API that @tstuefe correctly proposed).

The main motivation is to prevent crashes in native stack walking in cases where just calling `frame.is_safe_for_sender` would return false, but a walk is still possible (typically at the bottom of the native call stack). I currently observe these crashes while working on AsyncGetCallTrace modifications.

And to @tstuefe:

> But bottom line, I think safefetch would be a simpler and more robust approach.

Thanks for the comment. I missed that safefetch does exactly what I want (and hopefully without a large performance penalty?).
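
For readers following along, the crash in question comes from reading a frame's link slot when that slot is not valid stack memory. The following is a minimal, self-contained sketch of the bounds-check flavor of the guard, roughly the idea behind `can_access_link` in the earlier revision. Everything in it (`StackBounds`, `guarded_link`, the fallback parameter) is hypothetical and not HotSpot code. The safefetch approach differs in that it attempts the read and recovers if the access faults; that mechanism is not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a thread's stack range [low, high).
struct StackBounds {
    uintptr_t low;
    uintptr_t high;

    // True if a pointer-sized slot starting at `p` lies entirely inside the range.
    bool contains_slot(const void* p) const {
        uintptr_t a = reinterpret_cast<uintptr_t>(p);
        return a >= low && a < high && (high - a) >= sizeof(void*);
    }
};

// Returns the saved frame pointer stored at `slot`, or `fallback` when the
// slot is outside the known stack range (in which case it is never read).
inline intptr_t* guarded_link(const StackBounds& stack,
                              intptr_t* const* slot,
                              intptr_t* fallback) {
    if (!stack.contains_slot(slot)) {
        return fallback;
    }
    return *slot;
}
```

A caller would treat the fallback value as "stop walking here" rather than as a valid sender frame.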

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Wed Feb 23 21:59:06 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 21:59:06 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <RcNrjTiJgCpGazTDo-0688Vq4BRkSKPvVI-CCk5ZgKo=.a139b185-b832-4ebe-8956-164f567da4bf@github.com>
Message-ID: <35D5t24EajEpM7JwVi9Q36-admzDEGaIcwpeKDtvtgo=.fe380706-76ce-4fdd-8296-725a64632702@github.com>

On Wed, 23 Feb 2022 19:26:50 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/share/runtime/os.cpp line 1223:
> 
>> 1221:   return true; // native stack isn't walkable on windows this way.
>> 1222: #else
>> 1223:   return !fr->can_access_link(t) || os::is_first_C_frame(fr) ||
> 
> Check t for NULL.

I would assert for not NULL and ensure the caller only uses this with a non-NULL thread.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Wed Feb 23 21:59:13 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 21:59:13 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <I2TCtfMk3ftIKBvHNx7dbTs1KpRQ5dSaQweayimfsdQ=.7c215110-517b-4903-9508-4e1d1155cc5c@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
 <I2TCtfMk3ftIKBvHNx7dbTs1KpRQ5dSaQweayimfsdQ=.7c215110-517b-4903-9508-4e1d1155cc5c@github.com>
Message-ID: <M9d6hVtyftEtu32_o_ana3elAOOP4H1YqjF1qNZIsCw=.a4b6c164-4035-48f0-a98e-32da54a45894@github.com>

On Wed, 23 Feb 2022 21:35:53 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> src/hotspot/share/utilities/vmError.cpp line 338:
>> 
>>> 336:       // is_first_C_frame() does only simple checks for frame pointer,
>>> 337:       // it will pass if java compiled code has a pointer in EBP.
>>> 338:       if (os::is_first_C_frame(&fr, t)) return invalid;
>> 
>> Is the comment still accurate?
>
> I think so? But maybe removing the second line would be helpful.

But are the checks still "simple"?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 21:59:35 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 21:59:35 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v4]
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Remove old code

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7559/files
  - new: https://git.openjdk.java.net/jdk/pull/7559/files/ca295d34..b5bd5f6e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7559&range=02-03

  Stats: 16 lines in 1 file changed: 0 ins; 16 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7559/head:pull/7559

PR: https://git.openjdk.java.net/jdk/pull/7559

From dholmes at openjdk.java.net  Wed Feb 23 22:02:11 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 22:02:11 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
Message-ID: <Dg4h8NF0wGWO-AxSWAOlkFPRMR7UrtRtIM6lUVPYoGQ=.8f68f015-872c-4faa-9ab6-3ef53760c9c6@github.com>

On Wed, 23 Feb 2022 16:10:25 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve use of C macros

src/hotspot/share/runtime/os.cpp line 1227:

> 1225:          !t->is_in_full_stack((address)fr->fp()) ||
> 1226:          !t->is_in_full_stack((address)fr->sender_sp()) ||
> 1227:          !t->is_in_full_stack((address)fr->link());

Isn't this check of `fr.link()` what you already did in `can_access_link`?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Wed Feb 23 22:04:03 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Feb 2022 22:04:03 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v4]
In-Reply-To: <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
Message-ID: <P1j7uY2hySlGIryjUNIdwK2BM9P91-hc2v1eOecRWho=.4151115e-34fc-43d0-b0f2-d1b5bf7e8c2b@github.com>

On Wed, 23 Feb 2022 21:59:35 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove old code

Looks good to me (but I am biased :) )!

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Wed Feb 23 22:10:06 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 22:10:06 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <Dg4h8NF0wGWO-AxSWAOlkFPRMR7UrtRtIM6lUVPYoGQ=.8f68f015-872c-4faa-9ab6-3ef53760c9c6@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <Dg4h8NF0wGWO-AxSWAOlkFPRMR7UrtRtIM6lUVPYoGQ=.8f68f015-872c-4faa-9ab6-3ef53760c9c6@github.com>
Message-ID: <f49Opj_4IXgqZ61R2BkQjL6itsAWLRPCLszJzvDtJzA=.eac69d50-3d08-4446-90ca-753c43b33da6@github.com>

On Wed, 23 Feb 2022 21:58:59 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve use of C macros
>
> src/hotspot/share/runtime/os.cpp line 1227:
> 
>> 1225:          !t->is_in_full_stack((address)fr->fp()) ||
>> 1226:          !t->is_in_full_stack((address)fr->sender_sp()) ||
>> 1227:          !t->is_in_full_stack((address)fr->link());
> 
> Isn't this check of `fr.link()` what you already did in `can_access_link`?

You're correct.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 22:36:03 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 22:36:03 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <f49Opj_4IXgqZ61R2BkQjL6itsAWLRPCLszJzvDtJzA=.eac69d50-3d08-4446-90ca-753c43b33da6@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <Dg4h8NF0wGWO-AxSWAOlkFPRMR7UrtRtIM6lUVPYoGQ=.8f68f015-872c-4faa-9ab6-3ef53760c9c6@github.com>
 <f49Opj_4IXgqZ61R2BkQjL6itsAWLRPCLszJzvDtJzA=.eac69d50-3d08-4446-90ca-753c43b33da6@github.com>
Message-ID: <kgCDeYnM0q9i19L1Gm9dIkdhA1IJAnsONFAXpHA8n6c=.7dec94cb-732f-4de1-a8ae-1ca31624f32e@github.com>

On Wed, 23 Feb 2022 22:06:47 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> src/hotspot/share/runtime/os.cpp line 1227:
>> 
>>> 1225:          !t->is_in_full_stack((address)fr->fp()) ||
>>> 1226:          !t->is_in_full_stack((address)fr->sender_sp()) ||
>>> 1227:          !t->is_in_full_stack((address)fr->link());
>> 
>> Isn't this check of `fr.link()` what you already did in `can_access_link`?
>
> You're correct.

But as I said, I'm going to remove these checks altogether.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 22:42:04 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 22:42:04 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <M9d6hVtyftEtu32_o_ana3elAOOP4H1YqjF1qNZIsCw=.a4b6c164-4035-48f0-a98e-32da54a45894@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
 <o-fEsL3mfWsZ8l7bJdqSp5T1yvhnjKgTv0dGi2iL22I=.e3a9165a-3ede-4f70-badf-76b9b5266571@github.com>
 <I2TCtfMk3ftIKBvHNx7dbTs1KpRQ5dSaQweayimfsdQ=.7c215110-517b-4903-9508-4e1d1155cc5c@github.com>
 <M9d6hVtyftEtu32_o_ana3elAOOP4H1YqjF1qNZIsCw=.a4b6c164-4035-48f0-a98e-32da54a45894@github.com>
Message-ID: <v3Qo1_kSVH568TJvj-uYFmUojEdKdlSBc4-gYUk22zc=.fae2ef92-c83d-44a2-983b-7e4c6bac591c@github.com>

On Wed, 23 Feb 2022 21:56:49 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> I think so? But maybe removing the second line would be helpful.
>
> But are the checks still "simple"?

After the change proposed by Thomas, I think so: it still only checks the pointer values and safefetches the values of the stack pointer, ... to check whether they are valid.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 22:48:07 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Wed, 23 Feb 2022 22:48:07 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long
In-Reply-To: <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
 <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
Message-ID: <qOv--w4OsMeaSnbKCztDpy2F3SuwQNhGsnroKJ3CKOU=.5e3097d0-d13e-432b-9e09-65a01db3d445@github.com>

On Wed, 23 Feb 2022 05:43:10 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.
>
> src/hotspot/cpu/x86/x86_64.ad line 8602:
> 
>> 8600:     __ jmp(done);
>> 8601:     __ bind(neg_divisor_fastpath); 
>> 8602:     // Fastpath for divisor < 0: 
> 
> Move into a macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8633:
> 
>> 8631:     __ jmp(done);
>> 8632:     __ bind(neg_divisor_fastpath);
>> 8633:     // Fastpath for divisor < 0: 
> 
> Move into a macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8902:
> 
>> 8900:     __ subl(tmp_rax, divisor);
>> 8901:     __ andnl(tmp_rax, tmp_rax, rdx);
>> 8902:     __ sarl(tmp_rax, 31);
> 
> Please move this into a macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8932:
> 
>> 8930:     // Fastpath when divisor < 0: 
>> 8931:     // remainder = dividend - (((dividend & ~(dividend - divisor)) >> (Long.SIZE - 1)) & divisor)
>> 8932:     // See Hacker's Delight (2nd ed), section 9.3 which is implemented in java.lang.Long.remainderUnsigned()
> 
> Please move it into a macro assembly routine.

Sure, will move it to a macro assembly routine
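
As an aside, the fast path quoted in the comment above can be checked in isolation. The sketch below is a plain C++ rendering of the formula from the quoted source comment (`remainder = dividend - (((dividend & ~(dividend - divisor)) >> (Long.SIZE - 1)) & divisor)`), not the assembly from the patch. The function names are made up for illustration, and the all-ones mask is built by negating the 0/1 quotient instead of using an arithmetic shift, which is an equivalent formulation assumed here for portability.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned quotient for a divisor whose top bit is set (negative when viewed
// as a signed 64-bit value). In that case the quotient can only be 0 or 1.
inline uint64_t udiv_top_bit_divisor(uint64_t dividend, uint64_t divisor) {
    return (dividend & ~(dividend - divisor)) >> 63;
}

// Unsigned remainder for the same case: subtract the divisor once if and
// only if the quotient is 1.
inline uint64_t urem_top_bit_divisor(uint64_t dividend, uint64_t divisor) {
    uint64_t q = udiv_top_bit_divisor(dividend, divisor);
    uint64_t mask = uint64_t{0} - q;  // all ones when q == 1, zero otherwise
    return dividend - (mask & divisor);
}
```

Both helpers are only valid when the divisor's top bit is set; the general case still needs the regular division path.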

> src/hotspot/share/opto/compile.cpp line 3499:
> 
>> 3497:       Node* d = n->find_similar(Op_UDivI);
>> 3498:       if (d) {
>> 3499:         // Replace them with a fused unsigned divmod if supported
> 
> Can you explain a bit here: why can't this transformation be handled earlier?

This follows the existing approach used for signed DivMod.

> test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75:
> 
>> 73:         }
>> 74:         return quotients;
>> 75:     }
> 
> Do we need to return quotients, since it's a field being explicitly modified?

Will remove it.

> test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82:
> 
>> 80:             remainders[i] = Long.remainderUnsigned(dividends[i], divisors[i]);
>> 81:         }
>> 82:         return remainders;
> 
> Same as above

Will remove it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From duke at openjdk.java.net  Wed Feb 23 22:51:45 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 22:51:45 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v2]
In-Reply-To: <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <l_hqdCwkLFsukJu3J36ZfOvtIE8pXacIKuh7G-3ZDKQ=.8003b0ae-a95e-4e53-a46e-ef7057ff4cae@github.com>
Message-ID: <A-nceaONKTNrER2I8xAVY8MLzPu-ZF_1qDe3VYPLdog=.bc6d4359-1f86-4642-92eb-c1954b5a0dbc@github.com>

On Wed, 23 Feb 2022 16:10:25 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Improve use of C macros

The last commit rewrites it to something that might resemble Thomas' ideas.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 22:51:44 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Wed, 23 Feb 2022 22:51:44 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Use safefetch

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/4aad3ad2..5b7d6004

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=01-02

  Stats: 50 lines in 10 files changed: 3 ins; 36 del; 11 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Wed Feb 23 22:55:11 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Wed, 23 Feb 2022 22:55:11 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long
In-Reply-To: <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
 <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
Message-ID: <ks-w0w2zABPbboQ-ZS2cr4u0MV2D8Z_ldc5TMnXWNFI=.93d83301-66fc-416c-b5c3-41852b0fcbe7@github.com>

On Wed, 23 Feb 2022 05:52:00 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.
>
> test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76:
> 
>> 74:         return quotients;
>> 75:     }
>> 76: 
> 
> Return seems redundant here.

Will remove it.

> test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83:
> 
>> 81:         }
>> 82:         return remainders;
>> 83:     }
> 
> Return seems redundant here.

Will remove it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From duke at openjdk.java.net  Wed Feb 23 23:11:03 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Wed, 23 Feb 2022 23:11:03 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long
In-Reply-To: <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
 <oaATLVe4meOXI97Pb3XQn5SehxJaRvS7td7-bMpVT3U=.751ccae1-b992-4d8f-a055-5df021010b08@github.com>
Message-ID: <giWisIfFyQkddFi_9mTGq-iBaIpNAZ2hxhODNW7J0eg=.0e8f46a5-896b-4bb8-9ac7-256a2124aa56@github.com>

On Wed, 23 Feb 2022 05:46:45 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.
>
> src/hotspot/share/opto/divnode.cpp line 1350:
> 
>> 1348:     return NULL;
>> 1349:   }
>> 1350: 
> 
> Please remove Value and Ideal routines if no explicit transforms are being done.

Will remove the unused transformations.

> src/hotspot/share/opto/divnode.cpp line 1362:
> 
>> 1360:   }
>> 1361: 
>> 1362: //=============================================================================
> 
> You can remove the Ideal routine if no transformation is being done.

Will remove the unused transformation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From duke at openjdk.java.net  Wed Feb 23 23:18:56 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Wed, 23 Feb 2022 23:18:56 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long [v3]
In-Reply-To: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
Message-ID: <XDqulKWM-u31jIMiaSbGOB1npSt8gZtks48D3KjcuqU=.7886cd75-0548-43ff-b3d9-b83bfcad8e0b@github.com>

> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  Fix line at end of file

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/7fc18af3..13549290

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572

From duke at openjdk.java.net  Wed Feb 23 23:15:53 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Wed, 23 Feb 2022 23:15:53 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long [v2]
In-Reply-To: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
Message-ID: <UeDaUp0vHrvBKF-Ob_w1XgGFcdOfY89MaHQJSG7zKrE=.21c2d6af-e1c1-4a80-af67-8d892565549b@github.com>

> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  Move intrinsic code to macro assembly routines; remove unused transformations for div and mod nodes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/fa57175a..7fc18af3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=00-01

  Stats: 326 lines in 7 files changed: 137 ins; 176 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572

From sviswanathan at openjdk.java.net  Thu Feb 24 00:47:06 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 24 Feb 2022 00:47:06 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
Message-ID: <kaC15h_DGeCKO-Qni0hfmY_FJ696AfbHdP6rBBdjovA=.b93791a1-74f5-4ca3-abd1-36db73129f4a@github.com>

On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> The following are the performance numbers from a JMH microbenchmark included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | --
>> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
>> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
>> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
>> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Review comments resolved.

Also curious: how does the performance look with all these changes?

src/hotspot/cpu/x86/assembler_x86.hpp line 2254:

> 2252:   void vroundps(XMMRegister dst, XMMRegister src, int32_t rmode, int vector_len);
> 2253:   void vrndscaleps(XMMRegister dst,  XMMRegister src,  int32_t rmode, int vector_len);
> 2254: 

These instructions are not used anymore and can be removed.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4116:

> 4114:                                             KRegister ktmp1, KRegister ktmp2, AddressLiteral double_sign_flip,
> 4115:                                             Register scratch, int vec_enc) {
> 4116:   evcvttpd2qq(dst, src, vec_enc);

On overflow, the vcvttpd2qq instruction sets the result to 2^w -1, where w is 64, whereas the special-case handling expects 0x80000.....

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4145:

> 4143:   evpbroadcastq(xtmp1, scratch, vec_enc);
> 4144:   vaddpd(xtmp1, src , xtmp1, vec_enc);
> 4145:   evcvtpd2qq(dst, xtmp1, vec_enc);

On overflow, the vcvtpd2qq instruction also sets the result to 2^w -1, where w is 64, whereas the special-case handling expects 0x80000.....

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4176:

> 4174:   vpbroadcastd(xtmp1, xtmp1, vec_enc);
> 4175:   vaddps(xtmp1, src , xtmp1, vec_enc);
> 4176:   vcvtps2dq(dst, xtmp1, vec_enc);

The vcvtps2dq instruction returns 0x7FFFFFFF in case of overflow, whereas the special-case handling expects 0x80000000 in case of overflow. The same question applies to the corresponding vector_round_float_avx() implementation as well.
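
To make the concern in these comments concrete, here is a hedged scalar model in Java of a per-lane fix-up after a float-to-int conversion. It is a sketch, not the code under review: `RoundFixupSketch`, `fixup`, and the `sentinel` parameter are hypothetical, and the sketch deliberately makes no claim about which value the hardware actually produces on overflow. It only shows that the fix-up has to compare against whatever out-of-range value the conversion really returns, and then map each case to the result `Math.round(float)` specifies.

```java
final class RoundFixupSketch {
    /**
     * Hypothetical fix-up for one lane.
     *
     * @param src      the original float input
     * @param raw      the integer the conversion produced for this lane
     * @param sentinel the value the conversion is ASSUMED to produce for
     *                 NaN and out-of-range inputs
     */
    static int fixup(float src, int raw, int sentinel) {
        if (raw != sentinel) {
            return raw;                       // ordinary in-range result
        }
        if (Float.isNaN(src)) {
            return 0;                         // Math.round(NaN) is 0
        }
        if (src >= 0x1.0p31f) {
            return Integer.MAX_VALUE;         // positive overflow saturates
        }
        if (src <= -0x1.0p31f) {
            return Integer.MIN_VALUE;         // negative overflow saturates
        }
        return raw;                           // in-range value equal to the sentinel
    }

    public static void main(String[] args) {
        int sentinel = 0x80000000;            // assumed only for this demo
        float[] inputs = {
            Float.NaN, 3.0e9f, -3.0e9f,
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY
        };
        for (float f : inputs) {
            int fixed = fixup(f, sentinel, sentinel);
            System.out.println(f + " -> " + fixed + " (Math.round: " + Math.round(f) + ")");
        }
    }
}
```

If the comparison value in the fix-up does not match what the conversion returns, the first branch is taken for overflowing lanes and the saturation logic is never reached, which is the mismatch the comments ask about.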

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From sviswanathan at openjdk.java.net  Thu Feb 24 01:47:07 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 24 Feb 2022 01:47:07 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
Message-ID: <ZEUMbHG8f-JNJyvA-VfcVu-cp3iYU7XeqECf_uAeNxQ=.94d44361-b2bd-45e3-9505-379645c559a8@github.com>

On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> The following are the performance numbers from a JMH microbenchmark included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | --
>> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
>> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
>> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
>> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Review comments resolved.

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8984:

> 8982: }
> 8983: 
> 8984: void MacroAssembler::round_double(Register dst, XMMRegister src, Register rtmp, Register rcx) {

Is it possible to implement this using a similar mxcsr change? In any case, comments would help in reviewing the round_double and round_float code.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From sviswanathan at openjdk.java.net  Thu Feb 24 02:00:05 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 24 Feb 2022 02:00:05 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
Message-ID: <UwXGPnwms2ydicsnvxdiH8iIhIpEiXn-PEZxA80G5Hs=.48973d74-17e9-46db-b895-039b1c5eac80@github.com>

On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> The following are the performance numbers from a JMH microbenchmark included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | --
>> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
>> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
>> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
>> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Review comments resolved.

test/hotspot/jtreg/compiler/c2/cr6340864/TestDoubleVect.java line 441:

> 439:       errn += verify("test_round: ", 1, l0[1], Long.MAX_VALUE);
> 440:       errn += verify("test_round: ", 2, l0[2], Long.MIN_VALUE);
> 441:       errn += verify("test_round: ", 3, l0[3], Long.MAX_VALUE);

Good to add additional test cases:
  Case with a1 value >= Long Max and < infinity. 
  Case with a1 value <= Long Min and > -infinity.
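
As a rough illustration of the suggested cases, the Java sketch below checks `Math.round(double)` against the saturating results its specification requires for NaN, the infinities, and finite values at or beyond the `long` range. The class name, the input list, and the `expected` helper are assumptions made for this sketch; the counter name `errn` is borrowed from the quoted test only for familiarity.

```java
final class RoundDoubleEdgeCases {
    // Inputs outside the long range, plus NaN. 0x1.0p63 is 2^63, the
    // smallest double that is >= Long.MAX_VALUE.
    static final double[] INPUTS = {
        Double.NaN,
        Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY,
        0x1.0p63, 1.0e19, Double.MAX_VALUE,      // >= Long.MAX_VALUE, finite
        -0x1.0p63, -1.0e19, -Double.MAX_VALUE    // <= Long.MIN_VALUE, finite
    };

    /** Expected Math.round result for the special inputs listed above. */
    static long expected(double a) {
        if (Double.isNaN(a)) {
            return 0L;
        }
        if (a >= 0x1.0p63) {
            return Long.MAX_VALUE;
        }
        if (a <= -0x1.0p63) {
            return Long.MIN_VALUE;
        }
        throw new IllegalArgumentException("in-range input: " + a);
    }

    /** Counts inputs for which Math.round disagrees with the expectation. */
    static int countMismatches() {
        int errn = 0;
        for (double a : INPUTS) {
            if (Math.round(a) != expected(a)) {
                errn++;
            }
        }
        return errn;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

The finite out-of-range values are the interesting ones here, because they exercise the overflow path without going through the infinity handling.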

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Thu Feb 24 02:43:46 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Thu, 24 Feb 2022 02:43:46 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
In-Reply-To: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
Message-ID: <cwKV4mTg3NBMUUHxowae5spxx7LfTuURx1oVKNTi4RU=.946ab01d-e0e9-46da-bf4e-33b3fdbda5ec@github.com>

> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:

  fix 32bit build issues

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/13549290..2915b2e7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7572&range=02-03

  Stats: 91 lines in 2 files changed: 49 ins; 42 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572

From dholmes at openjdk.java.net  Thu Feb 24 02:45:07 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 24 Feb 2022 02:45:07 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
Message-ID: <CVFSh3Yu3uzh3a-ZeHm86SzgW5yVHgapeSfnwpBtOjs=.2682ecc2-7972-43d7-ae9f-97c0543cdc08@github.com>

On Wed, 23 Feb 2022 22:51:44 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use safefetch

src/hotspot/share/runtime/os.cpp line 1192:

> 1190: 
> 1191:   uintptr_t usp    = (uintptr_t)fr->sp();
> 1192:   if ((usp & sp_align_mask) != 0 || SafeFetchN(fr->sp(), 0) == 0) return true;

This doesn't quite make sense to me. If the SafeFetchN were to fail, then the load in the previous line would already have crashed, wouldn't it?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Thu Feb 24 03:53:05 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 24 Feb 2022 03:53:05 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <CVFSh3Yu3uzh3a-ZeHm86SzgW5yVHgapeSfnwpBtOjs=.2682ecc2-7972-43d7-ae9f-97c0543cdc08@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
 <CVFSh3Yu3uzh3a-ZeHm86SzgW5yVHgapeSfnwpBtOjs=.2682ecc2-7972-43d7-ae9f-97c0543cdc08@github.com>
Message-ID: <sJYbd80BTW0FkQKtE6ybhIfwuS6XjC0_df7LKF_tcpc=.302a3c4f-92cf-4f59-b457-796c4361a445@github.com>

On Thu, 24 Feb 2022 02:41:25 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Use safefetch
>
> src/hotspot/share/runtime/os.cpp line 1192:
> 
>> 1190: 
>> 1191:   uintptr_t usp    = (uintptr_t)fr->sp();
>> 1192:   if ((usp & sp_align_mask) != 0 || SafeFetchN(fr->sp(), 0) == 0) return true;
> 
> This doesn't quite make sense to me. If the SafeFetchN were to fail, then the load in the previous line would already have crashed, wouldn't it?

Sorry, ignore that. The SafeFetch loads `*fr->sp()`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Thu Feb 24 03:53:04 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 24 Feb 2022 03:53:04 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
Message-ID: <BlUkqJDfaENz5kWhp1Da4DCalof0Tg7TmjaPhV5nZuU=.5794d13b-3b85-4d6f-9937-a74ee599ef07@github.com>

On Wed, 23 Feb 2022 22:51:44 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use safefetch

This approach looks much better/cleaner - thanks.

Do we have any crash tests we can use to verify this?

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From stuefe at openjdk.java.net  Thu Feb 24 06:21:08 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 24 Feb 2022 06:21:08 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
Message-ID: <N2QOQXJ4o_MmjGPA1Y_zkKW4ezxG_p2RRFor24Xib0E=.752f64a0-0cd3-49ef-91fc-1b5ffd62963f@github.com>

On Wed, 23 Feb 2022 22:51:44 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use safefetch

Hi Johannes,

thanks for taking my suggestion. This is better, and helps beyond your AsyncGetCallTrace scenario (e.g. in NMT).

SafeFetch works as an unconditional subroutine call to a prologue-free piece of code that does a single load. Basically:


(1) jump <safefetch pc> -> (2) load from questionable address -> (3) return


and the signal handler knows how to handle things if a segfault happens at (2).

So, for the standard case, if no fault happens, you pay for a subroutine call and a load. This is as cheap as it gets, but still not as cheap as a single inline load would be.

---

Still, I'm not sure I would add this to such a low-level function as frame::link(), at least not without analyzing the callers. Most of the callers of frame::link don't seem to be so performance-sensitive that a subroutine call would throw them off. But I'm not sure here.

Moreover, even though your solution is beautifully simple, I don't like "lying" at this level. There may be cases where we rather have an honest crash when dereferencing an invalid frame, because we may want to analyze the root cause.

What I actually had in mind - sorry I was not too clear in my first review - was to use SafeFetch inside is_first_C_frame to check the validity of the link before dereferencing it. `is_first_C_frame()` is not super performance-critical, so it should be fine to use SafeFetch here.

Note that we have `os::is_readable_pointer()` which encapsulates SafeFetch for checking pointer validity. So I imagine something like this:


bool frame::link_is_valid() {
	return os::is_readable_pointer(link);
}

...

bool os::is_first_C_frame(frame* fr) {
...
  // If the link address is invalid we are not walkable beyond this point.
  if (!fr->link_is_valid()) return true;
}
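
The control flow of that suggestion can be modelled outside the VM. The Java sketch below is a toy simulation only, under stated assumptions: the "address space" is a map, `isReadable` stands in for `os::is_readable_pointer()`, `isFirstFrame` is a much simplified stand-in for `os::is_first_C_frame()`, words are assumed to be 8 bytes, and the stack is assumed to grow downwards. None of these names or checks are taken from the HotSpot sources.

```java
import java.util.HashMap;
import java.util.Map;

final class FrameWalkModel {
    static final long WORD_MASK = 8 - 1;   // assumes 8-byte words

    // Simulated memory: address -> stored word. A missing key models an
    // unmapped address that would fault if dereferenced directly.
    private final Map<Long, Long> memory = new HashMap<>();

    void store(long address, long value) {
        memory.put(address, value);
    }

    /** Stand-in for a safe readability probe; never "faults". */
    boolean isReadable(long address) {
        return memory.containsKey(address);
    }

    /** True if the walk must stop at the frame whose link slot is at fp. */
    boolean isFirstFrame(long fp) {
        if ((fp & WORD_MASK) != 0) {
            return true;                   // misaligned frame pointer
        }
        if (!isReadable(fp)) {
            return true;                   // link slot unreadable: stop before dereferencing
        }
        long senderFp = memory.get(fp);
        // A null link, or a sender that is not above us, ends the walk.
        return senderFp == 0 || senderFp <= fp;
    }

    /** Counts frames reachable from fp, visiting at most limit frames. */
    int countFrames(long fp, int limit) {
        int frames = 1;
        while (frames < limit && !isFirstFrame(fp)) {
            fp = memory.get(fp);           // safe: isFirstFrame checked readability
            frames++;
        }
        return frames;
    }

    public static void main(String[] args) {
        FrameWalkModel m = new FrameWalkModel();
        m.store(0x1000, 0x1040);           // frame 1 links to frame 2
        m.store(0x1040, 0x1080);           // frame 2 links to an unmapped address
        System.out.println(m.countFrames(0x1000, 100));
    }
}
```

The point of the model is the ordering: the readability probe happens inside the "is this the first frame" check, so the walker never dereferences a link it has not validated, while the low-level link accessor itself stays unchanged.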


@dholmes-ora : the motivation is to harden a piece of code which may run in unsafe situations in production scenarios. Examples: AsyncGetCallTrace, stack printing in error reports, stack printing in NMT... Error handling has its secondary crash guards, but the other scenarios are "naked". And we have downstream additional facilities which use VM stack printing.

About a test, I agree, that would be nice. But one would have to "fake" an invalid stack. Maybe a new error reporting test where one deliberately overwrites portions of the stack and then tries to print the stack. However, I imagine things could be brittle, because the OS may catch a stack overwrite first. It's not totally trivial, maybe something for a separate RFE?

Cheers, Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Thu Feb 24 07:25:07 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Thu, 24 Feb 2022 07:25:07 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v3]
In-Reply-To: <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <sQCXVrPQNUEUt-Ejor3kxiR1ItnWAAbPjnpSb1o2QdI=.cefd9552-31e3-4d15-b7a6-23ee5861ae51@github.com>
Message-ID: <kWY9irIjDd7RXlXwSpByaEuYJ8baDVa2jVhoAQLAQEU=.92dd1f20-dff7-42c9-95ea-63efe0ed4015@github.com>

On Wed, 23 Feb 2022 22:51:44 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use safefetch

Now I understand.

A simple test would be to allocate an area of zeroes and then create a frame for it. The proposed changes should prevent it from crashing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From mdoerr at openjdk.java.net  Thu Feb 24 09:32:12 2022
From: mdoerr at openjdk.java.net (Martin Doerr)
Date: Thu, 24 Feb 2022 09:32:12 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v4]
In-Reply-To: <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
Message-ID: <mJj5cEsootzeJh1-k-L2FZZsr0oJ407Dzsv7vyxXXb4=.6f51f73c-e246-4667-ad6c-bed4f5eca00a@github.com>

On Wed, 23 Feb 2022 21:59:35 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove old code

LGTM.

src/hotspot/share/prims/forte.cpp line 565:

> 563: JNIEXPORT
> 564: void AsyncGetCallTrace(ASGCT_CallTrace *trace, jint depth, void* ucontext) {
> 565: 

Feel free to remove the extra newline.

-------------

Marked as reviewed by mdoerr (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7559

From kevinw at openjdk.java.net  Thu Feb 24 10:10:06 2022
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Thu, 24 Feb 2022 10:10:06 GMT
Subject: RFR: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422 [v4]
In-Reply-To: <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
 <tu9sdh3KHXtiSQyEOHHD68oQnzM7HrjYOoX_--v82hs=.5d0817ec-2221-4229-b801-528291e52c30@github.com>
Message-ID: <sgL_NjrOxdxnQCr8DY1cEtuayuj5UZQpP0Wb4wYIbfk=.5dab802f-3d30-4dee-8210-6ff0bdfd98ca@github.com>

On Wed, 23 Feb 2022 21:59:35 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove old code

Marked as reviewed by kevinw (Committer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Thu Feb 24 10:55:28 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 24 Feb 2022 10:55:28 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
Message-ID: <0Ob4kezo_Q0ro0eF_OeEABrzYeZCNmoaD5KQUcBpZRc=.6c772f45-c70c-4983-880a-8878e281d04b@github.com>

On Tue, 22 Feb 2022 14:35:19 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
> 
>  - Merge master
>  - Merge master
>  - Merge master
>  - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set
>  - Merge enter_subframe into enter
>  - Review fixups
>  - Documentation updates
>  - Update copyrights to 2022
>  - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31

Any more comments? Otherwise, I'll integrate later.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From dholmes at openjdk.java.net  Thu Feb 24 11:48:18 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 24 Feb 2022 11:48:18 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
Message-ID: <QOs2NrTPCDyqrYnfUPE8_RpiZ8lNoO3-6WrVq1CnAJk=.0352888c-66d9-4b29-8bef-7561a7b8e52f@github.com>

On Wed, 23 Feb 2022 05:38:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> Replace the common "atomic" switch+loop code chunks in the pd code with a shared version that uses Atomic::load/store.
>> 
>> See details in the bug report that show how current code is actually replaced by `memcpy` (in some places at least) whereas the new code is not.
>> 
>> Platforms affected:
>>  - all x86
>>  - Zero
>>  - Windows Aarch64
>>  - PPC
>> 
>> Testing: tiers 1-3
>> Additional builds: tiers 4 and 5
>>  - builds covered: x86 and Zero
>> 
>> GHA
>> - builds covered:  Windows-Aarch64
>> 
>> The only build affected and not tested is PPC. It would be great if someone could take this for a spin on PPC.
>> 
>> For platforms not affected by this change, i.e. those that already specialise the code, I make no claims regarding the atomicity or otherwise of those specialized versions. That would be for someone interested in those specific platforms to check out.
>> 
>> Thanks,
>> David
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove underscore from name

I ran some GC benchmarks which turned out to be just specjbb2005 and specjvm2008-*.

There were two regressions flagged:

Linux-x64: SPECjvm2008-LU.large-ZGC  -5.82%
macos-x64: SPECjvm2008-Serial-ParGC  -4.16%

However, Erik thinks these are just noise as apparently ZGC doesn't use these atomic copy routines, nor does he think ParGC does either.

Thoughts?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From coleenp at openjdk.java.net  Thu Feb 24 12:56:43 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 24 Feb 2022 12:56:43 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only
Message-ID: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>

Whenever I'm debugging, I really wish I knew the name of the method that I'm looking at, so I added this field in non-product builds.
Tested with tier1 on Oracle platforms.

-------------

Commit messages:
 - 8282240: Add _name field to Method for NOT_PRODUCT only

Changes: https://git.openjdk.java.net/jdk/pull/7608/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282240
  Stats: 14 lines in 4 files changed: 10 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7608.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7608/head:pull/7608

PR: https://git.openjdk.java.net/jdk/pull/7608

From jbhateja at openjdk.java.net  Thu Feb 24 13:01:59 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 24 Feb 2022 13:01:59 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <ZEUMbHG8f-JNJyvA-VfcVu-cp3iYU7XeqECf_uAeNxQ=.94d44361-b2bd-45e3-9505-379645c559a8@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
 <ZEUMbHG8f-JNJyvA-VfcVu-cp3iYU7XeqECf_uAeNxQ=.94d44361-b2bd-45e3-9505-379645c559a8@github.com>
Message-ID: <hNjOWpxq-E6ikq7IZy9sYZXjrOniQcuKP1Hs4N4N1I0=.832230ca-7ca7-4540-84bc-b5e0e62838d9@github.com>

On Thu, 24 Feb 2022 01:43:27 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8279508: Review comments resolved.
>
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8984:
> 
>> 8982: }
>> 8983: 
>> 8984: void MacroAssembler::round_double(Register dst, XMMRegister src, Register rtmp, Register rcx) {
> 
> Is it possible to implement this using a similar mxcsr change? In any case, comments would help in reviewing the round_double and round_float code.

LDMXCSR has multi-cycle latency, and it would degrade the performance of the scalar operation's fast path.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Thu Feb 24 13:01:58 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 24 Feb 2022 13:01:58 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v8]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <XcbHyyMss1tlSLZC6uai_VCv0oaWux-uCRtm0eCZZRU=.12733ed3-02eb-4384-b13b-fbb474e099ac@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH microbenchmark included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | --
> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Review comments resolved.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/6c869c76..f7dec3d9

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=06-07

  Stats: 35 lines in 5 files changed: 8 ins; 22 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From coleenp at openjdk.java.net  Thu Feb 24 14:03:49 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 24 Feb 2022 14:03:49 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v2]
In-Reply-To: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
Message-ID: <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>

> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
> Tested with tier1 on Oracle platforms.

Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:

 - Enhance comment to say why name needs to be set later.
 - 8282240: Add _name field to Method for NOT_PRODUCT only
 - Merge branch 'master' into method-name
 - Enhance comment to say why name needs to be set later.
 - 8282240: Add _name field to Method for NOT_PRODUCT only

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7608/files
  - new: https://git.openjdk.java.net/jdk/pull/7608/files/ea440441..ab762ed7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=00-01

  Stats: 168915 lines in 3622 files changed: 116842 ins; 28264 del; 23809 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7608.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7608/head:pull/7608

PR: https://git.openjdk.java.net/jdk/pull/7608

From jbhateja at openjdk.java.net  Thu Feb 24 14:18:13 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 24 Feb 2022 14:18:13 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
In-Reply-To: <cwKV4mTg3NBMUUHxowae5spxx7LfTuURx1oVKNTi4RU=.946ab01d-e0e9-46da-bf4e-33b3fdbda5ec@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
 <cwKV4mTg3NBMUUHxowae5spxx7LfTuURx1oVKNTi4RU=.946ab01d-e0e9-46da-bf4e-33b3fdbda5ec@github.com>
Message-ID: <oCd9O4Ecr3MnFogrUlo-lMw-MUHNTfwJqdUIV2fxXNc=.e3ec0775-0264-4cea-b1a4-657be815b390@github.com>

On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in java.lang.Integer and java.lang.Long classes using x86 intrinsics. This change shows 3x improvement for Integer methods and up to 25% improvement for Long. This change also implements the DivMod optimization which fuses division and modulus operations if needed. The DivMod optimization shows 3x improvement for Integer and ~65% improvement for Long.
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix 32bit build issues

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408:

> 4406:   jmp(done);
> 4407:   bind(neg_divisor_fastpath);
> 4408:   // Fastpath for divisor < 0:

How about checking in the Identity routine whether the divisor is a positive or negative constant (with a non-constant dividend) and setting a flag on the IR node? A new instruction selection pattern could then use that flag to emit only the fast path or only the slow path, which would avoid emitting redundant instructions.
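
To spell out why a negative divisor allows a fast path: a divisor with the sign bit set is at least 2^31 when read as unsigned, so the unsigned quotient can only be 0 or 1. A small Java sketch of that reasoning, checked against `Integer.divideUnsigned`; the class and helper names are made up for illustration and do not mirror the assembler code in the patch.

```java
public class UnsignedDivFastPath {
    // Assumes divisor < 0 as a signed int, i.e. divisor >= 2^31 as unsigned.
    // The quotient is then 1 if dividend >= divisor (unsigned), else 0.
    static int divideUnsignedNegDivisor(int dividend, int divisor) {
        if (divisor >= 0) {
            throw new IllegalArgumentException("fast path requires divisor < 0");
        }
        return Integer.compareUnsigned(dividend, divisor) >= 0 ? 1 : 0;
    }

    // Remainder follows from the quotient; int arithmetic wraps modulo 2^32,
    // which is exactly what unsigned arithmetic needs here.
    static int remainderUnsignedNegDivisor(int dividend, int divisor) {
        return dividend - divideUnsignedNegDivisor(dividend, divisor) * divisor;
    }

    public static void main(String[] args) {
        int[] dividends = {0, 5, -1, -2, Integer.MIN_VALUE};
        int divisor = -2; // 0xFFFFFFFE as unsigned
        for (int d : dividends) {
            System.out.println(Integer.toUnsignedString(d) + " / "
                    + Integer.toUnsignedString(divisor) + " = "
                    + divideUnsignedNegDivisor(d, divisor)
                    + " (library: " + Integer.divideUnsigned(d, divisor) + ")");
        }
    }
}
```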

src/hotspot/share/opto/divnode.cpp line 881:

> 879:   return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : this;
> 880: }
> 881: //------------------------------Value------------------------------------------

An Ideal transform that replaces the unsigned divide with a cheaper logical right shift when the divisor is a power of two would be useful.
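
The identity behind that suggestion, sketched in Java rather than as C2 node code: for a power-of-two divisor 2^k, the unsigned quotient is `x >>> k` and the unsigned remainder is `x & (2^k - 1)`. The class and method names below are hypothetical and only serve to check the arithmetic.

```java
public class UnsignedDivByPow2 {
    // True for 1, 2, 4, ..., and also for Integer.MIN_VALUE,
    // which is 2^31 when interpreted as unsigned.
    static boolean isPowerOfTwo(int d) {
        return d != 0 && (d & (d - 1)) == 0;
    }

    // Unsigned divide by a power of two as a logical right shift.
    static int divideUnsignedPow2(int dividend, int divisor) {
        if (!isPowerOfTwo(divisor)) {
            throw new IllegalArgumentException("divisor must be a power of two");
        }
        return dividend >>> Integer.numberOfTrailingZeros(divisor);
    }

    // Unsigned remainder by a power of two as a bit mask.
    static int remainderUnsignedPow2(int dividend, int divisor) {
        if (!isPowerOfTwo(divisor)) {
            throw new IllegalArgumentException("divisor must be a power of two");
        }
        return dividend & (divisor - 1);
    }

    public static void main(String[] args) {
        int x = -1; // 4294967295 as unsigned
        System.out.println(divideUnsignedPow2(x, 8) == Integer.divideUnsigned(x, 8));
        System.out.println(remainderUnsignedPow2(x, 8) == Integer.remainderUnsigned(x, 8));
    }
}
```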

src/hotspot/share/opto/divnode.cpp line 897:

> 895: 
> 896:   // Either input is BOTTOM ==> the result is the local BOTTOM
> 897:   const Type *bot = bottom_type();

Can we add constant folding for the case where both the dividend and the divisor are constants?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From duke at openjdk.java.net  Thu Feb 24 14:26:39 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Thu, 24 Feb 2022 14:26:39 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v4]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <dDEOZ7T0lpcfDfFNsT9IfYhKl_JjdH2Gq-OrgBnp98w=.ce36e85d-fb73-4859-bc5f-d9bf0c86e6ce@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Introduce frame::link_or_null()

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/5b7d6004..1cc247d7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=02-03

  Stats: 33 lines in 8 files changed: 24 ins; 2 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Thu Feb 24 14:36:05 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Thu, 24 Feb 2022 14:36:05 GMT
Subject: Integrated: 8282200: ShouldNotReachHere() reached by AsyncGetCallTrace
 after JDK-8280422
In-Reply-To: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
References: <EwbQxz2VZzqYN-hiaB_q2_LDesVdWHV0hTHbG3ss3RQ=.57c74dcc-7b38-44f0-932f-e18df7e30066@github.com>
Message-ID: <FSXsrehvVGN1mBNaYrfpWaGOO94v8U-bvbjgtpMgJ3s=.1a3babc4-85f5-431e-b959-adf3389c0aaa@github.com>

On Mon, 21 Feb 2022 14:43:27 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

> Fixes the mentioned bug by replacing the check in AsyncGetCallTrace using the newly introduced method `JavaThread::thread_from_jni_environment`.

This pull request has now been integrated.

Changeset: 231e48fa
Author:    Johannes Bechberger <johannes.bechberger at sap.com>
Committer: Martin Doerr <mdoerr at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/231e48fa63aeb4e35c7c948f958695d62b7157ce
Stats:     9 lines in 2 files changed: 3 ins; 3 del; 3 mod

8282200: ShouldNotReachHere() reached by AsyncGetCallTrace after JDK-8280422

Reviewed-by: dholmes, mdoerr, kevinw

-------------

PR: https://git.openjdk.java.net/jdk/pull/7559

From duke at openjdk.java.net  Thu Feb 24 14:43:58 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Thu, 24 Feb 2022 14:43:58 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v4]
In-Reply-To: <dDEOZ7T0lpcfDfFNsT9IfYhKl_JjdH2Gq-OrgBnp98w=.ce36e85d-fb73-4859-bc5f-d9bf0c86e6ce@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <dDEOZ7T0lpcfDfFNsT9IfYhKl_JjdH2Gq-OrgBnp98w=.ce36e85d-fb73-4859-bc5f-d9bf0c86e6ce@github.com>
Message-ID: <Fj-CPcIlDNdNRdmYhojHru3TH0FHYyD54zwPlhBrS5Q=.33fc0b8c-e020-4878-bfa9-2d6be3f5d210@github.com>

On Thu, 24 Feb 2022 14:26:39 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Introduce frame::link_or_null()

I changed it again, introducing "frame::link_or_null()" that is the safe version of "frame::link()".

> About a test, I agree, that would be nice. But one would have to "fake" an invalid stack. Maybe a new error reporting test where one deliberately overwrites portions of the stack and then tries to print the stack. However, I imagine things could be brittle, because the OS may catch a stack overwrite first. It's not totally trivial, maybe something for a separate RFE?

I think tests would be nice but also quite difficult. A simple test would be to allocate a frame with zero values for all entries and check that `os::is_first_C_frame` returns true and that `frame::link_or_null()` also returns null. Then do the same with a good frame (pointing to sensible values).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Thu Feb 24 14:50:40 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Thu, 24 Feb 2022 14:50:40 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v5]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <Noi_goKf6YORayC3CfGGLxH0Y8Bz0_wKPn2n0D7fzRY=.1225c825-cfc1-45fc-a767-6ff4d9d54645@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix compile warnings

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/1cc247d7..e91bfeef

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=03-04

  Stats: 7 lines in 4 files changed: 0 ins; 0 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From jbhateja at openjdk.java.net  Thu Feb 24 14:56:03 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Thu, 24 Feb 2022 14:56:03 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v7]
In-Reply-To: <kaC15h_DGeCKO-Qni0hfmY_FJ696AfbHdP6rBBdjovA=.b93791a1-74f5-4ca3-abd1-36db73129f4a@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <cywj8L_sEGLoeEKWVJLRNKG6BqzkOhHVfEiAW2jGpaE=.553ac6dd-13dc-4940-810f-587ad0baa1d6@github.com>
 <kaC15h_DGeCKO-Qni0hfmY_FJ696AfbHdP6rBBdjovA=.b93791a1-74f5-4ca3-abd1-36db73129f4a@github.com>
Message-ID: <Hz9QErmwNa0W0a8YsHnAqK1_VkZc6DCC1GG5J13bf8I=.7e2db892-4713-46c6-9c2a-5a82507e82b1@github.com>

On Thu, 24 Feb 2022 00:43:27 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> Also curious, how does the performance look with all these changes.

Updated new perf numbers.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From stuefe at openjdk.java.net  Thu Feb 24 16:38:10 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Thu, 24 Feb 2022 16:38:10 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v5]
In-Reply-To: <Noi_goKf6YORayC3CfGGLxH0Y8Bz0_wKPn2n0D7fzRY=.1225c825-cfc1-45fc-a767-6ff4d9d54645@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <Noi_goKf6YORayC3CfGGLxH0Y8Bz0_wKPn2n0D7fzRY=.1225c825-cfc1-45fc-a767-6ff4d9d54645@github.com>
Message-ID: <uYPmkOZ-24DkWAnhfEnS6K8-rGkM2vIBlOpqhx5__98=.85b0314f-92a2-49ae-a9eb-ab3a6ca4fde5@github.com>

On Thu, 24 Feb 2022 14:50:40 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix compile warnings

Looks almost good now. Small remarks remain.

src/hotspot/share/runtime/os.cpp line 1178:

> 1176: 
> 1177: // Looks like all platforms can use the same function to check if C
> 1178: // stack is walkable beyond current frame.

This comment is somewhat weird and it - and the one at the prototype in os.hpp - could do with some massaging. But it's fine to do this in a different RFE.

src/hotspot/share/runtime/os.cpp line 1193:

> 1191: 
> 1192:   uintptr_t usp    = (uintptr_t)fr->sp();
> 1193:   if ((usp & sp_align_mask) != 0 || SafeFetchN(fr->sp(), (intptr_t)0) == 0) return true;

I'd use os::is_readable_ptr instead for easier readability.

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Thu Feb 24 16:43:14 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Thu, 24 Feb 2022 16:43:14 GMT
Subject: Integrated: 8277204: Implement PAC-RET branch protection on
 Linux/AArch64
In-Reply-To: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
Message-ID: <XOJax9CYOxkQqo_L7t5Fm-95bDAKfOm2g4iqBHCXV1Y=.869e892e-055f-43bf-84c1-e6e41814941b@github.com>

On Wed, 10 Nov 2021 12:32:53 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
> of its uses is to protect against ROP based attacks. This is done by
> signing the Link Register whenever it is stored on the stack, and
> authenticating the value when it is loaded back from the stack. If an
> attacker were to try to change control flow by editing the stack then
> the authentication check of the Link Register will fail, causing a
> segfault when the function returns.
> 
> On a system with PAC enabled, it is expected that all applications will
> be compiled with ROP protection. Fedora 33 and upwards already provide
> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
> 
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
> 
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
> 
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
> 
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs and
> entry points are stored in memory as signed pointers. These changes are
> simple to make (they can be reduced to a type change in common code and
> a few additional sign/auth calls in the backend), but there are a lot of them
> and the total code change is fairly large. I'm happy to provide a few
> work in progress patches.
> 
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

This pull request has now been integrated.

Changeset: 6fab8a2d
Author:    Alan Hayward <alan.hayward at arm.com>
Committer: Andrew Dinn <adinn at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/6fab8a2d6a97dbd2ffceca275716d020cb9f1eea
Stats:     1481 lines in 35 files changed: 574 ins; 32 del; 875 mod

8277204: Implement PAC-RET branch protection on Linux/AArch64

Reviewed-by: erikj, ihse, adinn, ngasson

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From adinn at openjdk.java.net  Thu Feb 24 16:40:16 2022
From: adinn at openjdk.java.net (Andrew Dinn)
Date: Thu, 24 Feb 2022 16:40:16 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
Message-ID: <Z4KBz1q522P9E4-eLyNOx2KIunHSeGU5Wkas1wUszWo=.ca074bd0-eade-4380-b4c6-f3f29e92ccd6@github.com>

On Tue, 22 Feb 2022 14:35:19 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
> 
>  - Merge master
>  - Merge master
>  - Merge master
>  - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set
>  - Merge enter_subframe into enter
>  - Review fixups
>  - Documentation updates
>  - Update copyrights to 2022
>  - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31

Yup this looks good to me. I will sponsor.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From duke at openjdk.java.net  Thu Feb 24 19:08:08 2022
From: duke at openjdk.java.net (Vamsi Parasa)
Date: Thu, 24 Feb 2022 19:08:08 GMT
Subject: RFR: 8282221: x86 intrinsics for divideUnsigned and
 remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
In-Reply-To: <oCd9O4Ecr3MnFogrUlo-lMw-MUHNTfwJqdUIV2fxXNc=.e3ec0775-0264-4cea-b1a4-657be815b390@github.com>
References: <GpDaOvmQ0jX2V29JoVtlsTef5OjZQVnZEdWZJcYRcR4=.707aff45-31d4-46a3-a070-aa73e93e63d0@github.com>
 <cwKV4mTg3NBMUUHxowae5spxx7LfTuURx1oVKNTi4RU=.946ab01d-e0e9-46da-bf4e-33b3fdbda5ec@github.com>
 <oCd9O4Ecr3MnFogrUlo-lMw-MUHNTfwJqdUIV2fxXNc=.e3ec0775-0264-4cea-b1a4-657be815b390@github.com>
Message-ID: <fJGpa0iOx6AwnUsmoZA7pzw0B-PocgklwxSWzDqoRZQ=.b12fb49a-4b2e-4514-9b84-e093d2fbbb03@github.com>

On Thu, 24 Feb 2022 14:13:47 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix 32bit build issues
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408:
> 
>> 4406:   jmp(done);
>> 4407:   bind(neg_divisor_fastpath);
>> 4408:   // Fastpath for divisor < 0:
> 
> How about checking in the Identity routine whether the divisor is a positive or negative constant (with a non-constant dividend) and setting a flag on the IR node? A new instruction selection pattern could then use that flag to emit only the fast path or only the slow path, which would avoid emitting redundant instructions.

Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365

> src/hotspot/share/opto/divnode.cpp line 881:
> 
>> 879:   return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : this;
>> 880: }
>> 881: //------------------------------Value------------------------------------------
> 
> An Ideal transform that replaces the unsigned divide with a cheaper logical right shift when the divisor is a power of two would be useful.

Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365

> src/hotspot/share/opto/divnode.cpp line 897:
> 
>> 895: 
>> 896:   // Either input is BOTTOM ==> the result is the local BOTTOM
>> 897:   const Type *bot = bottom_type();
> 
> Can we add constant folding for the case where both the dividend and the divisor are constants?

Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365

-------------

PR: https://git.openjdk.java.net/jdk/pull/7572

From jbhateja at openjdk.java.net  Fri Feb 25 06:22:42 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Fri, 25 Feb 2022 06:22:42 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
Message-ID: <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH microbenchmark included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Adding descriptive comments.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7094/files
  - new: https://git.openjdk.java.net/jdk/pull/7094/files/f7dec3d9..54d4ea36

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=08
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=07-08

  Stats: 31 lines in 2 files changed: 14 ins; 0 del; 17 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Fri Feb 25 07:30:05 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Fri, 25 Feb 2022 07:30:05 GMT
Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written
 > 0) when no space left on device. [v4]
In-Reply-To: <jQUZoxEkixgJcta_LlTPV1C02t8mZL5qVwBoVxPjB3g=.b3928584-31c6-4291-99d5-4171495b2d80@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
 <jQUZoxEkixgJcta_LlTPV1C02t8mZL5qVwBoVxPjB3g=.b3928584-31c6-4291-99d5-4171495b2d80@github.com>
Message-ID: <_H_syP7hDZ6iLNt8M8qOL48M2y6xdA28AR0gWzvt6Yw=.4e39ac63-dfe7-4554-b737-174491996544@github.com>

On Tue, 22 Feb 2022 05:53:31 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

>> I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee.
>> 
>> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
>> by using JfrJavaSupport::abort().
>> 
>> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
>> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
>> 
>> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
>> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
>> because there is no space left on the device.
>> Could you please review the fix?
>
> KIRIYAMA Takuya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

I hope this change is integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From duke at openjdk.java.net  Fri Feb 25 11:34:08 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 25 Feb 2022 11:34:08 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v5]
In-Reply-To: <uYPmkOZ-24DkWAnhfEnS6K8-rGkM2vIBlOpqhx5__98=.85b0314f-92a2-49ae-a9eb-ab3a6ca4fde5@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <Noi_goKf6YORayC3CfGGLxH0Y8Bz0_wKPn2n0D7fzRY=.1225c825-cfc1-45fc-a767-6ff4d9d54645@github.com>
 <uYPmkOZ-24DkWAnhfEnS6K8-rGkM2vIBlOpqhx5__98=.85b0314f-92a2-49ae-a9eb-ab3a6ca4fde5@github.com>
Message-ID: <BWV7HkvA0K4B-4bK438RJZ_tjjNKne1FpltVbPRQrug=.0adbf095-4c9a-414e-a417-58fce1f5ac69@github.com>

On Thu, 24 Feb 2022 16:25:19 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix compile warnings
>
> src/hotspot/share/runtime/os.cpp line 1178:
> 
>> 1176: 
>> 1177: // Looks like all platforms can use the same function to check if C
>> 1178: // stack is walkable beyond current frame.
> 
> This comment is somewhat weird and it - and the one at the prototype in os.hpp - could do with some massaging. But it's fine to do this in a different RFE.

yes...

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Fri Feb 25 11:43:20 2022
From: duke at openjdk.java.net (KIRIYAMA Takuya)
Date: Fri, 25 Feb 2022 11:43:20 GMT
Subject: Integrated: 8280684: JfrRecorderService failes with
 guarantee(num_written > 0) when no space left on device.
In-Reply-To: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
References: <RpDADhRHBE57IHXopoH8FNtAbq6TIY8ZDR4NRJKW89I=.a37f4bb6-b7bf-47cf-b3f7-2900d5060bfe@github.com>
Message-ID: <BbxrQeby7xbHgVuewoUtn6lKCZ0oXr2NjXC1SFScQpw=.d29f364b-9c00-4dee-b090-5804cd525d8d@github.com>

On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya <duke at openjdk.java.net> wrote:

> I think JFR should report an error message and the JVM should shut down safely instead of hitting a guarantee failure.
> 
> For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops jvm as below
> by using JfrJavaSupport::abort().
> 
> [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
> [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...
> 
> I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing a guarantee.
> I added an argument to JfrJavaSupport::abort() which tells os::abort() not to write a core dump,
> because there is no space left on the device.
> Could you please review the fix?

This pull request has now been integrated.

Changeset: 9471f24c
Author:    KIRIYAMA Takuya <kiriyama.takuya at fujitsu.com>
Committer: Markus Grönlund <mgronlun at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/9471f24ca191832669a13e5a1ea73f7097a25927
Stats:     16 lines in 3 files changed: 8 ins; 2 del; 6 mod

8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.

Reviewed-by: mgronlun

-------------

PR: https://git.openjdk.java.net/jdk/pull/7227

From stuefe at openjdk.java.net  Fri Feb 25 12:30:00 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Fri, 25 Feb 2022 12:30:00 GMT
Subject: Integrated: JDK-8281015: Further simplify NMT backend
In-Reply-To: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
References: <cjIF5WBiFac5ovqW3es_F39nt9h1jNBM2vileOLhuG0=.9d03e42b-22b8-4340-bfbc-9d8524a9d6b8@github.com>
Message-ID: <Ep3UuCB4vJQ1cz1JOenYC_rGGdIxdJJ9gMd2_FXrob0=.2577d86f-0622-4500-86e2-c7425a9f14af@github.com>

On Mon, 31 Jan 2022 08:12:02 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> NMT backend can be further simplified and cleaned out.
> 
> - some entry points require NMT_TrackingLevel as arguments, some use the global tracking level. Ultimately, every part of NMT always uses the global tracking level, so in many cases the explicit parameter can be removed and the global tracking level can be used instead.
> - `MemTracker::malloc_header_size(level)` + `MemTracker::malloc_footer_size(level)` are fused into `MemTracker::overhead_per_malloc()`
> - when adding to `MallocSiteTable`, caller gets back a shortcut to the entry. That shortcut is stored verbatim in the malloc header. It consists of two 16-bit values (bucket index and chain position). That tuple finds its way into many argument lists. It can be simplified into a single 32-bit opaque marker. Code outside the MallocSiteTable does not need to know what it is.
> - Currently, the `MallocHeader` class contains a lot of logic. It accounts (in constructor) and de-accounts (in `MallocHeader::release()`). It would simplify code if `MallocHeader` were just a dumb data carrier and the `MallocTracker` would do the actual work.
> - `MallocHeader` can be simplified, almost all members made constant and modifying accessors removed.
> - In some places we handle inputptr=NULL gracefully where we should assert instead
> - Expressions like `MemTracker::tracking_level() != NMT_off` can be simplified to `MemTracker::enabled()`.
> - MemTracker::malloc_base (all variants) can be removed. Note that we have MallocTracker::malloc_header, which achieves the same and does not require casting to the header.
> 
> Testing:
> 
> - GHAs
> - manually ran NMT gtests (all NMT modes) and NMT jtreg tests on Ubuntu x64
> - SAP nightlies ran through. Note that since 8275301 "Unify C-heap buffer overrun checks into NMT" NMT is enabled by default in debug builds, so it gets a lot more workout in tests now.
> 
> Note that I wanted to manually verify that the gdb "call pp" command still works in order to not break Zhengyu's recent addition, but found it's already broken. I filed https://bugs.openjdk.java.net/browse/JDK-8281023 and am preparing a separate patch.
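
The bucket-index/chain-position shortcut mentioned in the list above can be illustrated with a small sketch. This is only an assumption about how two 16-bit values might be fused into one opaque 32-bit marker; the function names (`make_marker`, `bucket_idx_from_marker`, `pos_idx_from_marker`) are hypothetical and are not claimed to match the MallocSiteTable code in the patch.

```cpp
#include <cstdint>

// Hypothetical helper names for illustration; not the actual MallocSiteTable API.

// Pack a 16-bit bucket index (high half) and a 16-bit chain position (low half)
// into a single 32-bit value that callers can treat as opaque.
inline uint32_t make_marker(uint16_t bucket_idx, uint16_t pos_idx) {
  return (static_cast<uint32_t>(bucket_idx) << 16) | pos_idx;
}

// Recover the bucket index from the high 16 bits.
inline uint16_t bucket_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker >> 16);
}

// Recover the chain position from the low 16 bits.
inline uint16_t pos_idx_from_marker(uint32_t marker) {
  return static_cast<uint16_t>(marker & 0xFFFF);
}
```

With a layout like this, only the table itself needs the two accessor functions; every other caller just stores and passes along the `uint32_t`.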

This pull request has now been integrated.

Changeset: b96b7437
Author:    Thomas Stuefe <stuefe at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/b96b743727a628c1b33cc9b3374f010c2ea30b78
Stats:     273 lines in 10 files changed: 56 ins; 147 del; 70 mod

8281015: Further simplify NMT backend

Reviewed-by: zgu, mbaesken

-------------

PR: https://git.openjdk.java.net/jdk/pull/7283

From duke at openjdk.java.net  Fri Feb 25 12:31:30 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 25 Feb 2022 12:31:30 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v6]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <goqc2Q5MOMMLXBQHXnP5SfkS759ju4f2C6MEEluPEWo=.d80b878a-ca2a-4c5c-98b0-c5b680764677@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
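
The idea of validating the link before reading it can be sketched as follows. This is a simplified illustration, not the code in the PR: `StackBounds`, `is_readable_slot` and `is_first_frame` are hypothetical names, and the sketch checks pointers against known stack bounds rather than using `os::is_readable_pointer`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the stack limits a thread object would provide.
struct StackBounds {
  uintptr_t low;   // lowest valid stack address
  uintptr_t high;  // one past the highest valid stack address
};

// True if a pointer-sized slot at addr is aligned and lies fully inside the stack.
inline bool is_readable_slot(uintptr_t addr, const StackBounds& b) {
  if (addr % sizeof(void*) != 0) return false;
  if (addr < b.low || addr >= b.high) return false;
  return b.high - addr >= sizeof(void*);
}

// Decide whether stack walking must stop at the frame with frame pointer fp.
// The saved caller frame pointer (the "link") is dereferenced only after
// the slot holding it has been validated.
inline bool is_first_frame(uintptr_t fp, const StackBounds& b) {
  if (!is_readable_slot(fp, b)) return true;         // link not accessible: stop
  uintptr_t sender_fp = *reinterpret_cast<const uintptr_t*>(fp);
  if (!is_readable_slot(sender_fp, b)) return true;  // caller frame looks bogus: stop
  return sender_fp <= fp;                            // callers must live at higher addresses
}
```

The point of the ordering is that a corrupted frame pointer ends the walk instead of triggering a second crash while a crash is already being reported.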

Johannes Bechberger has updated the pull request incrementally with two additional commits since the last revision:

 - Simple test
 - Use os::is_readable_pointer

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/e91bfeef..2d29a6db

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=04-05

  Stats: 75 lines in 5 files changed: 63 ins; 3 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Fri Feb 25 12:35:38 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 25 Feb 2022 12:35:38 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v7]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <888gPfj0r4fbAmhIWUIqbv91hS1la5fK8h-9lPodA1E=.dd2071e3-a0fd-4ca3-84c4-b7bfddd49c51@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix trailing whitespace

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/2d29a6db..7ee0c0b8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=05-06

  Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Fri Feb 25 12:41:30 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 25 Feb 2022 12:41:30 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v8]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <mA9LEfslvOnXtjUzWuSqKQMrGUKHkQtizgx4ljYzZMk=.1a4b24ac-19b8-4627-943c-f1da985c70b5@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Correct mistake

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/7ee0c0b8..de36fd68

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=06-07

  Stats: 34 lines in 1 file changed: 0 ins; 33 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Fri Feb 25 13:02:37 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Fri, 25 Feb 2022 13:02:37 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix tests

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/de36fd68..1f08203f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=08
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=07-08

  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From dholmes at openjdk.java.net  Fri Feb 25 13:12:04 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 25 Feb 2022 13:12:04 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <0Ob4kezo_Q0ro0eF_OeEABrzYeZCNmoaD5KQUcBpZRc=.6c772f45-c70c-4983-880a-8878e281d04b@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
 <0Ob4kezo_Q0ro0eF_OeEABrzYeZCNmoaD5KQUcBpZRc=.6c772f45-c70c-4983-880a-8878e281d04b@github.com>
Message-ID: <YaH2-5pWHVhcBA15WqN4tDBuWG-lG2GKhoe9WibbW2s=.23a1d4cf-cee9-4006-9b11-f1917cb74cfc@github.com>

On Thu, 24 Feb 2022 10:52:00 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
>> 
>>  - Merge master
>>  - Merge master
>>  - Merge master
>>  - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
>>  - Add comments to enter calls
>>  - Set PreserveFramePointer if use_rop_protection is set
>>  - Merge enter_subframe into enter
>>  - Review fixups
>>  - Documentation updates
>>  - Update copyrights to 2022
>>  - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31
>
> Any more comments? Otherwise I'll integrate later

@a74nh this seems to have broken the Zero build:

src/hotspot/share/gc/shared/barrierSetNMethod.cpp:58:33: error: 'pauth_strip_pointer' was not declared in this scope
 58 |   AARCH64_ONLY(return_address = pauth_strip_pointer(return_address));

I'm guessing a missing include file.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From hseigel at openjdk.java.net  Fri Feb 25 14:24:00 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 25 Feb 2022 14:24:00 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values [v2]
In-Reply-To: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
Message-ID: <aJwb-3vTRdKEPx3NgF1RwUio49ViDqO2ogHNYI0Lbcc=.be667fbb-d6b6-45bb-abb6-5ff0ac06360c@github.com>

> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
> 
> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
> 
> Thanks, Harold

Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:

  add gtest, fix TestParallelGCThreads.java, and revise implementation

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7522/files
  - new: https://git.openjdk.java.net/jdk/pull/7522/files/354e3f5c..e8de1741

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7522&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7522&range=00-01

  Stats: 262 lines in 5 files changed: 111 ins; 133 del; 18 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7522/head:pull/7522

PR: https://git.openjdk.java.net/jdk/pull/7522

From hseigel at openjdk.java.net  Fri Feb 25 14:24:01 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 25 Feb 2022 14:24:01 GMT
Subject: RFR: 8281472: JVM options processing silently truncates large
 illegal options values
In-Reply-To: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
References: <XSRUmYy9mu7jgVEfw6e2NlCptmcj371GNlvUu4vRbTM=.5b8bdfa8-bec0-44b7-be41-19e8ebe4d2ac@github.com>
Message-ID: <p4Y3iMyZLKtgzV7iofIqimyfb3LotSOaj-UF0XJC5dc=.08f9f7f6-ad3a-4694-ac58-19c868062475@github.com>

On Thu, 17 Feb 2022 19:09:26 GMT, Harold Seigel <hseigel at openjdk.org> wrote:

> Please review this change to fix JDK-8281472.  The fix prevents truncation of large illegal option values by rejecting those values if they exceed the range of their type.  For example, it rejects values of int options that are not between max_int and min_int.
> 
> The fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux-x64 and Windows-x64.
> 
> Thanks, Harold

This new commit replaces the JTReg test with a gtest as suggested by David, has a revised implementation suggested by Ioi, and fixes TestParallelGCThreads.java.
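
As a rough illustration of rejecting out-of-range values rather than truncating them, here is a minimal sketch. It is not the HotSpot argument parser; `parse_int_option` is a hypothetical helper that parses into a wider type first and only then narrows.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper, not the HotSpot arguments code.
// Parses a decimal string into an int. Returns false, leaving *result untouched,
// when the text is malformed or the value lies outside [INT_MIN, INT_MAX].
inline bool parse_int_option(const char* text, int* result) {
  if (text == nullptr || *text == '\0') return false;
  errno = 0;
  char* end = nullptr;
  long long v = std::strtoll(text, &end, 10);
  if (errno == ERANGE) return false;              // does not even fit in long long
  if (end == text || *end != '\0') return false;  // no digits, or trailing junk
  if (v < INT_MIN || v > INT_MAX) return false;   // a plain cast would truncate
  *result = static_cast<int>(v);
  return true;
}
```

A caller would report an error for the option when this returns false, instead of silently continuing with a wrapped value.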

-------------

PR: https://git.openjdk.java.net/jdk/pull/7522

From coleenp at openjdk.java.net  Fri Feb 25 15:01:53 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 25 Feb 2022 15:01:53 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v6]
In-Reply-To: <oUFhtmXkwBixwlYK8bcsnQclOIO7GUi4LMMGwAXw7Pw=.b9a83d0d-f57b-44d7-9428-a7e35b9ce6ae@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
 <oUFhtmXkwBixwlYK8bcsnQclOIO7GUi4LMMGwAXw7Pw=.b9a83d0d-f57b-44d7-9428-a7e35b9ce6ae@github.com>
Message-ID: <hX7SW8mQsS8u7wVvw2kiPuyqGAfFMv1FFqFHqNUF-pI=.ea3d5766-659f-4c31-a12c-2afce3b326a8@github.com>

On Wed, 23 Feb 2022 04:15:28 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> **Background:**
>> 
>> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
>> 
>> 
>> public enum Day {  SUNDAY, MONDAY ... } 
>> 
>> 
>> to
>> 
>> 
>> public class Day extends java.lang.Enum {
>>     public static final Day SUNDAY = new Day("SUNDAY");
>>     public static final Day MONDAY = new Day("MONDAY"); ...
>> }
>> 
>> 
>> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
>> 
>> **Fix:**
>> 
>> During -Xshare:dump, if we discovered that an Enum constant of type X is archived, we archive all constants of type X. At Runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
>> 
>> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
>> 
>> **Verification:**
>> 
>> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
>> 
>> **Testing:**
>> 
>> Passed Oracle CI tiers 1-4. Will run tier 5 as well.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fixed whitespace

Sorry for the long delay. It's a big change, but a lot in debug so that's ok.  Looks good.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6653

From coleenp at openjdk.java.net  Fri Feb 25 15:01:54 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 25 Feb 2022 15:01:54 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v3]
In-Reply-To: <4CLwCQdc_haGT_ueBQGZKzJVasGK26B6iYcO7VtOfAs=.02f3deb9-7ac7-45fd-9a7c-37b0fe4a8ea2@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
 <pOfPrnVSbe4SOyUeCaTXl-2dzck0EeSCa1fuarocajo=.22176dfa-d283-4891-a91d-48aae98fce09@github.com>
 <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com>
 <4CLwCQdc_haGT_ueBQGZKzJVasGK26B6iYcO7VtOfAs=.02f3deb9-7ac7-45fd-9a7c-37b0fe4a8ea2@github.com>
Message-ID: <1B2f3fl1vAMMiwyVKdf5rmn_kmJFhYxXFg71WAkILbw=.22926dc1-4234-4f21-98ee-64b5372c00c6@github.com>

On Wed, 19 Jan 2022 05:44:10 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> src/hotspot/share/cds/heapShared.cpp line 433:
>> 
>>> 431:   oop mirror = k->java_mirror();
>>> 432:   int i = 0;
>>> 433:   for (JavaFieldStream fs(k); !fs.done(); fs.next()) {
>> 
>> This seems like it should also use InstanceKlass::do_local_static_fields.
>
> Converting this to InstanceKlass::do_nonstatic_fields() is difficult because the loop body references 7 different variables declared outside of the loop. 
> 
> One thing I tried is to add a new version of do_nonstatic_fields2() that supports C++ lambdas. You can see my experiment from here: 
> 
> https://github.com/openjdk/jdk/compare/master...iklam:lambda-for-instanceklass-do_local_static_fields2?expand=1
> 
> I changed all my new code to use the do_nonstatic_fields2() function with lambda.

Ok, if it requires lambdas and additional change, never mind then.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6653

From duke at openjdk.java.net  Fri Feb 25 15:20:04 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Fri, 25 Feb 2022 15:20:04 GMT
Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64
 [v25]
In-Reply-To: <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
References: <Incu1NvV4G3SROSqBQmwIW3kTMb3dzEMvQFLeLAvmng=.c433cad4-5540-4fe9-b4bb-991b8597d973@github.com>
 <PznyMgwgokS2upKnYF7pz76MrXv90aaJBh1h1JGa4Nw=.95b08dfa-68cf-447e-a7d2-66cd34ff05de@github.com>
Message-ID: <ArmGs6Cgh0Btoy5bQPD2tbMGsh54co0X66oWBYrHhWg=.1cc128d0-9a74-4de5-a1b1-28aea6c7a2ad@github.com>

On Tue, 22 Feb 2022 14:35:19 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One
>> of its uses is to protect against ROP based attacks. This is done by
>> signing the Link Register whenever it is stored on the stack, and
>> authenticating the value when it is loaded back from the stack. If an
>> attacker were to try to change control flow by editing the stack then
>> the authentication check of the Link Register will fail, causing a
>> segfault when the function returns.
>> 
>> On a system with PAC enabled, it is expected that all applications will
>> be compiled with ROP protection. Fedora 33 and upwards already provide
>> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>> 
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>> 
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>> 
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>> 
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and
>> entry points are stored in memory as signed pointers. These changes are
>> simple to make (they can be reduced to a type change in common code and
>> a few additional sign/auth calls in the backend), but there are a lot of them
>> and the total code change is fairly large. I'm happy to provide a few
>> work in progress patches.
>> 
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
> 
>  - Merge master
>  - Merge master
>  - Merge master
>  - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret
>  - Add comments to enter calls
>  - Set PreserveFramePointer if use_rop_protection is set
>  - Merge enter_subframe into enter
>  - Review fixups
>  - Documentation updates
>  - Update copyrights to 2022
>  - ... and 24 more: https://git.openjdk.java.net/jdk/compare/022d8070...c4e0ee31

Yes, we spotted this today too. https://bugs.openjdk.java.net/browse/JDK-8282392

My initial thought was that I needed to add a pauth header file with stub functions to linux_zero/. Which does feel a little awkward.

AARCH64_PORT_ONLY does sound like a better option.

Thankfully there is no need for full pac support in zero too.... :)
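
For readers unfamiliar with the pattern, a port-only macro of the kind discussed here might look like the sketch below. The macro condition is an assumption about how `AARCH64_PORT_ONLY` could be defined, and `strip_pointer_stub` and `normalize_return_address` are hypothetical stand-ins, not the real `pauth_strip_pointer` or the barrierSetNMethod.cpp code.

```cpp
#include <cstdint>

// Assumed definition: expand the argument only for the native AArch64 port,
// and expand to nothing for every other build, including Zero on AArch64 hardware.
#if defined(AARCH64) && !defined(ZERO)
#define AARCH64_PORT_ONLY(code) code
// Hypothetical stand-in for a function that only the AArch64 port declares.
inline uintptr_t strip_pointer_stub(uintptr_t p) {
  return p & 0x0000FFFFFFFFFFFFull;
}
#else
#define AARCH64_PORT_ONLY(code)
#endif

// Shared code: the strip call vanishes at preprocessing time unless the
// native AArch64 port is being built, so no declaration is needed elsewhere.
inline uintptr_t normalize_return_address(uintptr_t return_address) {
  AARCH64_PORT_ONLY(return_address = strip_pointer_stub(return_address));
  return return_address;
}
```

Because the call site is removed by the preprocessor, a Zero build would never see the undeclared function, which avoids adding stub headers to linux_zero/.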

-------------

PR: https://git.openjdk.java.net/jdk/pull/6334

From pchilanomate at openjdk.java.net  Fri Feb 25 17:00:07 2022
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 25 Feb 2022 17:00:07 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v2]
In-Reply-To: <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
Message-ID: <Dd5oz8i-kr8h2Xtn86BhR0HQDbMbiNpM2Ow074rpO4Q=.50d516d4-d519-473a-9070-ab2d46f56cf4@github.com>

On Thu, 24 Feb 2022 14:03:49 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
> 
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only
>  - Merge branch 'master' into method-name
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only

Hi Coleen,

Looks good to me. I see we also call set_constants() and set_name_index() from VM_RedefineClasses::set_new_constant_pool(), do we need to set the name in there too or it's not needed?

Thanks,
Patricio

-------------

Marked as reviewed by pchilanomate (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From hseigel at openjdk.java.net  Fri Feb 25 18:18:10 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 25 Feb 2022 18:18:10 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v2]
In-Reply-To: <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
Message-ID: <OJ-1zNfQePSbTVeSaHnZzSSt-lb4QymbLnnImKhLmF0=.392d9697-6c1c-4abb-92b2-c23c97d4c153@github.com>

On Thu, 24 Feb 2022 14:03:49 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
> 
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only
>  - Merge branch 'master' into method-name
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only

Changes look good!
Thanks, Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Fri Feb 25 20:19:26 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 25 Feb 2022 20:19:26 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v3]
In-Reply-To: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
Message-ID: <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>

> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
> Tested with tier1 on Oracle platforms.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Set _name field in constructor with available symbol rather than later when constant pool pointer is set.  I like this one better.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7608/files
  - new: https://git.openjdk.java.net/jdk/pull/7608/files/ab762ed7..f75cf1e2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=01-02

  Stats: 43 lines in 5 files changed: 28 ins; 7 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7608.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7608/head:pull/7608

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Fri Feb 25 20:19:32 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 25 Feb 2022 20:19:32 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v2]
In-Reply-To: <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <yVNIYqjcxFmT3w63rRRwb2muPDIQi5v31M8X_O_0INc=.8a561d63-e16f-4216-910d-05dae6c50b7b@github.com>
Message-ID: <jSmDIRCX6Zl-lLfiB0GLXery7YknPuDu2p99oL34sDE=.3ac33787-c0d2-4e3e-80ce-25b21b69432d@github.com>

On Thu, 24 Feb 2022 14:03:49 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
> 
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only
>  - Merge branch 'master' into method-name
>  - Enhance comment to say why name needs to be set later.
>  - 8282240: Add _name field to Method for NOT_PRODUCT only

Hi Patricio,  In the RedefineClasses case, the methods already have the _name field set.  But your comment pointed out the fragility of this change, so I changed it.  Hope this one is better.   Please re-review.
Harold, can you re-review also?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7608

From hseigel at openjdk.java.net  Fri Feb 25 20:57:55 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Fri, 25 Feb 2022 20:57:55 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v3]
In-Reply-To: <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
Message-ID: <E4_SgFJhyZmihPlEHskUVm79PqaxjYm_iBi85WVffCY=.6e47b8eb-7c85-4082-8d3c-524c8ee46fee@github.com>

On Fri, 25 Feb 2022 20:19:26 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Set _name field in constructor with available symbol rather than later when constant pool pointer is set.  I like this one better.

Still looks good!
Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From pchilanomate at openjdk.java.net  Fri Feb 25 21:37:54 2022
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 25 Feb 2022 21:37:54 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v3]
In-Reply-To: <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
Message-ID: <jGTwh444QLRNZ5GfoJHv_SsnWmEw4pQ9fj3OAIi0PC4=.8677cb58-841a-4230-94fd-b7774abbed42@github.com>

On Fri, 25 Feb 2022 20:19:26 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Set _name field in constructor with available symbol rather than later when constant pool pointer is set.  I like this one better.

Still good!

Thanks,
Patricio

-------------

Marked as reviewed by pchilanomate (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From sviswanathan at openjdk.java.net  Sat Feb 26 01:33:54 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Sat, 26 Feb 2022 01:33:54 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
Message-ID: <Uh_FQzMDzw-9p5ojLandAlJRSJAcfHzcPKMhSQIKYL8=.afb061c1-0cfc-43a6-80a9-4d3f7c03ba3d@github.com>

On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Adding descriptive comments.

Other than this the patch looks good to me. What testing have you done?

src/hotspot/cpu/x86/x86.ad line 7263:

> 7261:     __ vector_round_float_avx($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7262:                               $xtmp2$$XMMRegister, $xtmp3$$XMMRegister, $xtmp4$$XMMRegister,
> 7263:                               ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_float_signflip() here should be replaced by vector_all_bits_set().
cvtps2dq description:
If a converted result cannot be represented in the destination
format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value
(2w-1, where w represents the number of bits in the destination format) is returned.

src/hotspot/cpu/x86/x86.ad line 7280:

> 7278:     __ vector_round_float_evex($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7279:                                $xtmp2$$XMMRegister, $ktmp1$$KRegister, $ktmp2$$KRegister,
> 7280:                                ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_float_signflip() here should be replaced by vector_all_bits_set().

src/hotspot/cpu/x86/x86.ad line 7295:

> 7293:     __ vector_round_double_evex($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7294:                                 $xtmp2$$XMMRegister, $ktmp1$$KRegister, $ktmp2$$KRegister,
> 7295:                                 ExternalAddress(vector_double_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_double_signflip() here should be replaced by vector_all_bits_set().
vcvtpd2qq description:
If a converted result cannot be represented in the destination
format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value
(2w-1, where w represents the number of bits in the destination format) is returned.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From sviswanathan at openjdk.java.net  Sat Feb 26 03:05:54 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Sat, 26 Feb 2022 03:05:54 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <Uh_FQzMDzw-9p5ojLandAlJRSJAcfHzcPKMhSQIKYL8=.afb061c1-0cfc-43a6-80a9-4d3f7c03ba3d@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
 <Uh_FQzMDzw-9p5ojLandAlJRSJAcfHzcPKMhSQIKYL8=.afb061c1-0cfc-43a6-80a9-4d3f7c03ba3d@github.com>
Message-ID: <1K0c0y8K8bVNJEFMyTQSxwdgJlx9E2N8uhHC7O9sfyM=.c4ead8b5-abe0-42f4-ae10-aa24425eb75d@github.com>

On Sat, 26 Feb 2022 01:06:21 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   8279508: Adding descriptive comments.
>
> src/hotspot/cpu/x86/x86.ad line 7263:
> 
>> 7261:     __ vector_round_float_avx($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
>> 7262:                               $xtmp2$$XMMRegister, $xtmp3$$XMMRegister, $xtmp4$$XMMRegister,
>> 7263:                               ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);
> 
> The vector_float_signflip() here should be replaced by vector_all_bits_set().
> cvtps2dq description:
> If a converted result cannot be represented in the destination
> format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value
> (2w-1, where w represents the number of bits in the destination format) is returned.

Clarification, the number in my comments above is (2^w  - 1). This is from Intel SDM (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html).
Also you will need to take care when the valid unoverflowed result is -1 i.e. 0xFFFFFFFF (2^32 - 1).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Sat Feb 26 03:42:02 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sat, 26 Feb 2022 03:42:02 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <8mhsd-DL1IccFiqrRigKdck8OJg79sjKgaYXrHc4zwY=.c92cb7f5-8e54-42ab-84f1-9cfa1ce76779@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
 <Uh_FQzMDzw-9p5ojLandAlJRSJAcfHzcPKMhSQIKYL8=.afb061c1-0cfc-43a6-80a9-4d3f7c03ba3d@github.com>
 <1K0c0y8K8bVNJEFMyTQSxwdgJlx9E2N8uhHC7O9sfyM=.c4ead8b5-abe0-42f4-ae10-aa24425eb75d@github.com>
 <8mhsd-DL1IccFiqrRigKdck8OJg79sjKgaYXrHc4zwY=.c92cb7f5-8e54-42ab-84f1-9cfa1ce76779@github.com>
Message-ID: <mPYDVHjrBrCoZU853aifNgYbVOokUK__UWdd4QPtVQ8=.45572f64-bf9d-42f6-b033-b39f3bf5b7be@github.com>

On Sat, 26 Feb 2022 03:37:32 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Clarification, the number in my comments above is (2^w  - 1). This is from Intel SDM (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html).
>> Also you will need to take care when the valid unoverflowed result is -1 i.e. 0xFFFFFFFF (2^32 - 1).
>
> I believe the indefinite value should be 2^(w - 1) (a.k.a. 0x80000000) and the documentation has a typo. If you look at `cvtss2si`, the indefinite value is also written as 2^w - 1, yet in `MacroAssembler::convert_f2i` we compare it with 0x80000000. In addition, choosing -1 as an indefinite value would be odd enough, and writing it as 2^w - 1 would be really unusual.

`MacroAssembler::convert_f2i`

https://github.com/openjdk/jdk/blob/c5c6058fd57d4b594012035eaf18a57257f4ad85/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8919

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net  Sat Feb 26 03:42:02 2022
From: duke at openjdk.java.net (Quan Anh Mai)
Date: Sat, 26 Feb 2022 03:42:02 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <1K0c0y8K8bVNJEFMyTQSxwdgJlx9E2N8uhHC7O9sfyM=.c4ead8b5-abe0-42f4-ae10-aa24425eb75d@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
 <Uh_FQzMDzw-9p5ojLandAlJRSJAcfHzcPKMhSQIKYL8=.afb061c1-0cfc-43a6-80a9-4d3f7c03ba3d@github.com>
 <1K0c0y8K8bVNJEFMyTQSxwdgJlx9E2N8uhHC7O9sfyM=.c4ead8b5-abe0-42f4-ae10-aa24425eb75d@github.com>
Message-ID: <8mhsd-DL1IccFiqrRigKdck8OJg79sjKgaYXrHc4zwY=.c92cb7f5-8e54-42ab-84f1-9cfa1ce76779@github.com>

On Sat, 26 Feb 2022 03:02:51 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> src/hotspot/cpu/x86/x86.ad line 7263:
>> 
>>> 7261:     __ vector_round_float_avx($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
>>> 7262:                               $xtmp2$$XMMRegister, $xtmp3$$XMMRegister, $xtmp4$$XMMRegister,
>>> 7263:                               ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);
>> 
>> The vector_float_signflip() here should be replaced by vector_all_bits_set().
>> cvtps2dq description:
>> If a converted result cannot be represented in the destination
>> format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value
>> (2w-1, where w represents the number of bits in the destination format) is returned.
>
> Clarification, the number in my comments above is (2^w  - 1). This is from Intel SDM (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html).
> Also you will need to take care when the valid unoverflowed result is -1 i.e. 0xFFFFFFFF (2^32 - 1).

I believe the indefinite value should be 2^(w - 1) (a.k.a. 0x80000000) and the documentation has a typo. If you look at `cvtss2si`, the indefinite value is also written as 2^w - 1, yet in `MacroAssembler::convert_f2i` we compare it with 0x80000000. In addition, choosing -1 as an indefinite value would be odd enough, and writing it as 2^w - 1 would be really unusual.
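
For reference, the two readings of the flattened "2w-1" give different bit patterns. A small sketch of the arithmetic only (plain C++; the helper names are made up for illustration and are not HotSpot code):

```cpp
#include <cstdint>

// Reading 1: 2^(w-1), i.e. only the sign bit of a w-bit integer is set.
constexpr uint64_t indefinite_sign_bit(unsigned w) {
  return uint64_t(1) << (w - 1);
}

// Reading 2: 2^w - 1, i.e. all w bits set (this helper is valid for 1 <= w <= 63).
constexpr uint64_t indefinite_all_ones(unsigned w) {
  return (uint64_t(1) << w) - 1;
}

// For w = 32 the first reading is 0x80000000, the constant mentioned for convert_f2i.
static_assert(indefinite_sign_bit(32) == 0x80000000u, "sign bit only");
// The second reading is 0xFFFFFFFF, the same bits as a legitimate result of -1.
static_assert(indefinite_all_ones(32) == 0xFFFFFFFFu, "all ones");
```

With the second reading, the sentinel could not be told apart from a valid converted result of -1, which is the ambiguity raised earlier in the thread.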

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From jbhateja at openjdk.java.net  Sat Feb 26 04:57:55 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Sat, 26 Feb 2022 04:57:55 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v9]
In-Reply-To: <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
References: <iRtE5cC04m_648N-GqIn8FcaOIXiPjBTHgcr3rvZW2E=.146724a0-a18b-46e4-b4be-1ebbe37b9e4e@github.com>
 <zLyvngX3pdv0AhZamk10DX9bl1jSNJTo8uBYI0NMXJo=.ff366924-9e04-4f43-b539-fe7e991eefa1@github.com>
Message-ID: <gM42lIzUQBnOjZLyuBAh4-VqZZs0864NoDP5mOCUVDk=.12c99ec0-98ef-4e67-8fe7-279f82bdf9c9@github.com>

On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Adding descriptive comments.

As per the SDM, if a floating-point number is not representable in the destination format after conversion (e.g. the value 3.4028235E10, which overflows the value range of the integer primitive type), a -0.0 value, i.e. 0x80000000, is returned. Similarly, for +/-NaN and +/-Inf the value returned after conversion is -0.0. All these cases, i.e. non-representable floating-point values and NaN/Inf values, are handled specially: the algorithm first performs an unordered comparison on the original source value and returns 0 for NaN, which weeds out the NaN case. For the rest of the special values we check the MSB of the source and return either Integer.MAX_VALUE for positive numbers or Integer.MIN_VALUE for negative numbers, to adhere to the semantics of the Math.round API.

Existing tests were enhanced to cover the various special cases: NaN, Inf, positive and negative values, values which may become inexact after adding 0.5, and values which overflow the integer value range after conversion.
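
A scalar model of the special-case handling described above may make the order of the checks easier to follow. This is only a sketch under assumptions: `fixup_round_result` is a made-up name, `raw` stands for the lane value produced by the conversion instruction, and the real patch does this with vector instructions and masks rather than branches.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of the fix-up: 'raw' is the converted lane value, where
// 0x80000000 marks a source that could not be represented as an int.
int32_t fixup_round_result(float src, int32_t raw) {
  const int32_t indefinite = std::numeric_limits<int32_t>::min();  // 0x80000000
  if (raw != indefinite) return raw;  // ordinary lane, keep the converted value
  if (std::isnan(src)) return 0;      // unordered compare: NaN rounds to 0
  // Remaining special values: choose the saturated bound from the sign bit.
  return std::signbit(src) ? std::numeric_limits<int32_t>::min()
                           : std::numeric_limits<int32_t>::max();
}
```

A source that legitimately converts to Integer.MIN_VALUE also takes the special path, but it is negative and therefore still yields Integer.MIN_VALUE.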

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From stuefe at openjdk.java.net  Sat Feb 26 06:18:56 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Sat, 26 Feb 2022 06:18:56 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
Message-ID: <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>

On Fri, 25 Feb 2022 13:02:37 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix tests

Hi Johannes,

Getting closer. More remarks inline.

Cheers, Thomas

src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 154:

> 152: 
> 153: inline intptr_t* frame::link_or_null() const {
> 154:   auto ptr = (intptr_t **)addr_at(link_offset);

Please don't use auto. In general, use the features and style already adopted in the code around you, and beyond that please refer to the C++ style guide. When in Rome...

src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 155:

> 153: inline intptr_t* frame::link_or_null() const {
> 154:   auto ptr = (intptr_t **)addr_at(link_offset);
> 155:   if (os::is_readable_pointer((const void*)ptr)) {

You don't need this cast

src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 159:

> 157:   }
> 158:   return NULL;
> 159: }

You could shorten these four lines to a single one using `?`, especially since this code is duplicated across platforms.
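
A sketch of what the one-line form could look like, assuming the elided line 156 simply returns `*ptr`. The stub below is a hypothetical stand-in for `os::is_readable_pointer()` so that the fragment compiles on its own; the real function probes the address safely.

```cpp
#include <cstdint>

// Hypothetical stand-in for os::is_readable_pointer(); it only rejects null.
static bool is_readable_pointer_stub(const void* p) { return p != nullptr; }

// Conditional-operator form of the quoted link_or_null() body.
// Passing 'ptr' needs no cast: any object pointer converts to const void*.
static intptr_t* link_or_null(intptr_t** ptr) {
  return is_readable_pointer_stub(ptr) ? *ptr : nullptr;
}
```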

src/hotspot/share/runtime/os.cpp line 1179:

> 1177: // Looks like all platforms can use the same function to check if C
> 1178: // stack is walkable beyond current frame.
> 1179: // Returns false if this is the cas

typo

src/hotspot/share/runtime/os.cpp line 1184:

> 1182: #ifdef _WINDOWS
> 1183:   return true; // native stack isn't walkable on windows this way.
> 1184: #else

This change has nothing to do with the bug.

I would leave this as it is and let the code below at least compile on windows. Then we know it does not bitrot there. I am also not clear why this would not work on windows, since we could optionally build with framepointers enabled, right? And don't we have frame pointers on 32-bit windows always? I may remember this wrong.

src/hotspot/share/runtime/os.cpp line 1193:

> 1191: 
> 1192:   uintptr_t usp    = (uintptr_t)fr->sp();
> 1193:   if ((usp & sp_align_mask) != 0 || !os::is_readable_pointer((const void*)usp)) return true;

remove cast

test/hotspot/gtest/runtime/test_os.cpp line 874:

> 872:   frame invalid_frame;
> 873:   EXPECT_TRUE(os::is_first_C_frame(&invalid_frame)); // the frame has zeroes for all values
> 874: 

Please add a test with valid looking but garbage pointers, to test that your safefetch really works. We usually do this by reserving + protecting a stripe of memory and using that one as guaranteed faulting pointer.
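
A minimal sketch of the "reserve + protect" idea, assuming a POSIX system. It calls `mmap` directly rather than HotSpot's `os::` wrappers, and all helper names are made up for illustration. `probe_readable` is not SafeFetch; it relies on the Linux behaviour that `write()` to a pipe fails with EFAULT when the source buffer is inaccessible, and is there only to show that the page really cannot be read.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Reserve one page with no access rights; any load or store through the
// returned address faults. Returns nullptr if the mapping fails.
static void* reserve_faulting_page() {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, page, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Give the page back once the test is done.
static void release_faulting_page(void* p) {
  munmap(p, static_cast<size_t>(sysconf(_SC_PAGESIZE)));
}

// Crude readability probe: ask the kernel to copy one byte from 'p' into a pipe.
// The copy fails (write returns -1) if 'p' is not readable.
static bool probe_readable(const void* p) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  const ssize_t n = write(fds[1], p, 1);
  close(fds[0]);
  close(fds[1]);
  return n == 1;
}
```

A test could then build a frame whose sp/fp point into the reserved page and expect `os::is_first_C_frame` to report it as the first frame instead of crashing.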

test/hotspot/gtest/runtime/test_os.cpp line 875:

> 873:   EXPECT_TRUE(os::is_first_C_frame(&invalid_frame)); // the frame has zeroes for all values
> 874: 
> 875:   auto cur_frame = os::current_frame(); // this frame has to have a sender

please use a type here, not auto

test/hotspot/gtest/runtime/test_os.cpp line 878:

> 876:   EXPECT_FALSE(os::is_first_C_frame(&cur_frame));
> 877:   #endif // _WIN32
> 878: }

missing newline

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7591

From stuefe at openjdk.java.net  Sat Feb 26 06:18:56 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Sat, 26 Feb 2022 06:18:56 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
Message-ID: <QTCzme_rFRbzbCur_KJlqZv3YJv6_DvjQVSQoMzq9c8=.3923fff8-c33a-4a4e-951c-65f64a8ebbab@github.com>

On Sat, 26 Feb 2022 05:54:05 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix tests
>
> src/hotspot/share/runtime/os.cpp line 1193:
> 
>> 1191: 
>> 1192:   uintptr_t usp    = (uintptr_t)fr->sp();
>> 1193:   if ((usp & sp_align_mask) != 0 || !os::is_readable_pointer((const void*)usp)) return true;
> 
> remove cast

Also, could you factor out this test to a local helper, something like:

static bool pointer_is_bad(uintptr_t p) {
...
}

?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From stuefe at openjdk.java.net  Sat Feb 26 06:18:57 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Sat, 26 Feb 2022 06:18:57 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <QTCzme_rFRbzbCur_KJlqZv3YJv6_DvjQVSQoMzq9c8=.3923fff8-c33a-4a4e-951c-65f64a8ebbab@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
 <QTCzme_rFRbzbCur_KJlqZv3YJv6_DvjQVSQoMzq9c8=.3923fff8-c33a-4a4e-951c-65f64a8ebbab@github.com>
Message-ID: <SSIpSFGWjkJi1Y8-8pxpm72qDxgFZWF6YOjtxZjIY1s=.2c5e2fd4-3e41-40e6-a055-788f8eebc45b@github.com>

On Sat, 26 Feb 2022 05:57:06 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> src/hotspot/share/runtime/os.cpp line 1193:
>> 
>>> 1191: 
>>> 1192:   uintptr_t usp    = (uintptr_t)fr->sp();
>>> 1193:   if ((usp & sp_align_mask) != 0 || !os::is_readable_pointer((const void*)usp)) return true;
>> 
>> remove cast
>
> Also, could you factor out this test to a local helper, something like:
> 
> static bool pointer_is_bad(uintptr_t p) {
> ...
> }
> 
> ?

And the alignment check would be more readable with the is_aligned() function from align.hpp (this is old code, the function did not exist back then).
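
One possible shape for such a helper, as a sketch only. The local `is_aligned` mirrors the idea of the function in align.hpp, `is_readable_stub` is a hypothetical stand-in for `os::is_readable_pointer()`, and the explicit `alignment` parameter is an assumption made to keep the fragment self-contained.

```cpp
#include <cstddef>
#include <cstdint>

// Same idea as is_aligned() from align.hpp; 'alignment' must be a power of two.
static bool is_aligned(uintptr_t value, size_t alignment) {
  return (value & (alignment - 1)) == 0;
}

// Hypothetical stand-in for os::is_readable_pointer(); it only rejects null.
static bool is_readable_stub(const void* p) { return p != nullptr; }

// A pointer is "bad" when it is misaligned or cannot be read.
static bool pointer_is_bad(uintptr_t p, size_t alignment) {
  return !is_aligned(p, alignment) ||
         !is_readable_stub(reinterpret_cast<const void*>(p));
}
```

The checks on usp and the other pointers in `is_first_C_frame` could then each become one call of the form `if (pointer_is_bad(usp, ...)) return true;`.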

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Sat Feb 26 07:57:54 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Sat, 26 Feb 2022 07:57:54 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
Message-ID: <DryWvp7QjbBKJZ5kw-mf8_2JIQzG0dhjLs1rneLPvzI=.03d0185a-ca04-471c-ae64-0f826d40cc93@github.com>

On Sat, 26 Feb 2022 06:02:45 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix tests
>
> src/hotspot/share/runtime/os.cpp line 1184:
> 
>> 1182: #ifdef _WINDOWS
>> 1183:   return true; // native stack isn't walkable on windows this way.
>> 1184: #else
> 
> This change has nothing to do with the bug.
> 
> I would leave this as it is and let the code below at least compile on windows. Then we know it does not bitrot there. I am also not clear why this would not work on windows, since we could optionally build with framepointers enabled, right? And don't we have frame pointers on 32-bit windows always? I may remember this wrong.

On Windows there is a special function to obtain native stack traces; it uses an OS function.

> test/hotspot/gtest/runtime/test_os.cpp line 874:
> 
>> 872:   frame invalid_frame;
>> 873:   EXPECT_TRUE(os::is_first_C_frame(&invalid_frame)); // the frame has zeroes for all values
>> 874: 
> 
> Please add a test with valid looking but garbage pointers, to test that your safefetch really works. We usually do this by reserving + protecting a stripe of memory and using that one as guaranteed faulting pointer.

I thought about it, but was unsure how to do it properly.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From stuefe at openjdk.java.net  Sat Feb 26 08:23:54 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Sat, 26 Feb 2022 08:23:54 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <DryWvp7QjbBKJZ5kw-mf8_2JIQzG0dhjLs1rneLPvzI=.03d0185a-ca04-471c-ae64-0f826d40cc93@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
 <DryWvp7QjbBKJZ5kw-mf8_2JIQzG0dhjLs1rneLPvzI=.03d0185a-ca04-471c-ae64-0f826d40cc93@github.com>
Message-ID: <LCrIzGWSZewdWJ0T_kVhdpQNRNa3ftCu_pQIInP6Bxg=.d9be26fe-4f68-4bec-b862-5f2fe19cc67d@github.com>

On Sat, 26 Feb 2022 07:55:07 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> test/hotspot/gtest/runtime/test_os.cpp line 874:
>> 
>>> 872:   frame invalid_frame;
>>> 873:   EXPECT_TRUE(os::is_first_C_frame(&invalid_frame)); // the frame has zeroes for all values
>>> 874: 
>> 
>> Please add a test with valid looking but garbage pointers, to test that your safefetch really works. We usually do this by reserving + protecting a stripe of memory and using that one as guaranteed faulting pointer.
>
> I thought about it, but was unsure how to do it properly.

No problem, it's probably enough in its current form.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Sat Feb 26 09:53:53 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Sat, 26 Feb 2022 09:53:53 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <3IE97Ur28wo8YNWudqJKhQxDv5iO8cpUGneR-bsFR5s=.dda6f316-2ce1-4f89-b254-47019781ab6d@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
 <DryWvp7QjbBKJZ5kw-mf8_2JIQzG0dhjLs1rneLPvzI=.03d0185a-ca04-471c-ae64-0f826d40cc93@github.com>
 <LCrIzGWSZewdWJ0T_kVhdpQNRNa3ftCu_pQIInP6Bxg=.d9be26fe-4f68-4bec-b862-5f2fe19cc67d@github.com>
 <3IE97Ur28wo8YNWudqJKhQxDv5iO8cpUGneR-bsFR5s=.dda6f316-2ce1-4f89-b254-47019781ab6d@github.com>
Message-ID: <HOg7PjRk5wAtfb1QmUuHvpm3O5A5yQL9tKcGsrIivdU=.1e7b3f84-e640-42e1-a173-c816e7845151@github.com>

On Sat, 26 Feb 2022 09:49:22 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> No problem, it's probably enough in its current form.
>
> Ok, but it would be cool if you could tell me how to do it, because I suspect that this is not the only PR that I will ever write regarding segfaults.

But I have to wait till Monday to let someone write an issue :)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From duke at openjdk.java.net  Sat Feb 26 09:53:53 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Sat, 26 Feb 2022 09:53:53 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v9]
In-Reply-To: <LCrIzGWSZewdWJ0T_kVhdpQNRNa3ftCu_pQIInP6Bxg=.d9be26fe-4f68-4bec-b862-5f2fe19cc67d@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
 <WwZ8rljh-8_RJ4snAGdt434Xtb1FYjcq5D0pZyZmXgg=.40ceb003-c3a9-45a0-91a8-7195e9c3920e@github.com>
 <e6G-J0-c_CuuzcLWrr602faFCoXmkgsAVewQEQdPJCE=.afcedd67-0696-44e6-9162-156f8100bc6c@github.com>
 <DryWvp7QjbBKJZ5kw-mf8_2JIQzG0dhjLs1rneLPvzI=.03d0185a-ca04-471c-ae64-0f826d40cc93@github.com>
 <LCrIzGWSZewdWJ0T_kVhdpQNRNa3ftCu_pQIInP6Bxg=.d9be26fe-4f68-4bec-b862-5f2fe19cc67d@github.com>
Message-ID: <3IE97Ur28wo8YNWudqJKhQxDv5iO8cpUGneR-bsFR5s=.dda6f316-2ce1-4f89-b254-47019781ab6d@github.com>

On Sat, 26 Feb 2022 08:20:26 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> I thought about it, but was unsure how to do it properly.
>
> No problem, it's probably enough in its current form.

Ok, but it would be cool if you could tell me how to do it, because I suspect that this is not the only PR that I will ever write regarding segfaults.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591

From coleenp at openjdk.java.net  Sat Feb 26 13:12:29 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Sat, 26 Feb 2022 13:12:29 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v4]
In-Reply-To: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
Message-ID: <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>

> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
> Tested with tier1 on Oracle platforms.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Fix CDS ommission.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7608/files
  - new: https://git.openjdk.java.net/jdk/pull/7608/files/f75cf1e2..6b55334a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7608&range=02-03

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7608.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7608/head:pull/7608

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Sat Feb 26 13:12:30 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Sat, 26 Feb 2022 13:12:30 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v3]
In-Reply-To: <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <mMmnunrCmrMrbLUS1I1cCyYwSJMWQHik-waPi73UBxs=.f405bfd2-11c4-4a53-a46f-0eb12c0cfac8@github.com>
Message-ID: <HvsmH1aRPBfoXmvrr8ssV-at_JuE2tGvM5IyTtNi9Ng=.67a1a787-5bd5-49b6-a7bb-d143d06c1637@github.com>

On Fri, 25 Feb 2022 20:19:26 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Set _name field in constructor with available symbol rather than later when constant pool pointer is set.  I like this one better.

Thanks Patricio and Harold.  With this fix (needed to walk new field for CDS), the new test passes on all platforms.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Sat Feb 26 13:23:13 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Sat, 26 Feb 2022 13:23:13 GMT
Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails
 with "RuntimeException: the value of full_count is wrong."
Message-ID: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>

This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom.  It also makes full_count atomic so that the test in codeCache for printing is correct.  This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads.
Tested with tier1-3 on linux and windows x64.
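
For illustration only, the race between the counter and the message can be sketched as below. This is a minimal sketch using `std::atomic`, not the HotSpot code (which has its own atomic utilities); `demo_full_count` and `demo_report_full` are hypothetical names. The idea is that the decision to print is taken from the value returned by the atomic increment itself, so only one thread can ever observe the value 1.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the code cache's full_count field.
static std::atomic<int> demo_full_count{0};

// Called by any compiler thread that finds the cache full.
// The decision to print uses the value returned by the atomic
// increment, not a separate read of the counter, so exactly one
// caller observes 1 no matter how the threads interleave.
int demo_report_full() {
  int observed = demo_full_count.fetch_add(1) + 1;
  if (observed == 1) {
    std::puts("CodeCache is full (reported once)");
  }
  return observed;
}

// Single-threaded usage check.
void demo_full_count_self_check() {
  assert(demo_report_full() == 1);  // first caller prints
  assert(demo_report_full() == 2);  // later callers only count
  assert(demo_full_count.load() == 2);
}
```

With a plain `int` and a separate read before printing, two compiler threads could both increment first and then both read 2, which is the kind of mismatch described above.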

-------------

Commit messages:
 - 8279573: compiler/codecache/CodeCacheFullCountTest.java fails with "RuntimeException: the value of full_count is wrong."

Changes: https://git.openjdk.java.net/jdk/pull/7629/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7629&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8279573
  Stats: 12 lines in 4 files changed: 3 ins; 0 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7629.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7629/head:pull/7629

PR: https://git.openjdk.java.net/jdk/pull/7629

From dholmes at openjdk.java.net  Mon Feb 28 02:06:46 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 28 Feb 2022 02:06:46 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <AknquHaCXyRqeY3mHW46RTc_fyzF7-0Ht9t3QdNPWNA=.5437dea3-e6e0-4db8-bda2-a48a8fd64210@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
 <PqR2p2ZM-GCUvfn0-lGDt993uI11L_TqAgsTQNEuOgI=.6aba8e9b-06c2-49e0-9d2d-830a2789c970@github.com>
 <AknquHaCXyRqeY3mHW46RTc_fyzF7-0Ht9t3QdNPWNA=.5437dea3-e6e0-4db8-bda2-a48a8fd64210@github.com>
Message-ID: <xrGCpAE5dBqCRpR6hdd6Z-dqmnHweyOLpgUTrc1RFHk=.185c0e3f-eebc-4b20-ac20-fb0cbf4d5a2d@github.com>

On Wed, 23 Feb 2022 11:20:11 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Looks fine. There might be some performance implications to this, as IIRC this code gets called from GC copying, so some light benchmarking might be in order.
>
>> @shipilev any suggestions as to which benchmarks to try to run for this? Otherwise I'll just try our usual internal ones.
> 
> Just the usual sanity check of benchmarks is fine. If there are regressions on some other benchmarks, we can take care of them after integration.

Paging @shipilev - please see previous comment.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From iklam at openjdk.java.net  Mon Feb 28 06:34:14 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 28 Feb 2022 06:34:14 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v7]
In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
Message-ID: <vUeGmWS_eQFDYi5uPMdE-Hic149OfsbBnFQ401nhIo0=.1c79af72-d04b-47f0-afc6-60f1341b18c1@github.com>

> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
>     public static final Day SUNDAY = new Day("SUNDAY");
>     public static final Day MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirement of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731).
> 
> **Fix:**
> 
> During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
> 
> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. Will run tier 5 as well.

Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits:

 - fixed copyright year
 - Merge branch 'master' into 8275731-heapshared-enum
 - fixed whitespace
 - Fixed comments per @calvinccheung review
 - Merge branch 'master' into 8275731-heapshared-enum
 - Use InstanceKlass::do_local_static_fields for some field iterations
 - Merge branch 'master' into 8275731-heapshared-enum
 - added exclusions needed by "java -Xshare:dump -ea -esa"
 - Comments from @calvinccheung off-line
 - 8275731: CDS archived enums objects are recreated at runtime

-------------

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=06
  Stats: 860 lines in 16 files changed: 807 ins; 4 del; 49 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653

From duke at openjdk.java.net  Mon Feb 28 12:36:59 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 28 Feb 2022 12:36:59 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64
Message-ID: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>

8282392: [zero] Build broken on AArch64

-------------

Commit messages:
 - 8282392: [zero] Build broken on AArch64

Changes: https://git.openjdk.java.net/jdk/pull/7633/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7633&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282392
  Stats: 13 lines in 5 files changed: 8 ins; 0 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7633.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7633/head:pull/7633

PR: https://git.openjdk.java.net/jdk/pull/7633

From shade at openjdk.java.net  Mon Feb 28 12:52:44 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Feb 2022 12:52:44 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <QOs2NrTPCDyqrYnfUPE8_RpiZ8lNoO3-6WrVq1CnAJk=.0352888c-66d9-4b29-8bef-7561a7b8e52f@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
 <QOs2NrTPCDyqrYnfUPE8_RpiZ8lNoO3-6WrVq1CnAJk=.0352888c-66d9-4b29-8bef-7561a7b8e52f@github.com>
Message-ID: <h87OawcbCJ6QGLWY4spjNCRLM6FUs3ezMrzEx871CN4=.e58a8e8d-a646-412a-9fd2-fc990aac9d78@github.com>

On Thu, 24 Feb 2022 11:45:17 GMT, David Holmes <dholmes at openjdk.org> wrote:

> I ran some GC benchmarks which turned out to be just specjbb2005 and specjvm2008-*.
> 
> There were two regressions flagged:
> 
> Linux-x64: SPECjvm2008-LU.large-ZGC -5.82%
> macos-x64: SPECjvm2008-Serial-ParGC -4.16%

Myself, I never trust LU.large results, since they experience quite large run-to-run variance in our runs. Serial regression is weird, though, it is usually a very stable workload. Does it reproduce locally? If it does not reproduce, we can go ahead and deal with any regressions later.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567

From aph at openjdk.java.net  Mon Feb 28 14:26:46 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Feb 2022 14:26:46 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64
In-Reply-To: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
Message-ID: <qWCjv9hFYmshrurUWq83FHfd0rzPbNjVdHYdC0JkJgk=.518ef0ef-5ff0-4819-a0e3-9d75ccdd411e@github.com>

On Mon, 28 Feb 2022 12:28:39 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> 8282392: [zero] Build broken on AArch64

src/hotspot/share/utilities/macros.hpp line 543:

> 541: #define AARCH64_PORT_ONLY(code) code
> 542: #define NOT_AARCH64_PORT_ONLY(code)
> 543: #else

I don't think we need `NOT_AARCH64_PORT_ONLY`, and it's too confusing. Otherwise OK.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From thartmann at openjdk.java.net  Mon Feb 28 14:40:52 2022
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Mon, 28 Feb 2022 14:40:52 GMT
Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails
 with "RuntimeException: the value of full_count is wrong."
In-Reply-To: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>
References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>
Message-ID: <09ehUnw153f_DG6GP1nODViodtRmeK6PIJ8O-K3WfiM=.ce8a5999-448c-4f0b-acba-49bba4fec11f@github.com>

On Sat, 26 Feb 2022 13:14:57 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom.  It also makes full_count atomic so that the test in codeCache for printing is correct.  This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads.
> Tested with tier1-3 on linux and windows x64.

Looks good to me.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7629

From duke at openjdk.java.net  Mon Feb 28 14:57:48 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 28 Feb 2022 14:57:48 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64
In-Reply-To: <qWCjv9hFYmshrurUWq83FHfd0rzPbNjVdHYdC0JkJgk=.518ef0ef-5ff0-4819-a0e3-9d75ccdd411e@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <qWCjv9hFYmshrurUWq83FHfd0rzPbNjVdHYdC0JkJgk=.518ef0ef-5ff0-4819-a0e3-9d75ccdd411e@github.com>
Message-ID: <kuQtJ1Q7fUu7OFcN8Q_r2WFnev6AwMsLcJceptkfbRw=.8fcaf7e8-d9f4-4ee0-bdba-bbe0f34d2cb9@github.com>

On Mon, 28 Feb 2022 14:23:58 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> 8282392: [zero] Build broken on AArch64
>
> src/hotspot/share/utilities/macros.hpp line 543:
> 
>> 541: #define AARCH64_PORT_ONLY(code) code
>> 542: #define NOT_AARCH64_PORT_ONLY(code)
>> 543: #else
> 
> I don't think we need `NOT_AARCH64_PORT_ONLY`, and it's too confusing. Otherwise OK.

Agreed. Only added it to keep with the style of the file. Will remove.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From aph at openjdk.java.net  Mon Feb 28 15:16:52 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Feb 2022 15:16:52 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64
In-Reply-To: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
Message-ID: <pIXh96__C-1c7reC2iimgAxFgh0M2NzoAlc_t48Tdw4=.563f41ca-1b4c-4265-9f40-0f39b0dab6a1@github.com>

On Mon, 28 Feb 2022 12:28:39 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> 8282392: [zero] Build broken on AArch64

Marked as reviewed by aph (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From duke at openjdk.java.net  Mon Feb 28 16:05:19 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 28 Feb 2022 16:05:19 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v10]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <gGS_9pEXElnmrFLM91jDK7XGsC9Qjx-WUHvK_y579I0=.801e691c-8a55-4903-bf57-a3527a48dda7@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix problem related to NMT
  
  The problem is that registering a thread for NMT uses
  the os::is_first_C_frame method which calls Thread::enable_wx
  internally. But enable_wx requires that the init_wx method
  has been called before, not after.
  Swapping two lines therefore fixes the problem.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/1f08203f..1714b69a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=09
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=08-09

  Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From shade at openjdk.java.net  Mon Feb 28 16:21:37 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Feb 2022 16:21:37 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
Message-ID: <Xe2AWbv-_sIHWCFhcbVk1DPxUk4sWafVaoIdiE1t8T8=.117d9129-045d-46ed-8c3e-bafd2d23150c@github.com>

On Mon, 28 Feb 2022 16:18:07 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> 8282392: [zero] Build broken on AArch64
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove NOT_AARCH64_PORT_ONLY

I think it is confusing to have `AARCH64_PORT_ONLY` defines, to be honest. In the similar cases for X86, we just additionally protect these blocks with !ZERO. Something like:


#if defined(AARCH64) && !defined(ZERO)
ret_pc = pauth_strip_pointer(ret_pc);
#endif
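
To make the two styles under discussion concrete, here is a self-contained sketch. `DEMO_AARCH64`, `DEMO_ZERO`, and `demo_strip_pointer` are hypothetical stand-ins for the real build macros and for `pauth_strip_pointer`; the mask is purely illustrative, and the condition used for the helper macro is an assumption, since the thread does not show it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for pauth_strip_pointer(): clears the top byte.
// Real pointer-authentication stripping is platform-specific.
inline std::uint64_t demo_strip_pointer(std::uint64_t pc) {
  return pc & 0x00FFFFFFFFFFFFFFULL;
}

// Style 1: a helper macro that expands to its argument only on the port.
#if defined(DEMO_AARCH64) && !defined(DEMO_ZERO)
#define DEMO_AARCH64_PORT_ONLY(code) code
#else
#define DEMO_AARCH64_PORT_ONLY(code)
#endif

std::uint64_t return_pc_macro_style(std::uint64_t ret_pc) {
  DEMO_AARCH64_PORT_ONLY(ret_pc = demo_strip_pointer(ret_pc);)
  return ret_pc;
}

// Style 2: guard the block directly, with the condition visible in place.
std::uint64_t return_pc_guard_style(std::uint64_t ret_pc) {
#if defined(DEMO_AARCH64) && !defined(DEMO_ZERO)
  ret_pc = demo_strip_pointer(ret_pc);
#endif
  return ret_pc;
}

// Both styles must agree, whichever way the macros are set.
void demo_guard_self_check() {
  const std::uint64_t pc = 0xAB00000000001234ULL;
  assert(return_pc_macro_style(pc) == return_pc_guard_style(pc));
}
```

Both functions compile to the same thing; the difference is only whether the reader has to look up the macro definition to learn when the code is active.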

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From shade at openjdk.java.net  Mon Feb 28 16:21:38 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Feb 2022 16:21:38 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64
In-Reply-To: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
Message-ID: <yztT5Q-Tr-vO6AXAaSybA0ZkggPLci4ghbbQp71itKI=.6f2c4fd1-3bc4-4d03-a21e-7c39e0a8dca9@github.com>

On Mon, 28 Feb 2022 12:28:39 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

> 8282392: [zero] Build broken on AArch64

See for example: https://github.com/openjdk/jdk/blob/4e7fb41dafaf03baabe18ee1dabefed50d69e16d/src/hotspot/share/utilities/ticks.cpp#L66

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From duke at openjdk.java.net  Mon Feb 28 16:21:37 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 28 Feb 2022 16:21:37 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
Message-ID: <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>

> 8282392: [zero] Build broken on AArch64

Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:

  Remove NOT_AARCH64_PORT_ONLY

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7633/files
  - new: https://git.openjdk.java.net/jdk/pull/7633/files/d5952abc..edf11eae

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7633&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7633&range=00-01

  Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7633.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7633/head:pull/7633

PR: https://git.openjdk.java.net/jdk/pull/7633

From chagedorn at openjdk.java.net  Mon Feb 28 16:22:25 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Mon, 28 Feb 2022 16:22:25 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v5]
In-Reply-To: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
Message-ID: <Xk0ZAEKBg48k7SyVHxyMNTyNUVHAVEpBJi65h2DW-sY=.69ab83c8-6619-408d-bcfc-0305231e1e8c@github.com>

> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
> 
> Stack: [0x00007f6e01739000,0x00007f6e0183a000],  sp=0x00007f6e01838110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f
> 
> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
> 
> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
> 
> Stack: [0x00007f34fca18000,0x00007f34fcb19000],  sp=0x00007f34fcb17110,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)
> V  [libjvm.so+0x624b92]  Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec  (c1_Compiler.cpp:250)
> V  [libjvm.so+0x8303ef]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899  (compileBroker.cpp:2291)
> V  [libjvm.so+0x82f067]  CompileBroker::compiler_thread_loop()+0x3df  (compileBroker.cpp:1966)
> V  [libjvm.so+0x84f0d1]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69  (compilerThread.cpp:59)
> V  [libjvm.so+0x1209329]  JavaThread::thread_main_inner()+0x15d  (thread.cpp:1297)
> V  [libjvm.so+0x12091c9]  JavaThread::run()+0x167  (thread.cpp:1280)
> V  [libjvm.so+0x1206ada]  Thread::call_run()+0x180  (thread.cpp:358)
> V  [libjvm.so+0x1012e55]  thread_native_entry(Thread*)+0x18f  (os_linux.cpp:705)
> 
> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC versions may soon generate DWARF 5 by default, in which case either this parser needs to be extended or the HotSpot build needs to be configured to emit only DWARF 4.
> 
> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
> I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This makes it possible to follow the code without having to dive deep into the spec.
> 
> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.
> 
> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to exit gracefully and possibly omit source information for a stack frame, rather than risk cutting the hs_err file short because an assertion failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace`, which provides logging messages throughout parsing.
> 
> **Testing:**
> Apart from manual testing, I've added two kinds of tests:
> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (line numbers in particular can change quickly). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.
> 
> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.
> 
> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional  `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.
> 
> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
> 
> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`!
>  
> Thanks,
> Christian

Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 54 commits:

 - Updating some comments
 - Cleanup loading dwarf file and add summary
 - Review comments of first pass by Thomas except dwarf file loading
 - Merge branch 'master' into JDK-8242181
 - Make dwarf tag NOT_PRODUCT
 - Change log_* to log_develop_* and log_warning to log_develop_info
 - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
   
   Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>
 - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
   
   Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>
 - Better formatting of trace output
 - some code move and more cleanups
 - ... and 44 more: https://git.openjdk.java.net/jdk/compare/efd3967b...5bea4841

-------------

Changes: https://git.openjdk.java.net/jdk/pull/7126/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=04
  Stats: 2665 lines in 19 files changed: 2524 ins; 76 del; 65 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7126.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7126/head:pull/7126

PR: https://git.openjdk.java.net/jdk/pull/7126

From chagedorn at openjdk.java.net  Mon Feb 28 16:22:31 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Mon, 28 Feb 2022 16:22:31 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native
 stack traces in hs_err files [v4]
In-Reply-To: <ZTggm8ZVeipDBUrgoc_wPQNBHKTkO9HmIIQ1-6mFZDY=.d076d6b2-c6a8-423c-8f84-83a4c43f9a3f@github.com>
References: <b4LpGSdAhQPw3hzU9p273wI1RNp8jU2atUwgPbCN1yc=.7662be04-acc8-48eb-8d0e-b2e6e10d1e59@github.com>
 <PESchU9s7SJ30uIlIhCkZZZb84bvppiRwBPMFBNJvs0=.1308e4f4-a4af-427b-b2db-f13f2a05be3a@github.com>
 <ZTggm8ZVeipDBUrgoc_wPQNBHKTkO9HmIIQ1-6mFZDY=.d076d6b2-c6a8-423c-8f84-83a4c43f9a3f@github.com>
Message-ID: <5YeksetlUoja6cRgWtaorVsXCDLEQRw8s7B9W5UJOUE=.22f23a33-8877-4c3a-9b6d-f648fb1c4fc3@github.com>

On Tue, 22 Feb 2022 09:59:36 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

>> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Make dwarf tag NOT_PRODUCT
>
> src/hotspot/share/utilities/elfFile.cpp line 319:
> 
>> 317:     }
>> 318:     log_develop_info(dwarf)("No separate .debuginfo file for library %s. It already contains the required DWARF sections.", _filepath);
>> 319:     _dwarf_file = new (std::nothrow) DwarfFile(_filepath);
> 
> Would it be useful to explicitly bail out on a `nullptr` value here to avoid crashes below?

Yes, I think that's the right way. I changed other allocations as well to bail out.
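
As a rough illustration of the bail-out pattern (not the actual change in the PR), a `std::nothrow` allocation returns `nullptr` on failure instead of throwing, so the caller can simply return `false` and carry on without source information. `DemoDwarfFile` and `demo_load_dwarf_file` are hypothetical names.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the DwarfFile class.
struct DemoDwarfFile {
  const char* path;
  explicit DemoDwarfFile(const char* p) : path(p) {}
};

// Returns false rather than dereferencing a null pointer when the
// allocation fails; the caller then skips source information.
bool demo_load_dwarf_file(const char* filepath, DemoDwarfFile** out) {
  if (out == nullptr) {
    return false;
  }
  *out = new (std::nothrow) DemoDwarfFile(filepath);
  if (*out == nullptr) {
    return false;  // allocation failed: bail out, do not crash
  }
  return true;
}

// Usage check: on success the pointer is valid and owned by the caller.
void demo_load_self_check() {
  DemoDwarfFile* file = nullptr;
  if (demo_load_dwarf_file("libjvm.so", &file)) {
    assert(file != nullptr);
    delete file;
  }
}
```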

> src/hotspot/share/utilities/elfFile.cpp line 357:
> 
>> 355:   }
>> 356: 
>> 357:   strcpy(debug_pathname, _filepath);
> 
> I'm always a bit uneasy using "raw" `strcpy` instead of `strncpy` and friends. The code seems to be correct though.

Yes that's true. I updated usages while introducing a new helper class `DwarfFilePath`.
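
For readers unfamiliar with the concern, a bounded copy can be sketched like this. It is only an illustration of the general technique using `snprintf`, not the `DwarfFilePath` implementation; `demo_copy_path` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Copies src into dst, whose capacity is dst_size bytes including the
// terminator. Returns false if src did not fit; dst is still
// null-terminated in that case, holding a truncated copy.
bool demo_copy_path(char* dst, std::size_t dst_size, const char* src) {
  if (dst == nullptr || src == nullptr || dst_size == 0) {
    return false;
  }
  int written = std::snprintf(dst, dst_size, "%s", src);
  return written >= 0 && static_cast<std::size_t>(written) < dst_size;
}

// Usage check with a deliberately small buffer.
void demo_copy_self_check() {
  char buf[8];
  assert(demo_copy_path(buf, sizeof(buf), "libjvm"));
  assert(!demo_copy_path(buf, sizeof(buf), "libjvm.so.debuginfo"));
}
```

Unlike `strncpy`, `snprintf` always terminates the destination, and its return value tells the caller whether truncation happened.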

> src/hotspot/share/utilities/elfFile.cpp line 784:
> 
>> 782:   }
>> 783: 
>> 784:   if (!_reader.read_byte(&_header._address_size) || NOT_LP64(_header._address_size != 4)  LP64_ONLY( _header._address_size != 8)) {
> 
> Since this is the second time for the clause `|| NOT_LP64(_header._address_size != 4) LP64_ONLY( _header._address_size != 8)` maybe it is useful to make a constant out of the accepted address size somewhere instead of repeating this over and over.
> Its value could even be something like `sizeof(intptr_t)` or so.

I agree, I introduced a new constant `DwarfFile::ADDRESS_SIZE`.
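
A sketch of what such a constant might look like, assuming it is derived from the pointer size as suggested above (the actual definition of `DwarfFile::ADDRESS_SIZE` is not shown in this thread; `DEMO_ADDRESS_SIZE` and `demo_is_supported_address_size` are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for DwarfFile::ADDRESS_SIZE: 4 on 32-bit
// builds and 8 on 64-bit builds, without NOT_LP64/LP64_ONLY clauses.
constexpr std::uint8_t DEMO_ADDRESS_SIZE = sizeof(std::intptr_t);

// Replaces the repeated per-platform comparison with a single check.
bool demo_is_supported_address_size(std::uint8_t address_size) {
  return address_size == DEMO_ADDRESS_SIZE;
}

// Usage check: the native pointer size is accepted, others are not.
void demo_address_size_self_check() {
  assert(demo_is_supported_address_size(sizeof(void*)));
  assert(!demo_is_supported_address_size(3));
}
```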

> src/hotspot/share/utilities/elfFile.cpp line 1070:
> 
>> 1068:     // reason, GCC is currently using version 3 as specified in the DWARF 3 spec for the line number program even though GCC should
>> 1069:     // be using version 4 for DWARF 4 as it emits DWARF 4 by default.
>> 1070:     return false;
> 
> According to the specification (pg112):
> 
>> `version (uhalf)`
>> A version number (see Appendix F). This number is specific to the line number information
>> and is independent of the DWARF version number.
> 
> So this is just fine - actually things may break if the code accepted version 4 here assuming that there are breaking differences.
> On the other hand Appendix F mentions that DWARF4 contains .debug_line information in version 4.

The `LineNumberProgram` class should be able to handle both version 3 and 4. There are some differences (see `_dwarf_version` checks). But I found that GCC even mixes version 3 and 4:
https://github.com/chhagedorn/jdk/blob/820f0da65ab06b28ac75eec96d35269addda0246/src/hotspot/share/utilities/elfFile.cpp#L1302-L1308

> src/hotspot/share/utilities/elfFile.hpp line 211:
> 
>> 209: 
>> 210:   // Load the DWARF file (.debuginfo) that belongs to this file.
>> 211:   bool load_dwarf_file();
> 
> It would be nice to summarize from which places this method tries to load the debug info, so that readers do not need to dig for it in the method implementation.

Good suggestion. I added a summary and refactored the different loading attempts into separate methods together with a new class `DwarfFilePath` which makes it easier to prepare the different paths.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From duke at openjdk.java.net  Mon Feb 28 16:23:34 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 28 Feb 2022 16:23:34 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v11]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <inCDdVWgswvIjDJwkbXZq5VVPB2vs8j4G8AY4R3Fm0s=.f3210fe2-bf8e-46f2-b668-fdb7fd4e382a@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix small issues

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/1714b69a..c8223a75

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=10
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=09-10

  Stats: 35 lines in 5 files changed: 4 ins; 18 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From coleenp at openjdk.java.net  Mon Feb 28 16:26:35 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 28 Feb 2022 16:26:35 GMT
Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails
 with "RuntimeException: the value of full_count is wrong." [v2]
In-Reply-To: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>
References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>
Message-ID: <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com>

> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom.  It also makes full_count atomic so that the test in codeCache for printing is correct.  This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads.
> Tested with tier1-3 on linux and windows x64.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  I misunderstood the UseCodeCacheFlushing flag and made it act like MethodFlushing, which is a whole different flag.  Using MethodFlushing instead in the test makes it pass on loom and mainline.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7629/files
  - new: https://git.openjdk.java.net/jdk/pull/7629/files/7b790e07..03950bf0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7629&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7629&range=00-01

  Stats: 6 lines in 2 files changed: 0 ins; 2 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7629.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7629/head:pull/7629

PR: https://git.openjdk.java.net/jdk/pull/7629

From duke at openjdk.java.net  Mon Feb 28 16:28:27 2022
From: duke at openjdk.java.net (Johannes Bechberger)
Date: Mon, 28 Feb 2022 16:28:27 GMT
Subject: RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link
 access [v12]
In-Reply-To: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
References: <_oxztIwEWlkrlWHp2-w0-RHbm4iGxppT9zY8mcrKybE=.b4e356e3-2072-4c8c-94fe-41a62f4e48c8@github.com>
Message-ID: <ELnMixnlka-T4pzTBqOyvIZ8zY-Vr3hkPZCBx50u6J0=.8db43a6a-b089-4585-926c-e20f5a9d6fcf@github.com>

> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.

Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:

  Fix trailing whitespace

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7591/files
  - new: https://git.openjdk.java.net/jdk/pull/7591/files/c8223a75..219837e3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=11
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7591&range=10-11

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7591.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7591/head:pull/7591

PR: https://git.openjdk.java.net/jdk/pull/7591

From coleenp at openjdk.java.net  Mon Feb 28 16:39:46 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 28 Feb 2022 16:39:46 GMT
Subject: RFR: 8279573: compiler/codecache/CodeCacheFullCountTest.java fails
 with "RuntimeException: the value of full_count is wrong." [v2]
In-Reply-To: <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com>
References: <9kpGtp-T1jcm8LYcqrFjUB_VDRth_YnpgdLrarSonSQ=.66e97845-c0a9-4e82-b3e9-464cdffb2c72@github.com>
 <3Usf-CPfXE7q3-1QhVpOomY5LBNVtc9sr2iYbH1BWnQ=.6fdf0589-a203-41f7-8a31-119b8ad60edd@github.com>
Message-ID: <arGC9A3FMjTTB5umnT8D7IgBHSzu0igNXxdOnYCGFus=.2682fdcd-28fd-4aec-ab70-8859d0b1c0b4@github.com>

On Mon, 28 Feb 2022 16:26:35 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> This change adds a conditional to make -XX:-UseCodeCacheFlushing not flush the code cache so that the test passes on loom.  It also makes full_count atomic so that the test in codeCache for printing is correct.  This change also fixes the test because the full_count field and the message printing are not synchronized, so you can get 2 or more depending on the number of compiler threads.
>> Tested with tier1-3 on linux and windows x64.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   I misunderstood the UseCodeCacheFlushing flag and made it act like MethodFlushing, which is a whole different flag.  Using MethodFlushing instead in the test makes it pass on loom and mainline.

Thanks Tobias.  Erik asked me off PR why this UseCodeCacheFlushing flag didn't disable the NMethodSweeper completely, since I made it disable flushing methods.  Which made me aware of another flag that does what this test should want:

  product(bool, MethodFlushing, true,                                       \
          "Reclamation of zombie and not-entrant methods")                  \

vs.

  product(bool, UseCodeCacheFlushing, true,                                 \
          "Remove cold/old nmethods from the code cache")                   \

The latter flag disables removing cold methods from the code cache, whereas the former disables flushing.  I fixed the test to use MethodFlushing instead and verified that it passes on Loom and mainline.
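
On the "makes full_count atomic" part of the description, a minimal sketch of the idea follows. It uses `std::atomic` rather than HotSpot's own `Atomic` class, and `full_count`, `report_full` and `should_print_full_warning` are stand-ins chosen for illustration, not the declarations from the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the code heap's counter; not the HotSpot declaration.
static std::atomic<int> full_count{0};

// Atomically bump the counter and return the new value. Because the
// read-modify-write is a single atomic step, no two callers get the same value.
int report_full() {
  int now = full_count.fetch_add(1, std::memory_order_relaxed) + 1;
  assert(now >= 1);
  return now;
}

// Only the caller that observes the first transition prints the warning.
bool should_print_full_warning() { return report_full() == 1; }
```

With a plain increment followed by a separate read, two compiler threads could both read the same value and both decide to print, or neither. Deciding on the value returned by the atomic increment avoids that.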

-------------

PR: https://git.openjdk.java.net/jdk/pull/7629

From aph at openjdk.java.net  Mon Feb 28 16:42:54 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Feb 2022 16:42:54 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <Xe2AWbv-_sIHWCFhcbVk1DPxUk4sWafVaoIdiE1t8T8=.117d9129-045d-46ed-8c3e-bafd2d23150c@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
 <Xe2AWbv-_sIHWCFhcbVk1DPxUk4sWafVaoIdiE1t8T8=.117d9129-045d-46ed-8c3e-bafd2d23150c@github.com>
Message-ID: <X-3edmyRGu_ODdp-GSK_KjZDh-I9IJiMXvw3fISktzQ=.8da21794-40c0-4204-b438-5448f6aff919@github.com>

On Mon, 28 Feb 2022 16:16:12 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> I think it is confusing to have `AARCH64_PORT_ONLY` defines, to be honest. In the similar cases for X86, we just additionally protect these blocks with !ZERO. Something like:

That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From pchilanomate at openjdk.java.net  Mon Feb 28 16:49:49 2022
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Mon, 28 Feb 2022 16:49:49 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v4]
In-Reply-To: <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
Message-ID: <hMdUK7TV0gFPFxBLBeS9SaNtkgKIadzZwDjlEuaJA2A=.e7666ea8-e361-4eaf-a521-01ded231ff99@github.com>

On Sat, 26 Feb 2022 13:12:29 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix CDS ommission.

Adding that test case was a good idea :) 
Looks good!

Thanks,
Patricio

-------------

Marked as reviewed by pchilanomate (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From shade at openjdk.java.net  Mon Feb 28 16:51:47 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Feb 2022 16:51:47 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <X-3edmyRGu_ODdp-GSK_KjZDh-I9IJiMXvw3fISktzQ=.8da21794-40c0-4204-b438-5448f6aff919@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
 <Xe2AWbv-_sIHWCFhcbVk1DPxUk4sWafVaoIdiE1t8T8=.117d9129-045d-46ed-8c3e-bafd2d23150c@github.com>
 <X-3edmyRGu_ODdp-GSK_KjZDh-I9IJiMXvw3fISktzQ=.8da21794-40c0-4204-b438-5448f6aff919@github.com>
Message-ID: <GTdx7d5hPYokAaen16OKQH6oMYfKbKPPkibtZriXpnY=.cdbd772b-7c80-42b5-8a26-1c7a26da2478@github.com>

On Mon, 28 Feb 2022 16:39:48 GMT, Andrew Haley <aph at openjdk.org> wrote:

> That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV.

From the perspective of Zero maintenance, having the Zero-specific workarounds explicitly doing `!ZERO` is cleaner. This mess is mostly Zero's problem with identifying itself as a CPU. So, in my mind, there is little reason to accommodate that problem with "port" defines.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From shade at openjdk.java.net  Mon Feb 28 17:40:52 2022
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Feb 2022 17:40:52 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
Message-ID: <wrryyknf5vMcCmvkb6eani3bL_aIEr32UYz-qgkDkkw=.b44f657f-72f9-45bd-85e0-3dc475229d8a@github.com>

On Mon, 28 Feb 2022 16:21:37 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> 8282392: [zero] Build broken on AArch64
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove NOT_AARCH64_PORT_ONLY

Fine, let's do it in this form.

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7633

From aph at openjdk.java.net  Mon Feb 28 17:40:52 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Feb 2022 17:40:52 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <GTdx7d5hPYokAaen16OKQH6oMYfKbKPPkibtZriXpnY=.cdbd772b-7c80-42b5-8a26-1c7a26da2478@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
 <Xe2AWbv-_sIHWCFhcbVk1DPxUk4sWafVaoIdiE1t8T8=.117d9129-045d-46ed-8c3e-bafd2d23150c@github.com>
 <X-3edmyRGu_ODdp-GSK_KjZDh-I9IJiMXvw3fISktzQ=.8da21794-40c0-4204-b438-5448f6aff919@github.com>
 <GTdx7d5hPYokAaen16OKQH6oMYfKbKPPkibtZriXpnY=.cdbd772b-7c80-42b5-8a26-1c7a26da2478@github.com>
Message-ID: <_IW-vmjBMki32oD_xhjLWurDk5CGtAH0o1oWcS1-tsA=.11bfefec-625c-4c22-92a7-f12bc32d9fa3@github.com>

On Mon, 28 Feb 2022 16:48:35 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> > That's what we looked at and it was more of a mess, IMO. In the end it's a judgment call which to have, and I've seen this kind of mistake, where a particular port is confused with a particular CPU, enough times that I think this is OK; YMMV.
> 
> From the perspective of Zero maintenance, having the Zero-specific workarounds explicitly doing `!ZERO` is cleaner. This mess is mostly Zero's problem with identifying itself as a CPU. So, in my mind, there is little reason to accommodate that problem with "port" defines.

I think I understand your point, but IMO it's almost always easier to understand language which says what something is than what it isn't, and a simple name than a boolean expression. And that is more important, I believe.
Having said that, if you insist that flagging this up as a Zero-specific workaround with `!ZERO` is really important I will give way to your preference. (I don't think it is: I think we should flag this code as port-specific, not CPU-specific. But mostly I just want this patch pushed.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From duke at openjdk.java.net  Mon Feb 28 17:40:52 2022
From: duke at openjdk.java.net (Alan Hayward)
Date: Mon, 28 Feb 2022 17:40:52 GMT
Subject: RFR: 8282392: [zero] Build broken on AArch64 [v2]
In-Reply-To: <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
References: <qxz5VzW3Kcds7Z6DqY4YX_34ogZvnwbe8U_tlE-7dZY=.eac8cb0e-a371-44fc-98f0-faf4739fd996@github.com>
 <DTMgbkrV1e74kAWTbhNWuveNyoxHwbMpBkCcVjkN1QI=.5274cf02-31d1-4573-ab0d-bba7b0c9a3c7@github.com>
Message-ID: <jD0jYYYQCCZQRV_SHZ1Nl1BjLmqgnD35VkISddQ2rU4=.ad713fac-6e93-44bf-8d42-8d8db96fadfd@github.com>

On Mon, 28 Feb 2022 16:21:37 GMT, Alan Hayward <duke at openjdk.java.net> wrote:

>> 8282392: [zero] Build broken on AArch64
>
> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove NOT_AARCH64_PORT_ONLY

My only issue with that is that:
AARCH64_PORT_ONLY(some_function());

becomes:
#if defined(AARCH64) && !defined(ZERO)
some_function();
#endif

Which is a little uglier.

How about defining the macro something like:
#if defined(AARCH64) && !defined(ZERO)
#define AARCH64_NOT_ZERO(code) code


(ultimately, I'm happy with any of the above)
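
For completeness, a sketch of how the suggested macro could look once the `#else` branch and `#endif` are filled in. The `#define AARCH64` at the top is an assumption made only so the sketch compiles on its own (a real build gets it from the build system), and `g_calls` and `guarded_call_count` are hypothetical helpers used to observe the effect.

```cpp
#include <cassert>

// Assumption for this sketch only: pretend we are building the native AArch64
// port. In a real build these symbols come from the build system.
#ifndef AARCH64
#define AARCH64
#endif

#if defined(AARCH64) && !defined(ZERO)
#define AARCH64_NOT_ZERO(code) code
#else
#define AARCH64_NOT_ZERO(code)   // expands to nothing on other ports and on Zero
#endif

static int g_calls = 0;                      // hypothetical side effect to observe
static void some_function() { ++g_calls; }

// Returns how many times some_function() ran through the guarded call site.
int guarded_call_count() {
  g_calls = 0;
  AARCH64_NOT_ZERO(some_function());
  assert(g_calls == 0 || g_calls == 1);
  return g_calls;
}
```

The call site stays a one-liner, as with `AARCH64_PORT_ONLY`, while the macro name spells out the `!ZERO` condition.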

-------------

PR: https://git.openjdk.java.net/jdk/pull/7633

From duke at openjdk.java.net  Mon Feb 28 18:52:18 2022
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Mon, 28 Feb 2022 18:52:18 GMT
Subject: RFR: 8280872: Reorder code cache segments to improve code density
 [v2]
In-Reply-To: <LgdXzi8u2jSr15R9eo3H2u6GVC32HE1SoBEqmwRGpf8=.04e681a6-eb61-4222-b6e6-193ccd80eefe@github.com>
References: <xLxIBNvaur8wlO1DowHhztMnJjRUsL0kOE0M8xR_3T8=.a1fd6a29-a26f-41c4-ad96-385c038be79c@github.com>
 <LgdXzi8u2jSr15R9eo3H2u6GVC32HE1SoBEqmwRGpf8=.04e681a6-eb61-4222-b6e6-193ccd80eefe@github.com>
Message-ID: <6yR77yO0CGw6ciJPa97cS0O3PCsWznBy9x0x6ILWLZc=.43ad49ab-4ad0-49d9-9098-da4fef38dabf@github.com>

On Wed, 23 Feb 2022 21:52:11 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

>> Currently the codecache segment order is [non-nmethod, non-profiled, profiled]. With this change we move the non-nmethod segment between two code segments. It changes nothing for any platform besides AARCH.
>> 
>> In AARCH the offset limit for a branch instruction is 128MB. Bigger jumps are encoded with three instructions. Most far branches are jumps into the non-nmethod blobs. With the non-nmethod segment between the code segments, the jump distance from a method to a stub becomes shorter. The result is a 4% reduction in generated code size for the CodeCache range from 128MB to 240MB.
>> 
>> As a side effect, the performance of some tests is slightly improved:
>> ``ArraysFill.testCharFill      10  thrpt   15  170235.720 -> 178477.212  ops/ms``
>> 
>> Testing: jdk/hotspot jtreg and microbenchmarks on AMD and AARCH
>
> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
> 
>  - fix name: is_non_nmethod, adding target_needs_far_branch func
>  - change codecache segments order: nonprofiled-nonmethod-profiled
>    increase far jump threshold: sizeof(codecache)=128M -> sizeof(nonprofiled+nonmethod)=128M
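
The arithmetic behind the reordering can be sketched as follows. The segment sizes (8 MB non-nmethod, 116 MB for each code segment) are hypothetical values picked to fill a 240 MB cache, and `needs_far_branch` and `max_distance` are illustrative helpers, not the functions from the patch. The only architectural fact used is that AArch64 `B`/`BL` reach roughly +/-128 MB.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t MB = 1024 * 1024;

// AArch64 B/BL carry a signed 26-bit offset counted in 4-byte instructions,
// so a direct branch reaches [-128 MB, +128 MB).
constexpr int64_t kBranchRange = 128 * MB;

constexpr bool needs_far_branch(int64_t from, int64_t to) {
  return (to - from) < -kBranchRange || (to - from) >= kBranchRange;
}

struct Segment { int64_t start; int64_t size; };  // occupies [start, start + size)

// Upper bound on the distance between any address in `code` and any in `stubs`.
int64_t max_distance(Segment code, Segment stubs) {
  assert(code.size >= 0 && stubs.size >= 0);
  int64_t up   = (stubs.start + stubs.size) - code.start;  // code below, stubs above
  int64_t down = (code.start + code.size) - stubs.start;   // code above, stubs below
  return up > down ? up : down;
}

// Old order: [non-nmethod | non-profiled | profiled]
int64_t worst_case_old_order() {
  Segment stubs{0, 8 * MB}, profiled{124 * MB, 116 * MB};
  return max_distance(profiled, stubs);
}

// New order: [non-profiled | non-nmethod | profiled]
int64_t worst_case_new_order() {
  Segment non_profiled{0, 116 * MB}, stubs{116 * MB, 8 * MB},
          profiled{124 * MB, 116 * MB};
  int64_t a = max_distance(non_profiled, stubs);
  int64_t b = max_distance(profiled, stubs);
  return a > b ? a : b;
}
```

Under these assumed sizes the worst case drops from 240 MB to 124 MB, which is inside the direct branch range, so calls into the stubs no longer need the three-instruction form.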

src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 55:

> 53:   Label l;
> 54:   __ ldr(rscratch2, l);
> 55:   __ far_jump(ExternalAddress(entry_point), NULL, rscratch1, true);

This complicates `assemble_ic_buffer_code`. You need to know the `far_jump` implementation, especially how it generates NOPs. I understand why we need those NOPs.
Do we have calls of non-nmethod code here?

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 393:

> 391:   assert(CodeCache::find_blob(entry.target()) != NULL,
> 392:          "destination of far call not found in code cache");
> 393:   assert(CodeCache::is_non_nmethod(entry.target()), "must be a call to the code stub");

This restricts far calls to be calls of non-nmethod code.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4379:

> 4377:       postcond(pc() == badAddress);
> 4378:       return NULL;
> 4379:     }

I believe replacing `trampoline_call` by `far_call` should be a separate PR.

src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp line 533:

> 531:   address stub = NULL;
> 532: 
> 533:   if (a.codecache_branch_needs_far_jump()

I prefer it to be `a.target_needs_far_jump(dest)`. `codecache_branch` reads as if code cache branches need far jumps. That is strange because the code cache is just storage. It is the code generator that has to use far jumps.

src/hotspot/share/code/codeCache.cpp line 898:

> 896: }
> 897: 
> 898: size_t CodeCache::max_distance_to_codestub() {

`max_distance_to_non_nmethod_heap`?
As this is public API, it sounds strange without the start point.
If someone changes positions of the heap, would it work as expected?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7517

From hseigel at openjdk.java.net  Mon Feb 28 20:35:21 2022
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Mon, 28 Feb 2022 20:35:21 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v4]
In-Reply-To: <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
Message-ID: <HmxeajhC9el-9ihZZ4Y1Gp6IWsMLZ26HPWifxyNOJm8=.ff9c67ec-536d-4095-b048-1781d8393d4d@github.com>

On Sat, 26 Feb 2022 13:12:29 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix CDS ommission.

Looks good!  Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Mon Feb 28 20:35:21 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 28 Feb 2022 20:35:21 GMT
Subject: RFR: 8282240: Add _name field to Method for NOT_PRODUCT only [v4]
In-Reply-To: <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
 <4x8eIKVaHBOOreUjGmLKZWfrPj6hTOJmj4zWSktdUik=.78c30107-46ec-4dd0-be30-7eb6bc8f01d1@github.com>
Message-ID: <RqiksoQU8Bq47Tp39U4Ann2NHq5W8_NNtSxeILJa5HQ=.ea66e504-4da2-4f02-828a-f8948a7f7629@github.com>

On Sat, 26 Feb 2022 13:12:29 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
>> Tested with tier1 on Oracle platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix CDS ommission.

Thanks Harold and Patricio!

-------------

PR: https://git.openjdk.java.net/jdk/pull/7608

From coleenp at openjdk.java.net  Mon Feb 28 20:35:21 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 28 Feb 2022 20:35:21 GMT
Subject: Integrated: 8282240: Add _name field to Method for NOT_PRODUCT only
In-Reply-To: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
References: <-earTaon4tAWa42gIN_zQGm297N0MCypdcEyaBGY9CE=.69d09b35-1b7c-4428-b32a-9e7a3bee5aea@github.com>
Message-ID: <S0Dgxr5VNwyN3OU4caNo0E0NBOjdzAwlNrt6sNvoM4o=.113c1eb9-aec8-4a06-a1dc-fba5d80a4021@github.com>

On Thu, 24 Feb 2022 12:50:01 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> Whenever I'm debugging I really wish I knew the name of the method that I'm looking at, so I added this field in not-product.
> Tested with tier1 on Oracle platforms.

This pull request has now been integrated.

Changeset: c7cd1487
Author:    Coleen Phillimore <coleenp at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/c7cd1487fe00172be59e7571991f960c59b8c0eb
Stats:     42 lines in 5 files changed: 33 ins; 0 del; 9 mod

8282240: Add _name field to Method for NOT_PRODUCT only

Reviewed-by: pchilanomate, hseigel

-------------

PR: https://git.openjdk.java.net/jdk/pull/7608

From iklam at openjdk.java.net  Mon Feb 28 20:38:24 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 28 Feb 2022 20:38:24 GMT
Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime
 [v4]
In-Reply-To: <ZNXfiilUk7SSsjyBRUvvr8sk-MhZyJiq-A_PYe5HiwQ=.5b1327d9-7047-4b64-bc04-ec82e76406fc@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
 <vr6Kx9et3LNNBT76J3vEav7eYlVk9rmdzmd4CPVlzH0=.40bd2ef0-edba-4c4a-a36d-86e72b7a0079@github.com>
 <ZNXfiilUk7SSsjyBRUvvr8sk-MhZyJiq-A_PYe5HiwQ=.5b1327d9-7047-4b64-bc04-ec82e76406fc@github.com>
Message-ID: <eLovu-oohH_FVvhbffsS03w1K1aN0Wt9v3YTJgLzWFU=.1fed4fca-9fe3-4742-b8f0-cbd4bb38ff71@github.com>

On Thu, 17 Feb 2022 23:20:41 GMT, Calvin Cheung <ccheung at openjdk.org> wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Use InstanceKlass::do_local_static_fields for some field iterations
>
> Looks good. Minor comment below.
> Also, several files with copyright year 2021 need updating.

Thanks @calvinccheung and @coleenp for the review. Passed tiers 1-5.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6653

From iklam at openjdk.java.net  Mon Feb 28 20:38:24 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 28 Feb 2022 20:38:24 GMT
Subject: Integrated: 8275731: CDS archived enums objects are recreated at
 runtime
In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com>
Message-ID: <wVAUORDsMtyD93Se896wQvFmYn--rGmXEKy5DZxIJYA=.ed89f194-db79-4062-af3e-428b7c7f8817@github.com>

On Wed, 1 Dec 2021 20:47:20 GMT, Ioi Lam <iklam at openjdk.org> wrote:

> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
>     public static final Day SUNDAY = new Day("SUNDAY");
>     public static final Day MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirement of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time.
> 
> This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. Will run tier 5 as well.

This pull request has now been integrated.

Changeset: d983d108
Author:    Ioi Lam <iklam at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/d983d108c565654e717e2811d88aa94d982da2f5
Stats:     860 lines in 16 files changed: 807 ins; 4 del; 49 mod

8275731: CDS archived enums objects are recreated at runtime

Reviewed-by: coleenp, ccheung

-------------

PR: https://git.openjdk.java.net/jdk/pull/6653

From dholmes at openjdk.java.net  Mon Feb 28 23:35:03 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 28 Feb 2022 23:35:03 GMT
Subject: RFR: 8227369: pd_disjoint_words_atomic() needs to be atomic [v2]
In-Reply-To: <h87OawcbCJ6QGLWY4spjNCRLM6FUs3ezMrzEx871CN4=.e58a8e8d-a646-412a-9fd2-fc990aac9d78@github.com>
References: <5VWTTzHHgW3zN3B7ANKTF4_wjp5FEYlrXucH0Shx_Ig=.f3291823-90c1-4e61-8e21-916e664cd5a2@github.com>
 <k083U-feA36EplV5ZpjyF2Y0sEx7YOgX5mGpMvgagXA=.618b93d4-0c7c-46d5-952b-9b225fbf83ab@github.com>
 <QOs2NrTPCDyqrYnfUPE8_RpiZ8lNoO3-6WrVq1CnAJk=.0352888c-66d9-4b29-8bef-7561a7b8e52f@github.com>
 <h87OawcbCJ6QGLWY4spjNCRLM6FUs3ezMrzEx871CN4=.e58a8e8d-a646-412a-9fd2-fc990aac9d78@github.com>
Message-ID: <1HhKKfAKeHpy1WepMt3d0Zh2Gn21hvRaK8yJawdGbr8=.591981dc-f018-4419-84d9-df10cf211f3a@github.com>

On Mon, 28 Feb 2022 12:49:25 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> I ran some GC benchmarks which turned out to be just specjbb2005 and specjvm2008-*.
>> 
>> There were two regressions flagged:
>> 
>> Linux-x64: SPECjvm2008-LU.large-ZGC  -5.82%
>> macos-x64: SPECjvm2008-Serial-ParGC  -4.16%
>> 
>> However, Erik thinks these are just noise as apparently ZGC doesn't use these atomic copy routines, nor does he think ParGC does either.
>> 
>> Thoughts?
>
>> I ran some GC benchmarks which turned out to be just specjbb2005 and specjvm2008-*.
>> 
>> There were two regressions flagged:
>> 
>> Linux-x64: SPECjvm2008-LU.large-ZGC -5.82%
>> macos-x64: SPECjvm2008-Serial-ParGC -4.16%
> 
> Myself, I never trust LU.large results, since they experience quite large run-to-run variance in our runs. Serial regression is weird, though, it is usually a very stable workload. Does it reproduce locally? If it does not reproduce, we can go ahead and deal with any regressions later.

@shipilev  I can't run these benchmarks "locally" (I don't have the benchmarks nor a macOS system). I will try submitting another run just for that benchmark.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7567