From dongbohe at openjdk.java.net Tue Dec 1 05:12:09 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Tue, 1 Dec 2020 05:12:09 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v3] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8257145 Dongbo He has updated the pull request incrementally with one additional commit since the last revision: Refactor the code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1474/files - new: https://git.openjdk.java.net/jdk/pull/1474/files/04102cbc..dd3f9b7c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=01-02 Stats: 8 lines in 4 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/1474.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1474/head:pull/1474 PR: https://git.openjdk.java.net/jdk/pull/1474 From jiefu at openjdk.java.net Tue Dec 1 07:24:18 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 1 Dec 2020 07:24:18 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v2] In-Reply-To: References: Message-ID: <9o_FHDCBzJ72slxwHUDW88J2skp_jsYN9Ll8UfgwDc4=.46bac349-8d9e-4f2c-8fc5-30afeaec0ce2@github.com> > Hi all, > > Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. > For example, this assert [1] fired on our testing boxes. > > It can be reproduced by the following two steps on Linux-64: > 1) ulimit -v 8388608 > 2) java -XX:MinHeapSize=5g -version > The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. > > One more important fact is that this bug can be more common on Linux-32 systems. 
> Since the virtual memory is limited to 3800M [3] on Linux-32, it can always be reproduced when MinHeapSize > 1900M. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Refinement & jtreg test - Merge branch 'master' into JDK-8257230 - Merge branch 'master' into JDK-8257230 - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1492/files - new: https://git.openjdk.java.net/jdk/pull/1492/files/545d89a1..0389bc4d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=00-01 Stats: 13070 lines in 264 files changed: 8989 ins; 2442 del; 1639 mod Patch: https://git.openjdk.java.net/jdk/pull/1492.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1492/head:pull/1492 PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Tue Dec 1 07:29:03 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 1 Dec 2020 07:29:03 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v2] In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 23:31:20 GMT, Jie Fu wrote: >> I agree that the fix is in line with the current code and I guess setting `MinHeapSize` should override `MaxVirtMemFraction` and
allow us to use more than half the address space specified. >> >> In this case I think I would prefer moving the call `limit_by_allocatable_memory(reasonable_initial);` [1] to right after the calculation on line 1902 [2]. This way we would only have one line doing lower limiting and one line doing upper limiting. >> >> Makes sense? Or will that lead to some other problem? >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1902 > >> >> In this case I think I would prefer moving the call `limit_by_allocatable_memory(reasonable_initial);` [1] to right after the calculation on line 1902 [2]. This way we would only have one line doing lower limiting and one line doing upper limiting. >> > > Good suggestion! > > Will test it soon. > Thanks. No regression. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Tue Dec 1 07:29:02 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Tue, 1 Dec 2020 07:29:02 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v2] In-Reply-To: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> References: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> Message-ID: <_4GNImOBUb3blpdvanP_ipNRxS1CgI5_NkTObuyF6ig=.3ebdafee-db84-4485-beb5-4c90fef36c14@github.com> On Mon, 30 Nov 2020 13:42:56 GMT, Thomas Schatzl wrote: > I think the change is good, but please add a test for this. > > E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended. The jtreg test has been added, and the fix has been refined based on @kstefanj's suggestion. Thanks.
------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From stefank at openjdk.java.net Tue Dec 1 08:39:56 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 1 Dec 2020 08:39:56 GMT Subject: RFR: 8257415: ZGC: Fix barrier_data types In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 12:42:00 GMT, Per Liden wrote: > The `barrier_data` is an `uint8_t`, but we sometimes pass it around as an `int`. With this patch we always treat it as an `uint8_t`. Marked as reviewed by stefank (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1514 From pliden at openjdk.java.net Tue Dec 1 10:43:55 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 1 Dec 2020 10:43:55 GMT Subject: Integrated: 8257415: ZGC: Fix barrier_data types In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 12:42:00 GMT, Per Liden wrote: > The `barrier_data` is an `uint8_t`, but we sometimes pass it around as an `int`. With this patch we always treat it as an `uint8_t`. This pull request has now been integrated. Changeset: 021dced2 Author: Per Liden URL: https://git.openjdk.java.net/jdk/commit/021dced2 Stats: 7 lines in 4 files changed: 0 ins; 0 del; 7 mod 8257415: ZGC: Fix barrier_data types Reviewed-by: smonteith, stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/1514 From pliden at openjdk.java.net Tue Dec 1 10:43:54 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 1 Dec 2020 10:43:54 GMT Subject: RFR: 8257415: ZGC: Fix barrier_data types In-Reply-To: References: Message-ID: <2_dBqDO9TzC9UTzAnRq4HBobGpQLMUsoSMXNlK368VA=.27b1fcf1-e8a2-49c0-ad3c-8909a5b82a76@github.com> On Tue, 1 Dec 2020 08:37:38 GMT, Stefan Karlsson wrote: >> The `barrier_data` is an `uint8_t`, but we sometimes pass it around as an `int`. With this patch we always treat it as an `uint8_t`. > > Marked as reviewed by stefank (Reviewer). Thanks for reviewing! 
------------- PR: https://git.openjdk.java.net/jdk/pull/1514 From github.com+71302734+amitdpawar at openjdk.java.net Tue Dec 1 11:15:59 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 1 Dec 2020 11:15:59 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 13:14:53 GMT, Thomas Schatzl wrote: >> This PR fixes lower and default value of JVM flag PreTouchParallelChunkSize. Its default value is 1GB and is used by both G1GC and ParallelGC to pretouch the pages. Following test showed that reducing the chunk size improves JVM startup time and GC pause time. >> >> Tests are: (Test machine 2P 64C/128T with 1TB memory) >> 1. JVM startup time test with AdaptiveSizePolicy disabled: Pretouch 1TB of memory with/without transparent large page support and used time command to measure the time taken. >> Command: time ./jdk/bin/java -XX:+AlwaysPreTouch -XX:+-Xmx900g -Xms900g -Xmn800g -XX:SurvivorRatio=400 -Xlog:gc*=debug:file=gc.log -XX:ParallelGCThreads=128 -XX:PreTouchParallelChunkSize= -version >> 2. JVM startup and GC pause time test with AdaptiveSizePolicy enabled: SPECjbb composite run with 1TB heap and transparent large page support was enabled. >> >> Test results are recorded in XL file. [PreTouchParallelChunkSize_TestResults.xlsx](https://github.com/openjdk/jdk/files/5612448/PreTouchParallelChunkSize_TestResults.xlsx) >> >> Test results show: >> 1. With AdaptiveSizePolicy disabled. >> 1. G1GC improved up to ~14% on large page disabled and ~5% on enabled. >> 2. ParallelGC improved up to ~15% on large page disabled and ~5% on enabled. >> 3. Tests showed improvement from 64KB for default page size and 2MB for large page size. >> 4. Please check "JVM_Startup_Summary" sheet in XL file for more detail. >> >> 2. SPECjbb composite test with UseAdaptiveSizePolicy + UseLargePages enabled. >> 1. Pretouch takes up to 30-90% less time for memory range 32MB-4GB.
This happens because memory less than 1GB is also pretouched with multiple threads. >> 2. Same also helps to bring down GC pause time and this is dependent on memory size. Effect is larger when expansion size is smaller. >> 3. Please check SPECjbb_Summary sheet in XL file for more detail. >> >> Default value of PreTouchParallelChunkSize is changed to 4MB and based on your suggestion it can be changed to the right value. Please check and review this PR. > > Looks good, but could you undo the changes in pretouchTask.cpp? These break the rule to have all gang tasks with a "Running .... with ... workers" message. Also, this message is then printed for all pretouch actions - even when resizing the heap which can be quite annoying. > > Instead, the method could use a `GCTraceTime` instance to time the method. However I do not think this is really necessary or desired - imho in this case the caller should decide on whether it wants some log output, but others may have a different opinion :) > > Since a CSR is needed for changes to product flags like this, I started one with [JDK-8257419](https://bugs.openjdk.java.net/browse/JDK-8257419). Probably also needs a release note. Thanks Thomas for your reply. Log message was changed to include the time to make it easier for testing and reviewing. If not required will revert it back. Please suggest. Thanks, Amit ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From tschatzl at openjdk.java.net Tue Dec 1 11:44:54 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 1 Dec 2020 11:44:54 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits In-Reply-To: References: Message-ID: On Tue, 1 Dec 2020 11:13:18 GMT, Amit Pawar wrote: >> Looks good, but could you undo the changes in pretouchTask.cpp? These break the rule to have all gang tasks with a "Running .... with ... workers" message.
Also, this message is then printed for all pretouch actions - even when resizing the heap, which can be quite annoying. >> >> Instead, the method could use a `GCTraceTime` instance to time the method. However I do not think this is really necessary or desired - imho in this case the caller should decide on whether it wants some log output, but others may have a different opinion :) >> >> Since a CSR is needed for changes to product flags like this, I started one with [JDK-8257419](https://bugs.openjdk.java.net/browse/JDK-8257419). Probably also needs a release note. > > Thanks Thomas for your reply. Log message was changed to include the time to make it easier for testing and reviewing. If not required will revert it back. Please suggest. > > Thanks, > Amit Please remove what looks like debug code. ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From sjohanss at openjdk.java.net Tue Dec 1 11:44:55 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 1 Dec 2020 11:44:55 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits In-Reply-To: References: Message-ID: On Tue, 1 Dec 2020 11:13:18 GMT, Amit Pawar wrote: > Thanks Thomas for your reply. Log message was changed to include the time to make it easier for testing and reviewing. I agree with Thomas, I think we should revert the changes done in `pretouchTask.cpp`. ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From pliden at openjdk.java.net Tue Dec 1 12:35:57 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 1 Dec 2020 12:35:57 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 20:03:01 GMT, Chris Plummer wrote: >> Just a friendly ping. Still looking for reviewers for this fix. > >> Just a friendly ping. Still looking for reviewers for this fix.
> > Until we resolve the discussion in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987), I don't think your suggested fix should be applied since it could be viewed as a workaround to a debug agent issue (not shutting down GC during `VM.suspendAll`) or as something that needs to be clarified in the JDI and JDWP specs (checking for `ObjectReference.disableCollection` failures, even when under `VM.suspendAll`, and retrying the allocation). I'd like to see the discussion resolved and follow-on bugs filed. Sorry, I had missed your latest reply in JDK-8255987. Let's continue the discussion there. ------------- PR: https://git.openjdk.java.net/jdk/pull/1348 From dongbohe at openjdk.java.net Tue Dec 1 15:03:11 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Tue, 1 Dec 2020 15:03:11 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v4] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8257145 Dongbo He has updated the pull request incrementally with one additional commit since the last revision: fix build error on aarch64 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1474/files - new: https://git.openjdk.java.net/jdk/pull/1474/files/dd3f9b7c..0aa22448 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=02-03 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1474.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1474/head:pull/1474 PR: https://git.openjdk.java.net/jdk/pull/1474 From sjohanss at openjdk.java.net Tue Dec 1 15:43:02 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 1 Dec 2020 15:43:02 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v2] In-Reply-To:
References: Message-ID: On Mon, 30 Nov 2020 10:32:36 GMT, Thomas Schatzl wrote: >> Dongbo He has updated the pull request incrementally with one additional commit since the last revision: >> >> store the "default size" for the PLAB in the PLABStats > > Changes requested by tschatzl (Reviewer). I think the move to use ParallelGCThreads in `g1EvacStats.cpp` is good, please also add: #include "runtime/globals.hpp" To not rely on other includes. ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From github.com+71302734+amitdpawar at openjdk.java.net Tue Dec 1 16:41:09 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 1 Dec 2020 16:41:09 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: Message-ID: > This PR fixes lower and default value of JVM flag PreTouchParallelChunkSize. Its default value is 1GB and is used by both G1GC and ParallelGC to pretouch the pages. Following test showed that reducing the chunk size improves JVM startup time and GC pause time. > > Tests are: (Test machine 2P 64C/128T with 1TB memory) > 1. JVM startup time test with AdaptiveSizePolicy disabled: Pretouch 1TB of memory with/without transparent large page support and used time command to measure the time taken. > Command: time ./jdk/bin/java -XX:+AlwaysPreTouch -XX:+-Xmx900g -Xms900g -Xmn800g -XX:SurvivorRatio=400 -Xlog:gc*=debug:file=gc.log -XX:ParallelGCThreads=128 -XX:PreTouchParallelChunkSize= -version > 2. JVM startup and GC pause time test with AdaptiveSizePolicy enabled: SPECjbb composite run with 1TB heap and transparent large page support was enabled. > > Test results are recorded in XL file. [PreTouchParallelChunkSize_TestResults.xlsx](https://github.com/openjdk/jdk/files/5612448/PreTouchParallelChunkSize_TestResults.xlsx) > > Test results show: > 1. With AdaptiveSizePolicy disabled. > 1. G1GC improved up to ~14% on large page disabled and ~5% on enabled. > 2.
ParallelGC improved up to ~15% on large page disabled and ~5% on enabled. > 3. Tests showed improvement from 64KB for default page size and 2MB for large page size. > 4. Please check "JVM_Startup_Summary" sheet in XL file for more detail. > > 2. SPECjbb composite test with UseAdaptiveSizePolicy + UseLargePages enabled. > 1. Pretouch takes up to 30-90% less time for memory range 32MB-4GB. This happens because memory less than 1GB is also pretouched with multiple threads. > 2. Same also helps to bring down GC pause time and this is dependent on memory size. Effect is larger when expansion size is smaller. > 3. Please check SPECjbb_Summary sheet in XL file for more detail. > > Default value of PreTouchParallelChunkSize is changed to 4MB and based on your suggestion it can be changed to the right value. Please check and review this PR. Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: Reverted changes in pretouchTask.cpp. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1503/files - new: https://git.openjdk.java.net/jdk/pull/1503/files/8ef5ed7c..ef6fa419 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1503&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1503&range=00-01 Stats: 16 lines in 1 file changed: 3 ins; 12 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1503.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1503/head:pull/1503 PR: https://git.openjdk.java.net/jdk/pull/1503 From github.com+71302734+amitdpawar at openjdk.java.net Tue Dec 1 16:41:09 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 1 Dec 2020 16:41:09 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: Message-ID: On Tue, 1 Dec 2020 11:42:13 GMT, Stefan Johansson wrote: >> Thanks Thomas for your reply. Log message was changed to include the time to make it easier for testing and reviewing.
If not required will revert it back. Please suggest. >> >> Thanks, >> Amit > >> Thanks Thomas for your reply. Log message was changed to include the time to make it easier for testing and reviewing. If not required will revert it back. Please suggest. > > I agree with Thomas, I think we should revert the changes done in `pretouchTask.cpp`. Done. Thanks, Amit ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From tschatzl at openjdk.java.net Tue Dec 1 16:44:59 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 1 Dec 2020 16:44:59 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: Message-ID: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> On Tue, 1 Dec 2020 16:41:09 GMT, Amit Pawar wrote: >> This PR fixes lower and default value of JVM flag PreTouchParallelChunkSize. Its default value is 1GB and is used by both G1GC and ParallelGC to pretouch the pages. Following test showed that reducing the chunk size improves JVM startup time and GC pause time. >> >> Tests are: (Test machine 2P 64C/128T with 1TB memory) >> 1. JVM startup time test with AdaptiveSizePolicy disabled: Pretouch 1TB of memory with/without transparent large page support and used time command to measure the time taken. >> Command: time ./jdk/bin/java -XX:+AlwaysPreTouch -XX:+-Xmx900g -Xms900g -Xmn800g -XX:SurvivorRatio=400 -Xlog:gc*=debug:file=gc.log -XX:ParallelGCThreads=128 -XX:PreTouchParallelChunkSize= -version >> 2. JVM startup and GC pause time test with AdaptiveSizePolicy enabled: SPECjbb composite run with 1TB heap and transparent large page support was enabled. >> >> Test results are recorded in XL file. [PreTouchParallelChunkSize_TestResults.xlsx](https://github.com/openjdk/jdk/files/5612448/PreTouchParallelChunkSize_TestResults.xlsx) >> >> Test results show: >> 1. With AdaptiveSizePolicy disabled. >> 1.
G1GC improved up to ~14% on large page disabled and ~5% on enabled. >> 2. ParallelGC improved up to ~15% on large page disabled and ~5% on enabled. >> 3. Tests showed improvement from 64KB for default page size and 2MB for large page size. >> 4. Please check "JVM_Startup_Summary" sheet in XL file for more detail. >> >> 2. SPECjbb composite test with UseAdaptiveSizePolicy + UseLargePages enabled. >> 1. Pretouch takes up to 30-90% less time for memory range 32MB-4GB. This happens because memory less than 1GB is also pretouched with multiple threads. >> 2. Same also helps to bring down GC pause time and this is dependent on memory size. Effect is larger when expansion size is smaller. >> 3. Please check SPECjbb_Summary sheet in XL file for more detail. >> >> Default value of PreTouchParallelChunkSize is changed to 4MB and based on your suggestion it can be changed to the right value. Please check and review this PR. > > Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: > > Reverted changes in pretouchTask.cpp. Lgtm. We need to wait until the CSR has been approved. This typically happens on Thursdays. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1503 From github.com+71302734+amitdpawar at openjdk.java.net Tue Dec 1 16:53:57 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 1 Dec 2020 16:53:57 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> References: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> Message-ID: On Tue, 1 Dec 2020 16:42:01 GMT, Thomas Schatzl wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Reverted changes in pretouchTask.cpp. > > Lgtm.
> > We need to wait until the CSR has been approved. This typically happens on Thursdays. Thanks Thomas and Stefan for reviewing and approving the changes. Will wait until csr approval. ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 1 20:06:17 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 1 Dec 2020 20:06:17 GMT Subject: RFR: 8256155: 2M large pages for code when LargePageSizeInBytes is set to 1G for heap [v3] In-Reply-To: References: Message-ID: > Add 2M LargePages to _page_sizes > > Use 2m pages for large page requests > less than 1g on linux when 1G are default > pages > > - Add os::Linux::large_page_size_2m() that > returns 2m as size > - Add os::Linux::select_large_page_size() to return > correct large page size for size_t bytes > - Add 2m size to _page_sizes array > - Update reserve_memory_special methods > to set/use large_page_size based on bytes reserved > - Update large page not reserved warnings > to include large_page_size attempted > - Update TestLargePageUseForAuxMemory.java > to expect 2m large pages in some instances > > Signed-off-by: Marcus G K Williams Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams - Merge remote-tracking branch 'upstream/master' into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/7092bec8..57e54963 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=01-02 Stats: 204961 lines in 1352 files changed: 133754 ins; 50869 del; 20338 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From zgu at openjdk.java.net Tue 
Dec 1 23:05:11 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 1 Dec 2020 23:05:11 GMT Subject: RFR: 8255019: Shenandoah: Split STW and concurrent mark into separate classes [v17] In-Reply-To: References: Message-ID: > This is the first part of a refactoring that aims to isolate three Shenandoah GC modes (concurrent, degenerated and full gc). > > Shenandoah started with two GC modes, concurrent and full gc, with minimal shared code, mainly in mark phase. After introducing degenerated GC, it shared quite a large portion of code with concurrent GC, with the concept that degenerated GC can simply pick up remaining work of concurrent GC in STW mode. > > It was not a big problem at that time, since concurrent GC also processed roots STW. Since Shenandoah gradually moved root processing into concurrent phase, code started to diverge, which made the code hard to reason about and maintain. > > First step, I would like to split STW and concurrent mark, so that: > 1) Code does not have to special case for STW and concurrent mark. > 2) STW mark does not need to rendezvous workers between root mark and the rest of mark. > 3) STW mark does not need to activate SATB barrier and drain SATB buffers. > 4) STW mark does not need to remark some of roots. > > The patch mainly just shuffles code. It creates a base class ShenandoahMark, and moves shared code (from the current shenandoahConcurrentMark) into this base class. I did 'git mv shenandoahConcurrentMark.inline.hpp shenandoahMark.inline.hpp', but git does not seem to reflect that. > > A few changes: > 1) Moved task queue set from ShenandoahConcurrentMark to ShenandoahHeap. ShenandoahMark and its subclasses are stateless. Instead, mark states are maintained in task queue, mark bitmap and SATB buffers, so that they can be created on demand. > 2) Split ShenandoahConcurrentRootScanner template into ShenandoahConcurrentRootScanner and ShenandoahSTWRootScanner > 3) Split code inside op_final_mark code into finish_mark and prepare_evacuation helper functions.
> 4) Made ShenandoahMarkCompact stack allocated (as well as ShenandoahConcurrentGC and ShenandoahDegeneratedGC in upcoming refactoring) Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Silent valgrind on potential memory leak - Merge branch 'master' into JDK-8255019-sh-mark - Removed ShenandoahConcurrentMark parameter from concurrent GC entry/op, etc. - Merge branch 'master' into JDK-8255019-sh-mark - Merge - Moved task queues to marking context - Merge - Merge branch 'master' into JDK-8255019-sh-mark - Merge branch 'master' into JDK-8255019-sh-mark - Merge branch 'master' into JDK-8255019-sh-mark - ... and 14 more: https://git.openjdk.java.net/jdk/compare/c5046ca5...367c9fc7 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1009/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=16 Stats: 1947 lines in 22 files changed: 1067 ins; 742 del; 138 mod Patch: https://git.openjdk.java.net/jdk/pull/1009.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1009/head:pull/1009 PR: https://git.openjdk.java.net/jdk/pull/1009 From jiefu at openjdk.java.net Wed Dec 2 00:08:55 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 2 Dec 2020 00:08:55 GMT Subject: RFR: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow [v4] In-Reply-To: <4o_lK9LVq3ycbKpVI_NP7-B8wIHzLymHXbWGUbzUmWg=.16b9f77c-888a-4305-a389-15333330f599@github.com> References: <4o_lK9LVq3ycbKpVI_NP7-B8wIHzLymHXbWGUbzUmWg=.16b9f77c-888a-4305-a389-15333330f599@github.com> Message-ID: On Mon, 30 Nov 2020 11:30:26 GMT, Thomas Schatzl wrote: > Lgtm Thanks @tschatzl for your review. @kimbarrett , are you OK with this change? Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1489 From dongbohe at openjdk.java.net Wed Dec 2 01:36:54 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Wed, 2 Dec 2020 01:36:54 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v2] In-Reply-To: References: Message-ID: On Tue, 1 Dec 2020 15:40:04 GMT, Stefan Johansson wrote: > I think the move to use ParallelGCThreads in `g1EvacStats.cpp` is good, please also add: > > ``` > #include "runtime/globals.hpp" > ``` > > To not rely on other includes. Do you mean adding `#include "runtime/globals.hpp"` to `plab.hpp` on [Refactor the code](https://github.com/openjdk/jdk/pull/1474/commits/dd3f9b7cdca5d400c7b2296c3eee92e1c414a2bb)? ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From kbarrett at openjdk.java.net Wed Dec 2 03:36:00 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Dec 2020 03:36:00 GMT Subject: RFR: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow [v4] In-Reply-To: References: Message-ID: On Mon, 30 Nov 2020 10:23:09 GMT, Jie Fu wrote: >> Hi all, >> >> SIGFPE was observed by running: >> java -XX:G1ConcRefinementThresholdStep=16G -XX:G1UpdateBufferSize=1G -version >> >> The reason is that buffers_to_cards [1] returns 0 for 'step' due to overflow. >> It would be better to add overflow check logic to it. >> >> Testing: >> - tier1 on Linux/x64 >> >> Thanks. >> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp#L235 > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Only run the test on 64-bit machines Changes requested by kbarrett (Reviewer).
src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 255:

> 253: static size_t calc_init_green_zone() {
> 254:   size_t green = G1ConcRefinementGreenZone;
> 255:   char* name = (char*) "G1ConcRefinementGreenZone";

Change the type of name to `const char*` and eliminate the cast here and on line 258. ------------- PR: https://git.openjdk.java.net/jdk/pull/1489 From jiefu at openjdk.java.net Wed Dec 2 04:12:16 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 2 Dec 2020 04:12:16 GMT Subject: RFR: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow [v5] In-Reply-To: References: Message-ID: > Hi all, > > SIGFPE was observed by running: > java -XX:G1ConcRefinementThresholdStep=16G -XX:G1UpdateBufferSize=1G -version > > The reason is that buffers_to_cards [1] returns 0 for 'step' due to overflow. > It would be better to add overflow check logic for it. > > Testing: > - tier1 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp#L235 Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains eight additional commits since the last revision: - Eliminate the casts - Merge branch 'master' into JDK-8257228 - Only run the test on 64-bit machines - Fix build error without PCH - Merge branch 'master' into JDK-8257228 - Refine the erro msg - Fix mul-overflow-check and error reporting - 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1489/files - new: https://git.openjdk.java.net/jdk/pull/1489/files/8bdeb20a..efa0946c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1489&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1489&range=03-04 Stats: 11737 lines in 303 files changed: 9388 ins; 869 del; 1480 mod Patch: https://git.openjdk.java.net/jdk/pull/1489.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1489/head:pull/1489 PR: https://git.openjdk.java.net/jdk/pull/1489 From jiefu at openjdk.java.net Wed Dec 2 04:15:58 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 2 Dec 2020 04:15:58 GMT Subject: RFR: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow [v4] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 03:31:53 GMT, Kim Barrett wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Only run the test on 64-bit machines > > src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 255: > >> 253: static size_t calc_init_green_zone() { >> 254: size_t green = G1ConcRefinementGreenZone; >> 255: char* name = (char*) "G1ConcRefinementGreenZone"; > > Change the type of name to `const char*` and eliminate the cast here and on line 258. Amazing! It gets compiled without the casts by just adding 'const'. Updated. Thanks. 
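The reason adding `const` removes the need for the cast: a string literal has type `const char[N]`, which converts implicitly to `const char*` but to `char*` only via a cast (and that conversion is ill-formed in modern C++). A small illustration with a hypothetical helper, not the PR's code:

```cpp
#include <cassert>
#include <cstring>

// char* p = "G1ConcRefinementGreenZone";          // ill-formed in C++11 and later
// char* q = (char*) "G1ConcRefinementGreenZone";  // the cast being removed
static const char* name = "G1ConcRefinementGreenZone";  // no cast needed

// Hypothetical helper just to exercise the pointer.
static bool flag_name_matches(const char* n) {
  return std::strcmp(n, "G1ConcRefinementGreenZone") == 0;
}
```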
------------- PR: https://git.openjdk.java.net/jdk/pull/1489 From iklam at openjdk.java.net Wed Dec 2 05:28:01 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Dec 2020 05:28:01 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler Message-ID: Please review this trivial fix: epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unnecessarily includes barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. Removing this one line reduced the total number of header inclusions for building HotSpot from 260096 to 258193, or about 0.8%. ------------- Commit messages: - 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler Changes: https://git.openjdk.java.net/jdk/pull/1554/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1554&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257565 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1554.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1554/head:pull/1554 PR: https://git.openjdk.java.net/jdk/pull/1554 From kbarrett at openjdk.java.net Wed Dec 2 06:41:59 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Dec 2020 06:41:59 GMT Subject: RFR: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow [v5] In-Reply-To: References: Message-ID: <3QS-eTXobMvM2Ov1_AiCx3TeFYA9o_4CsoD1_qhBtvo=.5470bb6e-de20-48f6-8c85-d90676c98ae8@github.com> On Wed, 2 Dec 2020 04:12:16 GMT, Jie Fu wrote: >> Hi all, >> >> SIGFPE was observed by running: >> java -XX:G1ConcRefinementThresholdStep=16G -XX:G1UpdateBufferSize=1G -version >> >> The reason is that buffers_to_cards [1] returns 0 for 'step' due to overflow. >> It would be better to add overflow check logic for it.
>> >> Testing: >> - tier1 on Linux/x64 >> >> Thanks. >> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp#L235 > > Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Eliminate the casts > - Merge branch 'master' into JDK-8257228 > - Only run the test on 64-bit machines > - Fix build error without PCH > - Merge branch 'master' into JDK-8257228 > - Refine the erro msg > - Fix mul-overflow-check and error reporting > - 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1489 From kbarrett at openjdk.java.net Wed Dec 2 06:49:57 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Dec 2020 06:49:57 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 05:22:04 GMT, Ioi Lam wrote: > Please review this trivial fix: > > epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. > > Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. Marked as reviewed by kbarrett (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/1554 From jiefu at openjdk.java.net Wed Dec 2 06:55:01 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 2 Dec 2020 06:55:01 GMT Subject: Integrated: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow In-Reply-To: References: Message-ID: On Sat, 28 Nov 2020 08:26:57 GMT, Jie Fu wrote: > Hi all, > > SIGFPE was observed by running: > java -XX:G1ConcRefinementThresholdStep=16G -XX:G1UpdateBufferSize=1G -version > > The reason is that buffers_to_cards [1] returns 0 for 'step' due to overflow. > It would be better to add overflow check logic is it. > > Testing: > - tier1 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp#L235 This pull request has now been integrated. Changeset: f2a0988a Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/f2a0988a Stats: 64 lines in 2 files changed: 57 ins; 0 del; 7 mod 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/1489 From stuefe at openjdk.java.net Wed Dec 2 07:06:55 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 2 Dec 2020 07:06:55 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 05:22:04 GMT, Ioi Lam wrote: > Please review this trivial fix: > > epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. > > Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. 
Marked as reviewed by stuefe (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1554 From stuefe at openjdk.java.net Wed Dec 2 07:06:56 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 2 Dec 2020 07:06:56 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 06:47:01 GMT, Kim Barrett wrote: >> Please review this trivial fix: >> >> epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. >> >> Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. > > Marked as reviewed by kbarrett (Reviewer). LGTM. ------------- PR: https://git.openjdk.java.net/jdk/pull/1554 From shade at openjdk.java.net Wed Dec 2 07:12:56 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 2 Dec 2020 07:12:56 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler In-Reply-To: References: Message-ID: <-B1IDVGAMaX2K2JmXlLoplCIE062Jcsnfnk7B9zdoPw=.f3d66523-7ee5-43fd-ad2f-73f501f445e9@github.com> On Wed, 2 Dec 2020 05:22:04 GMT, Ioi Lam wrote: > Please review this trivial fix: > > epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. > > Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. Changes requested by shade (Reviewer). 
src/hotspot/share/gc/epsilon/epsilonBarrierSet.hpp line 2: > 1: /* > 2: * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. Why the copyright line addition, though? It is not like Red Hat adds its copyrights for trivial changes to shared files like these. ------------- PR: https://git.openjdk.java.net/jdk/pull/1554 From iklam at openjdk.java.net Wed Dec 2 07:23:07 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Dec 2020 07:23:07 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler [v2] In-Reply-To: References: Message-ID: > Please review this trivial fix: > > epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. > > Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: removed copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1554/files - new: https://git.openjdk.java.net/jdk/pull/1554/files/3949a648..82b91085 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1554&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1554&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1554.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1554/head:pull/1554 PR: https://git.openjdk.java.net/jdk/pull/1554 From iklam at openjdk.java.net Wed Dec 2 07:23:08 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Dec 2020 07:23:08 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler [v2] In-Reply-To: <-B1IDVGAMaX2K2JmXlLoplCIE062Jcsnfnk7B9zdoPw=.f3d66523-7ee5-43fd-ad2f-73f501f445e9@github.com> References: <-B1IDVGAMaX2K2JmXlLoplCIE062Jcsnfnk7B9zdoPw=.f3d66523-7ee5-43fd-ad2f-73f501f445e9@github.com> Message-ID: On Wed, 2 Dec 2020 07:09:37 GMT, Aleksey Shipilev wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> removed copyright > > src/hotspot/share/gc/epsilon/epsilonBarrierSet.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. > > Why the copyright line addition, though? It is not like Red Hat adds its copyrights for trivial changes to shared files like these. Removed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1554 From shade at openjdk.java.net Wed Dec 2 07:28:01 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 2 Dec 2020 07:28:01 GMT Subject: RFR: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler [v2] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 07:23:07 GMT, Ioi Lam wrote: >> Please review this trivial fix: >> >> epsilonBarrierSet.hpp is included (recursively via access.hpp) by many CPP files. It unncessarily includes of barrierSetAssembler.hpp, which causes many of the native code assembler header files to be unnecessarily included by many HotSpot CPP files that do not deal with native code assembly. >> >> Removing this one line reduced the total number of header inclusion for building HotSpot from 260096 to 258193, or about 0.8%. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > removed copyright Looks good! Thanks, I'll take care of backports, if any. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1554 From sjohanss at openjdk.java.net Wed Dec 2 10:06:56 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 2 Dec 2020 10:06:56 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> Message-ID: <9Ajth2BhjNfDOxM_0EmQ_MU3ymGzd7t0VToXi7afkPc=.b285e9f5-10ba-43cf-9156-c913444087bc@github.com> On Tue, 1 Dec 2020 16:51:20 GMT, Amit Pawar wrote: >> Lgtm. >> >> We need to wait until the CSR has been approved. This typically happens on Thursdays. > > Thanks Thomas and Stefan for reviewing and approving the changes. Will wait until csr approval. I did some performance runs and found that on Windows this change will not speed up pre-touching. I see some quite big regressions in some cases. 
So I don't think we can do this change for all platforms without doing more benchmarking. But since it looks good on Linux, one solution would be to make `PreTouchParallelChunkSize` a platform-dependent flag and set it to 4M for Linux and keep it at 1G for the others until we can do more investigations. For guidance on how to make it a platform-dependent flag you can look at how `UseLargePages` is handled. ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From sjohanss at openjdk.java.net Wed Dec 2 10:13:55 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 2 Dec 2020 10:13:55 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 01:34:32 GMT, Dongbo He wrote: >> I think the move to use ParallelGCThreads in `g1EvacStats.cpp` is good, please also add: >> #include "runtime/globals.hpp" >> >> To not rely on other includes. > >> I think the move to use ParallelGCThreads in `g1EvacStats.cpp` is good, please also add: >> >> ``` >> #include "runtime/globals.hpp" >> ``` >> >> To not rely on other includes. > > Do you mean adding `#include "runtime/globals.hpp"` to `plab.hpp` on [Refactor the code](https://github.com/openjdk/jdk/pull/1474/commits/dd3f9b7cdca5d400c7b2296c3eee92e1c414a2bb)? Please add it to `g1EvacStats.cpp`, the refactoring was good. ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From dongbohe at openjdk.java.net Wed Dec 2 10:53:55 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Wed, 2 Dec 2020 10:53:55 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 10:10:52 GMT, Stefan Johansson wrote: > Please add it to `g1EvacStats.cpp`, the refactoring was good. 
When I add it to g1EvacStats.cpp on [Refactor the code](https://github.com/openjdk/jdk/pull/1474/commits/dd3f9b7cdca5d400c7b2296c3eee92e1c414a2bb), like this:

diff --git a/src/hotspot/share/gc/g1/g1EvacStats.cpp b/src/hotspot/share/gc/g1/g1EvacStats.cpp
index f8851b55dda..3f0f1b76cea 100644
--- a/src/hotspot/share/gc/g1/g1EvacStats.cpp
+++ b/src/hotspot/share/gc/g1/g1EvacStats.cpp
@@ -27,6 +27,7 @@
 #include "gc/shared/gcId.hpp"
 #include "logging/log.hpp"
 #include "memory/allocation.inline.hpp"
+#include "runtime/globals.hpp"

 void G1EvacStats::log_plab_allocation() {
   PLABStats::log_plab_allocation();

I still get an error in the build:

=== Output from failing command(s) repeated here ===
* For target hotspot_variant-server_libjvm_objs_g1EvacStats.o:
In file included from /home/hedongbo/temp/jdk/src/hotspot/share/gc/g1/g1EvacStats.hpp:28:0,
                 from /home/hedongbo/temp/jdk/src/hotspot/share/gc/g1/g1EvacStats.cpp:26:
   _desired_net_plab_sz(default_per_thread_plab_size * ParallelGCThreads),
                                                       ^~~~~~~~~~~~~~~~~
At global scope:
cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror]

but it's OK to add it to `plab.hpp`. So, I do not know whether I have understood your meaning correctly. Thank you for your patient reply. ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From sjohanss at openjdk.java.net Wed Dec 2 11:01:56 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 2 Dec 2020 11:01:56 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 10:49:55 GMT, Dongbo He wrote: >> Please add it to `g1EvacStats.cpp`, the refactoring was good. > >> Please add it to `g1EvacStats.cpp`, the refactoring was good.
> > When I add it to g1EvacStates.cpp on [Refactor the code](https://github.com/openjdk/jdk/pull/1474/commits/dd3f9b7cdca5d400c7b2296c3eee92e1c414a2bb) , like this: > diff --git a/src/hotspot/share/gc/g1/g1EvacStats.cpp b/src/hotspot/share/gc/g1/g1EvacStats.cpp > index f8851b55dda..3f0f1b76cea 100644 > --- a/src/hotspot/share/gc/g1/g1EvacStats.cpp > +++ b/src/hotspot/share/gc/g1/g1EvacStats.cpp > @@ -27,6 +27,7 @@ > #include "gc/shared/gcId.hpp" > #include "logging/log.hpp" > #include "memory/allocation.inline.hpp" > +#include "runtime/globals.hpp" > > void G1EvacStats::log_plab_allocation() { > PLABStats::log_plab_allocation(); > > I still get an error in the build: > === Output from failing command(s) repeated here === > * For target hotspot_variant-server_libjvm_objs_g1EvacStats.o: > In file included from /home/hedongbo/temp/jdk/src/hotspot/share/gc/g1/g1EvacStats.hpp:28:0, > from /home/hedongbo/temp/jdk/src/hotspot/share/gc/g1/g1EvacStats.cpp:26: > > > _desired_net_plab_sz(default_per_thread_plab_size * ParallelGCThreads), > ^~~~~~~~~~~~~~~~~ > At global scope: > cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror] > but it's OK to add it to `plab.hpp`. So, I do not know whether I have understood your meaning correctly. Thank you for your patient reply. As I said: > I think the move to use ParallelGCThreads in g1EvacStats.cpp is good... I liked that refactoring of moving the use of `ParallelGCThreads` to G1EvacStats, I just want you to also add the include there. 
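For reference, the refactoring under discussion boils down to computing the desired net PLAB size from a per-thread default scaled by `ParallelGCThreads`. Sketched here as a hypothetical free function rather than the real `G1EvacStats` constructor initializer:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: with -XX:-ResizePLAB the desired net PLAB size is
// fixed for the whole run, so it should scale with the number of parallel
// GC threads rather than use a single global default.
static size_t desired_net_plab_sz(size_t default_per_thread_plab_size,
                                  unsigned parallel_gc_threads) {
  return default_per_thread_plab_size * parallel_gc_threads;
}
```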
------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From dongbohe at openjdk.java.net Wed Dec 2 11:24:08 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Wed, 2 Dec 2020 11:24:08 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v5] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8257145 Dongbo He has updated the pull request incrementally with one additional commit since the last revision: add include to g1EvacStats.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1474/files - new: https://git.openjdk.java.net/jdk/pull/1474/files/0aa22448..17aab275 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1474.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1474/head:pull/1474 PR: https://git.openjdk.java.net/jdk/pull/1474 From sjohanss at openjdk.java.net Wed Dec 2 12:19:01 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 2 Dec 2020 12:19:01 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v5] In-Reply-To: References: Message-ID: <-BgjGGuqHSGLVTkiYkLrcFK6hgrQQY-RsTrNGpE-vi4=.2b8a0a90-ac11-4f79-a944-e99853d378b0@github.com> On Wed, 2 Dec 2020 11:24:08 GMT, Dongbo He wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8257145 > > Dongbo He has updated the pull request incrementally with one additional commit since the last revision: > > add include to g1EvacStats.cpp Looks good, thanks for fixing this. ------------- Marked as reviewed by sjohanss (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1474 From stefank at openjdk.java.net Wed Dec 2 12:31:54 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Dec 2020 12:31:54 GMT Subject: Integrated: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing In-Reply-To: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> References: <_mcGIfKtXbuDVOTisGl5s38hnMVVwAWIuaqG3mwlKj4=.8fe7adf8-cae8-4ccb-8d98-91ea7d308243@github.com> Message-ID: On Wed, 28 Oct 2020 13:49:15 GMT, Stefan Karlsson wrote: > This is an alternative version of the fix proposed in 900: > https://github.com/openjdk/jdk/pull/900 > > Erik's description: >> Today, when you crash, the GCLogPrecious::_lock is taken. This effectively limits you to only get clean crash reports if you crash or assert without holding a lock of rank tty or lower. It is arguably difficult to know what locks you are going to have when crashing. Therefore, I don't think the precious GC log should constrain possible crashing contexts in that fashion. > > As Erik mentioned in that PR, I'd like to retain the ability to easily dump the precious log when debugging. The proposed fix changes the Mutex to a Semaphore, and uses trywait to safely access the buffer. In the unlikely event that another thread is holding the lock, the hs_err printer skips printing the log. > > This also makes it possible to call precious logging from within the stack watermark processing code. I think there's a possibility that we might call the following error logging, when we fail to commit memory for a ZPage, when relocating, during stack watermark processing: > `log_error_p(gc)("Failed to commit memory (%s)", err.to_string());` This pull request has now been integrated.
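The Mutex-to-Semaphore change just described can be modeled with a non-blocking try-acquire: the normal logging path may block, but the crash (hs_err) path must never, so it tries the lock and simply skips dumping the log on contention. This sketch uses `std::atomic_flag` in place of the VM's Semaphore and is illustrative only:

```cpp
#include <atomic>
#include <cassert>

struct PreciousLogSketch {
  std::atomic_flag locked_ = ATOMIC_FLAG_INIT;  // stands in for the Semaphore
  const char* buffer_ = "";

  // Normal path: spin until the "semaphore" is acquired, then publish.
  void log(const char* msg) {
    while (locked_.test_and_set(std::memory_order_acquire)) { /* spin */ }
    buffer_ = msg;
    locked_.clear(std::memory_order_release);
  }

  // Crash path: trywait semantics -- return nullptr instead of blocking
  // when another thread holds the lock, so the hs_err printer never hangs.
  const char* try_snapshot() {
    if (locked_.test_and_set(std::memory_order_acquire)) {
      return nullptr;  // contended: skip printing the log
    }
    const char* s = buffer_;
    locked_.clear(std::memory_order_release);
    return s;
  }
};
```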
Changeset: 287b829c Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/287b829c Stats: 19 lines in 1 file changed: 12 ins; 0 del; 7 mod 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing Reviewed-by: eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/903 From github.com+71302734+amitdpawar at openjdk.java.net Wed Dec 2 14:14:59 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Wed, 2 Dec 2020 14:14:59 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: <9Ajth2BhjNfDOxM_0EmQ_MU3ymGzd7t0VToXi7afkPc=.b285e9f5-10ba-43cf-9156-c913444087bc@github.com> References: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> <9Ajth2BhjNfDOxM_0EmQ_MU3ymGzd7t0VToXi7afkPc=.b285e9f5-10ba-43cf-9156-c913444087bc@github.com> Message-ID: On Wed, 2 Dec 2020 10:04:32 GMT, Stefan Johansson wrote: >> Thanks Thomas and Stefan for reviewing and approving the changes. Will wait until csr approval. > > I did some performance runs and found that on Windows this change will not speed up pre-touching. I see some quite big regressions in some cases. So I don't think we can do this change for all platforms without doing more benchmarking. > > But since it looks good on Linux, one solution would be to make `PreTouchParallelChunkSize` a platform-dependent flag and set it to 4M for Linux and keep it at 1G for the others until we can do more investigations. > > For guidance on how to make it a platform-dependent flag you can look at how `UseLargePages` is handled. I was doubtful about the improvement regarding other platforms and thanks for testing and verifying. I will make it platform-specific as per your suggestion. On other platforms, this improvement is also not seen for smaller memory ranges, right? Similar to the SPECJbb_Summary sheet in the Excel file.
------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From sjohanss at openjdk.java.net Wed Dec 2 14:38:56 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 2 Dec 2020 14:38:56 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> <9Ajth2BhjNfDOxM_0EmQ_MU3ymGzd7t0VToXi7afkPc=.b285e9f5-10ba-43cf-9156-c913444087bc@github.com> Message-ID: On Wed, 2 Dec 2020 14:11:57 GMT, Amit Pawar wrote: >> I did some performance runs and found that on Windows this change will not speed up pre-touching. I see some quite big regressions in some cases. So I don't think we can do this change for all platforms without doing more benchmarking. >> >> But since it looks good on Linux, one solution would be to make `PreTouchParallelChunkSize` a platform-dependent flag and set it to 4M for Linux and keep it at 1G for the others until we can do more investigations. >> >> For guidance on how to make it a platform-dependent flag you can look at how `UseLargePages` is handled. > > I was doubtful about the improvement regarding other platforms and thanks for testing and verifying. I will make it platform-specific as per your suggestion. > > On other platform, this improvement is not seen for smaller or lesser memory range also right ? similar too SPECJbb_Summary sheet in Excel file. I have not done extensive measurements, I basically ran some startup benchmarks with: -XX:+AlwaysPreTouch -Xms8g -Xmx8g -XX:+AlwaysPreTouch -Xms8g -Xmx8g -XX:PreTouchParallelChunkSize=4m -XX:+AlwaysPreTouch -Xms8g -Xmx8g -XX:PreTouchParallelChunkSize=128m And on Windows going with the current default is the clear winner, while on Linux using 4M gives best results. Given that and your tests I think it is fairly safe to use 4M for Linux, but for the other OSes we need to do more measurements before changing to a different value. 
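For context, pre-touching splits the reserved range into `PreTouchParallelChunkSize`-sized chunks and has workers write one byte per page so the OS commits the memory up front; smaller chunks give more, better-balanced parallel work, which is why 4M wins on Linux. A simplified sketch, with `std::thread` standing in for the VM's GC workers (not the actual HotSpot code):

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Touch one byte per OS page so each page gets committed up front.
static void pretouch_chunk(char* start, size_t len, size_t page_size) {
  for (size_t off = 0; off < len; off += page_size) {
    start[off] = 0;
  }
}

// Split [base, base+size) into chunk_size pieces, one worker per chunk.
// chunk_size is the knob being discussed: 4M vs the old 1G default.
static void pretouch_parallel(char* base, size_t size,
                              size_t chunk_size, size_t page_size) {
  std::vector<std::thread> workers;
  for (size_t off = 0; off < size; off += chunk_size) {
    size_t len = (size - off < chunk_size) ? (size - off) : chunk_size;
    workers.emplace_back(pretouch_chunk, base + off, len, page_size);
  }
  for (std::thread& t : workers) {
    t.join();
  }
}
```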
------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From github.com+71302734+amitdpawar at openjdk.java.net Wed Dec 2 14:54:59 2020 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Wed, 2 Dec 2020 14:54:59 GMT Subject: RFR: 8254699: Suboptimal PreTouchParallelChunkSize defaults and limits [v2] In-Reply-To: References: <-IgaNSYmBVVimEHic6I4rf6eAYnjJHhygRgvVhS1OgU=.f9c3b001-aba9-4c21-8a55-262860241ff5@github.com> <9Ajth2BhjNfDOxM_0EmQ_MU3ymGzd7t0VToXi7afkPc=.b285e9f5-10ba-43cf-9156-c913444087bc@github.com> Message-ID: On Wed, 2 Dec 2020 14:36:08 GMT, Stefan Johansson wrote: >> I was doubtful about the improvement regarding other platforms and thanks for testing and verifying. I will make it platform-specific as per your suggestion. >> >> On other platform, this improvement is not seen for smaller or lesser memory range also right ? similar too SPECJbb_Summary sheet in Excel file. > > I have not done extensive measurements, I basically ran some startup benchmarks with: > -XX:+AlwaysPreTouch -Xms8g -Xmx8g > -XX:+AlwaysPreTouch -Xms8g -Xmx8g -XX:PreTouchParallelChunkSize=4m > -XX:+AlwaysPreTouch -Xms8g -Xmx8g -XX:PreTouchParallelChunkSize=128m > > And on Windows going with the current default is the clear winner, while on Linux using 4M gives best results. Given that and your tests I think it is fairly safe to use 4M for Linux, but for the other OSes we need to do more measurements before changing to a different value. OK and I will make it platform specific as suggested. 
Thanks, Amit ------------- PR: https://git.openjdk.java.net/jdk/pull/1503 From akozlov at openjdk.java.net Wed Dec 2 20:24:11 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 2 Dec 2020 20:24:11 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory has to provide MAP_JIT for mmap(NULL, PROT_NONE); the function was made aware of exec permissions. > > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect; this should make no difference compared with the old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap, which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow the OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as the codecache does not shrink (if I haven't missed anything, by the implementation and in principle). > > Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produces a working bundle.
> (adding GC group as suggested by @dholmes-ora)
>
> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227
> [2]
>
> codesign \
>   --sign - \
>   --options runtime \
>   --entitlements ents.plist \
>   --timestamp \
>   $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib
>
> [3]
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
> <plist version="1.0">
> <dict>
>   <key>com.apple.security.cs.allow-jit</key>
>   <true/>
>   <key>com.apple.security.cs.disable-library-validation</key>
>   <true/>
>   <key>com.apple.security.cs.allow-dyld-environment-variables</key>
>   <true/>
> </dict>
> </plist>

Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Separate executable_memory interface - Merge remote-tracking branch 'upstream/master' into 8234930 - Revert everything - Fix test builds (nothing except macOS still buildable) - os::reserve to take exec parameter - Bookkeeping without interface changes - Minimal working example, no uncommit - Merge remote-tracking branch 'upstream/master' into 8234930 - Revert "Use MAP_JIT for CodeCache pages" This reverts commit 114d9cffd62cab42790b65091648fe75345c4533.
- Use MAP_JIT for CodeCache pages ------------- Changes: https://git.openjdk.java.net/jdk/pull/294/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=05 Stats: 372 lines in 28 files changed: 193 ins; 41 del; 138 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Wed Dec 2 21:14:57 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 2 Dec 2020 21:14:57 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: <6iVRP-20baz0_46SouR-dj9SyspR5QvaL9iJMdeipDE=.92688b4e-ebd3-4681-8e63-a4aee752c407@github.com> <_XaA5cQEInPMn5Q5gj2y7AFCRprFQiYfI6BeUN49FhA=.9f17ae05-b37e-4f40-a83f-fd34aa812575@github.com> Message-ID: On Wed, 14 Oct 2020 10:56:25 GMT, Anton Kozlov wrote: >>> > GrowableArray maybe not the best choice here since e.g. it requires you to search twice on add. A better solution may be a specialized BST. >>> >>> I assume amount of executable mappings to be small. Depends on if exec parameter available at reserve, it is either only a single one for the CodeCache (see below) or plus several more for mappings with unknown mode (that were not committed yet) >>> >>> > IMHO too heavvy weight for a platform only change. >>> > If there are other uses for such a solution (managing memory regions, melting them together, splitting them maybe on remove) >>> > we should not support setting and clearing exec on commit but only on a per-mapping base. >>> >>> It is more simple when the whole mapping is executable or not. We don't need to split/merge on commit/uncommit then. But we need do to something when os::release_memory is called on a submapping of a mapping with unknown status. Like on AIX, uncommit is made https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2096. 
But here for macOS, I'm trying to avoid any change of behavior for non-exec mappings. >>> >>> If the exec parameter is provided for reserve (as it eventually would be), then we don't need splitting/merging at all. This is what the latest patch is about. I haven't tested that thoroughly yet, but eventually it would be possible to deduce correct exec values for os::reserve based on subsequent os::commit. If we make a step back, we have exec parameter known for reserve and commit, I also pretty sure that it is possible to deduce that for any uncommit (which was one of the initial concerns) >>> >>> Let's agree on some plan how to attack the problem? I would like to distinguish the work toward MAP_JIT and improving interface. Not sure what should come first. Are you still opposing to have exec parameter in os::reserve/commit/uncommit and obligating callers to provide consistent exec values for each, at least at this phase? >>> >>> I mean, eventually we will have a platform-dependent `handle_t` for mapping or equivalent. Like if we provide size of the whole mapping (the context) for each commit_memory on AIX, we won't need to do the bookkeeping. What if os::commit to take ReservedSpace and do something conservative when that is not provided? >> >> >> >>> > GrowableArray maybe not the best choice here since e.g. it requires you to search twice on add. A better solution may be a specialized BST. >>> >>> I assume amount of executable mappings to be small. Depends on if exec parameter available at reserve, it is either only a single one for the CodeCache (see below) or plus several more for mappings with unknown mode (that were not committed yet) >>> >>> > IMHO too heavvy weight for a platform only change. >>> > If there are other uses for such a solution (managing memory regions, melting them together, splitting them maybe on remove) >>> > we should not support setting and clearing exec on commit but only on a per-mapping base. 
>>> >>> It is more simple when the whole mapping is executable or not. We don't need to split/merge on commit/uncommit then. But we need do to something when os::release_memory is called on a submapping of a mapping with unknown status. Like on AIX, uncommit is made https://github.com/openjdk/jdk/blob/master/src/hotspot/os/aix/os_aix.cpp#L2096. But here for macOS, I'm trying to avoid any change of behavior for non-exec mappings. >>> >>> If the exec parameter is provided for reserve (as it eventually would be), then we don't need splitting/merging at all. This is what the latest patch is about. I haven't tested that thoroughly yet, but eventually it would be possible to deduce correct exec values for os::reserve based on subsequent os::commit. If we make a step back, we have exec parameter known for reserve and commit, I also pretty sure that it is possible to deduce that for any uncommit (which was one of the initial concerns) >>> >>> Let's agree on some plan how to attack the problem? I would like to distinguish the work toward MAP_JIT and improving interface. Not sure what should come first. Are you still opposing to have exec parameter in os::reserve/commit/uncommit and obligating callers to provide consistent exec values for each, at least at this phase? >>> >>> I mean, eventually we will have a platform-dependent `handle_t` for mapping or equivalent. Like if we provide size of the whole mapping (the context) for each commit_memory on AIX, we won't need to do the bookkeeping. What if os::commit to take ReservedSpace and do something conservative when that is not provided? >> >> Are there any users of executable memory which cannot live with anonymous mapping on whatever address with small pages? Does anyone need large pages or a specific wish address? >> >> If not, maybe we really should introduce a (reserve|commit|uncommit|release)_executable_memory() at least temporarily, as you suggested. 
At least that would be clear, and could provide a clear starting point for a new interface. > >> Are there any users of executable memory which cannot live with anonymous mapping on whatever address with small pages? Does anyone need large pages or a specific wish address? > > Nothing jumps out immediately. > Recently we've come across CDS problems, which also require executable permissions, but it uses file-based mapping and os::map_memory. > >> If not, maybe we really should introduce a (reserve|commit|uncommit|release)_executable_memory() at least temporarily, as you suggested. At least that would be clear, and could provide a clear starting point for a new interface. > > Then I'll start doing this. I'll create another JBS issue for the interface closer to the point when it is ready. > > Thanks! Hi, I've just pushed an update with the new executable_memory interface. Still WIP; a few notes and a description of one major problem follow. The new interface does not allow reserving executable memory at a specific address (a restriction of MAP_JIT). Before the change, such executable memory was required by * a workaround on 32bit linux-x86 https://github.com/openjdk/jdk/pull/294/files#diff-ec5e71a69afd99d4cfec5f5c657242bae9434656025694e67cae03a5e3722e84 * reading of the CDS archive on windows https://github.com/openjdk/jdk/pull/294/files#diff-c93e710cbc38c989c0ab250cd4ac04d2ab157f44ee535bb035f473c42d0557c7 In both cases, it was changed to reserve non-executable memory and mprotect, which produces the same result as before. Generally speaking, only platform-specific workarounds required a specific address for an executable mapping. The MAP_JIT implementation on top of the new interface is really tiny, so it does not need a separate commit, I think. The major problem here is that executable memory now cannot be reserved at a certain alignment, although it may be required. I think the alignment should be an extra parameter to executable_memory_reserve and not e.g.
reserve_executable_memory_aligned. So I'll work in this direction. Two functions like reserve_memory and reserve_memory_aligned look excessive. ReservedSpace for some reason tries to use the unaligned version first and when it fails (how should it know the result should be aligned?), falls back to reserve_memory_aligned. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L227. It will be more straightforward to ask for alignment from the start when it's required. I'm going to make alignment a parameter for reserve_memory as well, with a default value of "no specific alignment", and to remove reserve_memory_aligned. It will simplify the implementation of reserve_executable_memory with an alignment argument, and I hope to propose the suggested refactoring separately from this PR. Please let me know if I missed something, or am going to :) ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From Charlie.Gracie at microsoft.com Wed Dec 2 22:57:57 2020 From: Charlie.Gracie at microsoft.com (Charlie Gracie) Date: Wed, 2 Dec 2020 22:57:57 +0000 Subject: [G1GC] Evacuation failures with bursts of humongous object allocations Message-ID: <49534817-16FD-4527-AD7A-5D9B7D7AA10B@microsoft.com> Hi, Sorry for the delayed response. I applied your suggestions to my prototype and things are working well. I am ready to open a PR to help me capture and resolve further enhancements. You can find a log [1] that contains most of the extra information you were looking for. Basically, 100% of the time spent in "Evacuation Failure" is in "Remove self forwards". There are 4 cases of "To-space exhausted" in the log I uploaded. >> I believe this could be calculated at the end of a pause (young or mixed), if the next >> GC will be a mixed collect, otherwise it is 0. Do you agree? > Unfortunately, I need to disagree :) Because some survivors will get > spilled into old all the time (in a non-optimized general application).
> > However old gen allocation is already tracked per GC as a side-effect of > the existing PLAB allocation tracking mechanism. See > G1CollectedHeap::record_obj_copy_mem_stats() about these values in more > detail. This is the only thing I am not sure I have addressed properly in my current changes. Hopefully, we can discuss this further in the PR once it is opened. I will file a JBS issue for this and get the PR opened so that I can work towards a final solution. Thanks again for all of the help so far, Charlie [1] https://gist.github.com/charliegracie/16b51f9cc867f166cd5df4ebc4bee378 On 2020-11-12, 5:13 AM, "Thomas Schatzl" wrote: Hi, On 11.11.20 23:50, Charlie Gracie wrote: > Hi Thomas, > > Thanks for the detailed reply. > >> You probably are missing "... that are short-living" in your >> description. Otherwise the suggested workaround does not... work. > > Yes, that is correct; I should have said short-lived humongous object allocations. > >> Why regular objects too? > > Originally I only added it to the humongous allocation path. Occasionally, the issue > would still happen if the burst of short-lived humongous allocations finished before > the free region count dropped below my threshold. Eden regions would continue > to consume free regions and the following GC encountered To-space exhaustion > due to 0 free regions. Adding the same check to both slow allocation paths resolved > the issue and after reviewing it more I believe it needs to be there. Me too. > >> Maybe to completely obsolete G1ReservePercent for this purpose? > > Yes I think that a change like this could obsolete G1ReservePercent. I have been > testing my changes with G1ReservePercent set to 0 by default for the last few > days without any issue. G1ReservePercent is still used by adaptive IHOP for a slightly different purpose, so we can't remove/obsolete it right away. But it is good to find a better alternative for one of its uses.
> >> This looks like: "do a gc if the amount of free space after evacuating >> currently allocated regions is larger than the allocation". > > Yes that is it. > >> - for the first term, eden regions, g1 already provides more accurate(?) >> prediction of survived bytes after evac using survival rate predictors >> (see G1SurvRateGroup), one use is in >> G1Policy::predict_eden_copy_time_ms(), another in >> G1Policy::predict_bytes_to_copy. > > My initial prototype was being very conservative in its decisions to try to avoid the situation > while I continued the investigation. Improving the calculation will make this change much > better. Thanks for pointing me in the right direction! > > I am now testing a version that queries eden_surv_rate_group->accum_surv_rate_pred > using the current count of Eden regions. This is providing very accurate values. Sometimes > it is a little off because of allocation fragmentation as you explained. To combat this I am > currently multiplying the result by 1.05 to compensate for the allocation fragmentation > in the PLABs. The extra 5% could likely be replaced with some historical data based on > PLAB wasted stats. There is already a TargetPLABWastePct option that is typically kept very well by actual allocation. There is another complication that this measure does not catch: fragmentation/waste at the end of the last allocation region (for survivors). This is because regions are required to be of a particular type, i.e. a region cannot hold two different types, so you need to waste the space at the end of that last region. This only applies to survivor regions; G1 reuses the last old gen allocation region for the next gc (if useful and possible). You obviously can't do that meaningfully for survivors as they are evacuated wholesale again at the next gc. > > Is this more along the lines of what you were thinking for this calculation? Yes. See above paragraph for some more thoughts on how to refine this.
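[Editor's note: the check discussed above, "do a gc if the amount of free space after evacuating currently allocated regions is larger than the allocation", together with the 1.05 PLAB-waste multiplier, can be sketched roughly as below. The struct and all names are invented for illustration; this is not actual G1 code.]

```cpp
#include <cstddef>
#include <cassert>

// Inputs a hypothetical slow allocation path would consult before
// taking the last free regions. eden_survival_rate stands in for a
// value such as accum_surv_rate_pred for the current eden length.
struct HeapStats {
  std::size_t free_regions;
  std::size_t region_size;        // bytes per region
  double eden_survival_rate;      // predicted fraction of eden surviving
  std::size_t eden_regions;
  double plab_waste_factor;       // e.g. 1.05 for PLAB fragmentation
};

// Force a young GC instead of satisfying the allocation when the
// allocation plus the predicted evacuation copy would not fit into
// the remaining free regions (i.e. the next GC would likely see
// to-space exhaustion).
bool should_force_gc_before(const HeapStats& h, std::size_t allocation_bytes) {
  double predicted_copy = static_cast<double>(h.eden_regions) * h.region_size *
                          h.eden_survival_rate * h.plab_waste_factor;
  std::size_t free_bytes = h.free_regions * h.region_size;
  return allocation_bytes + static_cast<std::size_t>(predicted_copy) > free_bytes;
}
```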
> >> Note that G1 does track survival rate for survivor regions too, but it's >> not good in my experience - survival rate tracking assumes that objects >> within a region are of approximately the same age (the survival rate >> prediction mechanism assigns the same "age" to objects allocated in the >> same region. This is not the object age in the object headers generally >> used!), which the objects in survivor regions tend to simply not be. The >> objects in there are typically jumbled together from many different ages >> which completely violates that assumption. > > I am currently investigating iterating the survivor regions at the end of a GC > and calling G1Policy::predict_bytes_to_copy on each of them to estimate > their survival rate for the next collect. For my tests the result seem accurate > but I will trust your assessment that it can be off due to the reasons you listed. > This is definitely an improvement of my initial prototype. We can start by using that and refine from there. > >> - potential surviving objects from old region evacuations are missing >> completely in the formula. I presume in your case these were not >> interesting because (likely) this application mostly does short living >> humongous allocations and otherwise keeps a stable old gen? > > You are correct. In my case old space is very stable if you exclude the short-lived > humongous allocations. A final solution should have included this to be complete, > but if G1ReservePercent is obsoleted as part of this old regions _need_ to be > incorporated to make sure no workloads are regressed. > > I believe this could be calculated at the end of a pause (young or mixed), if the next > GC will be a mixed collect, otherwise it is 0. Do you agree? Unfortunately, I need to disagree :) Because some survivors will get spilled into old all the time (in a non-optimized general application). 
However old gen allocation is already tracked per GC as a side-effect of the existing PLAB allocation tracking mechanism. See G1CollectedHeap::record_obj_copy_mem_stats() about these values in more detail. > > I have looked into walking N regions of G1Policy::_collect_set->candidates() and > calling G1Policy::predict_bytes_to_copy to calculate the amount that may be > evacuated. I have been using G1Policy::calc_min_old_cset_length() to decide how > many regions to include in the calculation. Is this a reasonable approach? For old gen candidates this seems to be a good approach to at least get an idea about the spilling. Note that G1Policy::predict_bytes_to_copy() just returns HeapRegion::used() for old gen regions, so this will be quite conservative. Any improvements to that are welcome, but it will work for a first heuristic obviously :) One could also designate this kind of gc as an "evacuation-failure-prevention" gc, and use that to direct the old gen region selection code in that gc to limit itself to the minimum old gen cset length too for the first pass(!) now (I have not thought about implications of that - it might not be a good idea because of the expected timing of these failures usually being close to mark finish anyway and typically the first few regions in the collection set candidates are almost empty, so it would be a huge gain to clean out as much as possible; but the possibility of the incremental old gen evacuation provides a huge safety net here; just initial random thoughts without having actually put time into it) - unless we want to add some prediction, for example per region-age based, for it.
> > My initial expectation was that if the humongous objects are long lived this change > does not impact those situations. This is something that I have to think about more > to come up with a better / complete answer. > >> And, is there anything that can be done to speed up evacuation failure? >> :) Answering my rhetorical question: very likely, see the issues with >> evacuation failure collected using the gc-g1-pinned-regions labels >> lately [1], in particular JDK-8254739 [2]. > > Thanks! Added them to my reading list as I think improving evacuation failure > would be great even if I wasn't fighting this current issue :) I just saw that I wasn't specific in the CR what I was thinking about. Duh. I updated some CRs a bit, and added more with more ideas to the list of those marked with gc-g1-pinned-regions. (If nobody asks about it, I'm too lazy to write them down). Please tell me before starting working on any of those, I might have prototype code or additional thoughts about all of them. > >> So it would be interesting to see the time distribution for evacuation >> failure (gc+phases=trace) and occupancy distribution of these failures. > > I will upload some full logs with that option enabled to my GitHub or > somewhere accessible soon and point you to them. As an FYI I am keeping > the GitHub branch listed in my first email up to date with my current changes. > Okay. Thanks. > Thanks a lot for the comments and suggestions! > Charlie > Thanks, Thomas From iklam at openjdk.java.net Wed Dec 2 22:58:01 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Dec 2020 22:58:01 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 20:24:11 GMT, Anton Kozlov wrote: >> Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html >> >> On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. 
So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. >> >> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. >> >> For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). >> >> Tested: >> * local tier1 >> * jdk-submit >> * codesign[2] with hardened runtime and allow-jit but without >> allow-unsigned-executable-memory entitlements[3] produce a working bundle. >> >> (adding GC group as suggested by @dholmes-ora) >> >> >> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 >> [2] >> >> codesign \ >> --sign - \ >> --options runtime \ >> --entitlements ents.plist \ >> --timestamp \ >> $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib >> [3] >> >> >> >> >> com.apple.security.cs.allow-jit >> >> com.apple.security.cs.disable-library-validation >> >> com.apple.security.cs.allow-dyld-environment-variables >> >> >> > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: > > - Separate executable_memory interface > - Merge remote-tracking branch 'upstream/master' into 8234930 > - Revert everything > - Fix test builds (nothing except macOS still buildable) > - os::reserve to take exec parameter > - Bookkeeping without interface changes > - Minimal working example, no uncommit > - Merge remote-tracking branch 'upstream/master' into 8234930 > - Revert "Use MAP_JIT for CodeCache pages" > > This reverts commit 114d9cffd62cab42790b65091648fe75345c4533. > - Use MAP_JIT for CodeCache pages The CDS changes in filemap.cpp look reasonable to me. Today this code is used on Windows only, but we are thinking of using it for all platforms (sometime in JDK 17). Do you think `os::protect_memory(base, size, os::MEM_PROT_RWX)` will work on all platforms? ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/294 From dongbohe at openjdk.java.net Thu Dec 3 03:16:54 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Thu, 3 Dec 2020 03:16:54 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v5] In-Reply-To: <-BgjGGuqHSGLVTkiYkLrcFK6hgrQQY-RsTrNGpE-vi4=.2b8a0a90-ac11-4f79-a944-e99853d378b0@github.com> References: <-BgjGGuqHSGLVTkiYkLrcFK6hgrQQY-RsTrNGpE-vi4=.2b8a0a90-ac11-4f79-a944-e99853d378b0@github.com> Message-ID: On Wed, 2 Dec 2020 12:16:35 GMT, Stefan Johansson wrote: >> Dongbo He has updated the pull request incrementally with one additional commit since the last revision: >> >> add include to g1EvacStats.cpp > > Looks good, thanks for fixing this. Thank you for your review, kstefanj. As we saw in the test, this change will cause `./test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java` to fail. I'm working on this case and will push it here for review when the work is done. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From akozlov at openjdk.java.net Thu Dec 3 08:51:01 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Dec 2020 08:51:01 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Wed, 2 Dec 2020 22:55:32 GMT, Ioi Lam wrote: > Do you think os::protect_memory(base, size, os::MEM_PROT_RWX) will work on all platforms? It looks so. Not sure about AIX; it ends up in ::mprotect, so at least theoretically it should be fine. On Linux, and on macOS without hardening, it should also be OK. Without this patch, we don't support macOS hardened mode at all. But I have some private hacks for CDS with hardening; they reuse some of the Windows code for reading the archive content instead of mapping it, so the future looks even more convenient. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From tschatzl at openjdk.java.net Thu Dec 3 08:57:02 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 3 Dec 2020 08:57:02 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) Message-ID: Hi all, can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint.
Testing: tier1-5 with no failures Thanks, Thomas ------------- Commit messages: - Initial import, testing Changes: https://git.openjdk.java.net/jdk/pull/1590/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1590&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257509 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1590/head:pull/1590 PR: https://git.openjdk.java.net/jdk/pull/1590 From thomas.schatzl at oracle.com Thu Dec 3 09:29:56 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 3 Dec 2020 10:29:56 +0100 Subject: [G1GC] Evacuation failures with bursts of humongous object allocations In-Reply-To: <49534817-16FD-4527-AD7A-5D9B7D7AA10B@microsoft.com> References: <49534817-16FD-4527-AD7A-5D9B7D7AA10B@microsoft.com> Message-ID: Hi Charlie, On 02.12.20 23:57, Charlie Gracie wrote: > Hi, > > Sorry for the delayed response. > > I applied your suggestions to my prototype and things are working well. I am ready to > open a PR to help me capture and resolve further enhancements. You can find a log [1] Great! > that contains most of the extra information you were looking for. Basically, 100% of > the time spent in "Evacuation Failure" is in "Remove self forwards". There are 4 cases > of "To-space exhausted" in the log I uploaded. > Thanks. Some observations: - generational hypothesis works very well for this application as you already indicated. I.e. in non-failing gcs the promotion is negligible. So there is a high likelihood that the failing regions are always almost empty. - all or almost all young regions have failures, which explains the long evacuation failure handling. Unfortunately the current algorithm needs to iterate all (live and dead) objects during self-forward removal. Something like JDK-8254739 could certainly do wonders. 
Also being less conservative about reclaiming failed regions could help in subsequent gcs. >>> I believe this could be calculated at the end of a pause (young or mixed), if the next >>> GC will be a mixed collect, otherwise it is 0. Do you agree? > >> Unfortunately, I need to disagree :) Because some survivors will get >> spilled into old all the time (in a non-optimized general application). >> >> However old gen allocation is already tracked per GC as a side-effect of >> the existing PLAB allocation tracking mechanism. See >> G1CollectedHeap::record_obj_copy_mem_stats() about these values in more >> detail. > > This is the only thing I am not sure I have addressed properly in my current changes. > Hopefully, we can discuss this further in the PR once it is opened. > > I will file a JBS issue for this and get the PR opened so that I can work towards a > final solution. Okay. Thanks, Thomas From sjohanss at openjdk.java.net Thu Dec 3 09:31:55 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 3 Dec 2020 09:31:55 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 08:52:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? > > In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint. > > Testing: tier1-5 with no failures > > Thanks, > Thomas Looks good. I would prefer if you remove the blank line between the asserts (line 475) before integrating. ------------- Marked as reviewed by sjohanss (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1590 From tschatzl at openjdk.java.net Thu Dec 3 09:36:09 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 3 Dec 2020 09:36:09 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? > > In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint. > > Testing: tier1-5 with no failures > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: sjohanss review, remove newline ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1590/files - new: https://git.openjdk.java.net/jdk/pull/1590/files/2fba7cc3..93da0d17 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1590&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1590&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1590/head:pull/1590 PR: https://git.openjdk.java.net/jdk/pull/1590 From ayang at openjdk.java.net Thu Dec 3 10:15:56 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Dec 2020 10:15:56 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) [v2] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 09:36:09 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? >> >> In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint. 
>> >> Testing: tier1-5 with no failures >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > sjohanss review, remove newline I wonder if it's possible/beneficial to use `assert_at_safepoint_on_vm_thread` to cover two assertions here. ------------- Marked as reviewed by ayang (Author). PR: https://git.openjdk.java.net/jdk/pull/1590 From tschatzl at openjdk.java.net Thu Dec 3 10:31:09 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 3 Dec 2020 10:31:09 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) [v3] In-Reply-To: References: Message-ID: <6e3bEODFQoQ3KDFt91Q8CqSqXLP1FnKxurblZTtxjJE=.55170721-d083-4b97-97a0-93770e607267@github.com> > Hi all, > > can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? > > In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint. 
> > Testing: tier1-5 with no failures > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1590/files - new: https://git.openjdk.java.net/jdk/pull/1590/files/93da0d17..260acbc1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1590&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1590&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1590/head:pull/1590 PR: https://git.openjdk.java.net/jdk/pull/1590 From pliden at openjdk.java.net Thu Dec 3 11:14:56 2020 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 3 Dec 2020 11:14:56 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Tue, 1 Dec 2020 12:33:31 GMT, Per Liden wrote: >>> Just a friendly ping. Still looking for reviewers for this fix. >> >> Until we resolve the discussion in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987), I don't think your suggested fix should be applied since it could be viewed as a workaround to a debug agent issue (not shutting down GC during `VM.suspendAll`) or as something that needs to be clarified in the JDI and JDWP specs (checking for `ObjectReference.disableCollection` failures, even when under `VM.suspendAll`, and retrying the allocation). I'd like to see the discussion resolved and follow-on bugs files. > > Sorry, I had missed your latest reply in the JDK-8255987. Let's continue the discussion there. As a result of the discussion in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987), I'm withdrawing this PR and will open a new PR with the alternative solution discussed in the bug report. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1348
From pliden at openjdk.java.net Thu Dec 3 11:14:57 2020 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 3 Dec 2020 11:14:57 GMT Subject: Withdrawn: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <8AXKoU5QZN1F-ZeqclS573nfppjfOw48k3VlUR0qVNU=.8832b6fc-44d0-49bc-856f-602f812bb0d4@github.com> On Fri, 20 Nov 2020 13:23:28 GMT, Per Liden wrote: > A number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests don't take this into account, and can hence get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is needed (but also note that the object could have been collected by the time we call `disableCollection()`). > > In addition, `SDEDebuggee::executeTestMethods()` creates class loaders, which shortly afterwards die (and potentially get collected). This creates problems on the debugger side, since code locations in this (now potentially unloaded class/method) get invalidated. We must ensure that these class loaders stay alive to avoid these problems. > > Normally, these problems are fairly hard to provoke, since you have to be unlucky and get the timing of a GC just right. However, it's fairly easy to provoke by forcing GC cycles to happen all the time (e.g. using ZGC with -XX:ZCollectionInterval=0.01) and/or inserting `Thread.sleep()` calls right after calls to `newInstance()`. > > This patch fixes all instances of this problem that I managed to find. > > Testing: All `vmTestbase/nsk/jdi/` tests now pass, even when using the above described measures to try to provoke the problem.
This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/1348
From jiefu at openjdk.java.net Thu Dec 3 11:15:15 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 3 Dec 2020 11:15:15 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: References: Message-ID: > Hi all, > > Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. > For example, this assert [1] fired on our testing boxes. > > It can be reproduced by the following two steps on Linux-64: > 1) ulimit -v 8388608 > 2) java -XX:MinHeapSize=5g -version > The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. > > One more important fact is that this bug can be more common on Linux-32 systems. > Since the virtual memory is limited to 3800M [3] on Linux-32, it can always be reproduced when MinHeapSize > 1900M. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. > Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge branch 'master' into JDK-8257230 - Refinement & jtreg test - Merge branch 'master' into JDK-8257230 - Merge branch 'master' into JDK-8257230 - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1492/files - new: https://git.openjdk.java.net/jdk/pull/1492/files/0389bc4d..92208d48 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=01-02 Stats: 10006 lines in 347 files changed: 7909 ins; 1033 del; 1064 mod Patch: https://git.openjdk.java.net/jdk/pull/1492.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1492/head:pull/1492 PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Thu Dec 3 12:35:00 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 3 Dec 2020 12:35:00 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> References: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> Message-ID: On Mon, 30 Nov 2020 13:42:56 GMT, Thomas Schatzl wrote: >> Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8257230 >> - Refinement & jtreg test >> - Merge branch 'master' into JDK-8257230 >> - Merge branch 'master' into JDK-8257230 >> - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes > > I think the change is good, but please add a test for this. > > E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended. Hi @tschatzl and @kstefanj, could you help to review this change? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492
From tschatzl at openjdk.java.net Thu Dec 3 12:43:56 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 3 Dec 2020 12:43:56 GMT Subject: RFR: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) [v2] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 10:13:12 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> sjohanss review, remove newline > > I wonder if it's possible/beneficial to use `assert_at_safepoint_on_vm_thread` to cover two assertions here. Changed to use `assert_at_safepoint_on_vm_thread`. Thanks. Good catch! ------------- PR: https://git.openjdk.java.net/jdk/pull/1590
From pliden at openjdk.java.net Thu Dec 3 13:19:00 2020 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 3 Dec 2020 13:19:00 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException Message-ID: This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`.
However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > Going back to the spec, ObjectReference.disableCollection() says: > > "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" > > and > > "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." > > But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: > > "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." > > No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`.
That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. Testing: - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GCs, both in mach5 and locally.
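For illustration, the suspend/resume pinning described above can be modeled with standard C++ smart pointers (a toy sketch with invented names, not the actual libjdwp changes; `weak_ptr` plays the role of `JNIGlobalWeakRef`, `shared_ptr` the role of `JNIGlobalRef`):

```cpp
#include <cassert>
#include <map>
#include <memory>

// Toy model of the proposed VM.suspend() behaviour. Every object the
// debugger has seen is tracked through a weak handle (the
// JNIGlobalWeakRef); suspending pins each still-live object with an
// additional strong handle (JNIGlobalRef); resuming drops the pins so
// objects become collectable again.
struct Obj { int value = 0; };

class ObjectTable {
    std::map<long, std::weak_ptr<Obj>> weak_;      // always present
    std::map<long, std::shared_ptr<Obj>> pinned_;  // only while suspended
    long next_id_ = 1;
public:
    long track(const std::shared_ptr<Obj>& o) { weak_[next_id_] = o; return next_id_++; }

    void pin_all() {                               // VirtualMachine.suspend()
        for (auto& e : weak_)
            if (auto strong = e.second.lock()) pinned_[e.first] = strong;
    }
    void unpin_all() { pinned_.clear(); }          // VirtualMachine.resume()

    bool live(long id) const {
        auto it = weak_.find(id);
        return it != weak_.end() && !it->second.expired();
    }
};
```

Note that, just as described for the real patch, pinning does not stop the collector itself; it only guarantees that every object the debugger can still name stays reachable until resume.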
------------- Commit messages: - 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException Changes: https://git.openjdk.java.net/jdk/pull/1595/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255987 Stats: 161 lines in 8 files changed: 132 ins; 0 del; 29 mod Patch: https://git.openjdk.java.net/jdk/pull/1595.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1595/head:pull/1595 PR: https://git.openjdk.java.net/jdk/pull/1595 From sjohanss at openjdk.java.net Thu Dec 3 13:50:00 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 3 Dec 2020 13:50:00 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: References: Message-ID: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> On Thu, 3 Dec 2020 11:15:15 GMT, Jie Fu wrote: >> Hi all, >> >> Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. >> For example, this assert [1] fired on our testing boxes. >> >> It can be reproduced by the following two steps on Linux-64: >> 1) ulimit -v 8388608 >> 2) java -XX:MinHeapSize=5g -version >> The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. >> >> One more important fact is that this bug can be more common on Linux-32 systems. >> Since the virtual memory is limited to 3800M [3] on Linux-32, it can be always reproduced when MinHeapSize > 1900M. >> >> Testing: >> - tier1 ~ tier3 on Linux/x64 >> >> Thanks. 
>> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 >> [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 > > Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into JDK-8257230 > - Refinement & jtreg test > - Merge branch 'master' into JDK-8257230 > - Merge branch 'master' into JDK-8257230 > - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes Took a closer look at the test now, some comment below. test/hotspot/jtreg/gc/ergonomics/TestMinHeapSize.java line 33: > 31: * > 32: * @comment Not run on AIX as it does not support ulimit -v > 33: * @requires os.family != "aix" I would change this to only be run on Linux: * @requires os.family == "linux" test/hotspot/jtreg/gc/ergonomics/TestMinHeapSize.java line 46: > 44: String cmd = ProcessTools.getCommandLine(ProcessTools.createTestJvm( > 45: "-XX:MinHeapSize=" + "260m", "-version")); > 46: cmd = escapeCmd(cmd); If we change to only run on Linux, this is not needed. test/hotspot/jtreg/gc/ergonomics/TestMinHeapSize.java line 45: > 43: public static void main(String[] args) throws Throwable { > 44: String cmd = ProcessTools.getCommandLine(ProcessTools.createTestJvm( > 45: "-XX:MinHeapSize=" + "260m", "-version")); `createTestJvm()` will pick up additional options passed by the test runner and there might be conflicting options. So I would suggest not picking up any additional options. 
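As background for the ulimit-based setup this test depends on, here is a small Linux-only sketch (invented names, not the jtreg test itself) of what "ulimit -v" actually does: it caps RLIMIT_AS, the virtual address space of the shell and every child it spawns, so a JVM launched under the cap cannot even reserve a MinHeapSize-sized heap:

```cpp
#include <sys/resource.h>
#include <sys/mman.h>
#include <cassert>

// Cap our own address space and show that a large reservation then fails,
// which is the situation the ergonomics code has to handle gracefully.
// (Illustration only; Linux-specific, names invented.)
bool big_reservation_fails_under_cap() {
    struct rlimit rl;
    rl.rlim_cur = rl.rlim_max = 512UL << 20;        // ~ "ulimit -v 524288"
    if (setrlimit(RLIMIT_AS, &rl) != 0) return false;
    // A 1 GiB reservation (think -XX:MinHeapSize=1g) must now fail,
    // even though the pages are never committed or touched.
    void* p = mmap(nullptr, 1UL << 30, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p != MAP_FAILED) { munmap(p, 1UL << 30); return false; }
    return true;
}
```

This is also why applying the limit in a separate shell (as alloc001-style tests do, prepending "ulimit -v" to the java command line) confines the cap to the child JVM under test.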
test/hotspot/jtreg/gc/ergonomics/TestMinHeapSize.java line 59: > 57: > 58: oa.shouldNotContain("hs_err") > 59: .shouldNotContain("Internal Error"); Should it also check that the exit status is ok, or can't we expect that for sure? ------------- Changes requested by sjohanss (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1492
From zgu at openjdk.java.net Thu Dec 3 15:11:03 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 15:11:03 GMT Subject: RFR: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false Message-ID: Since Shenandoah GC safepoints are scheduled by the control thread, a query coming from the control thread itself should return false. is_at_shenandoah_safepoint() is still not reliable, even after JDK-8253778; we may consider scratching it. - [x] hotspot_gc_shenandoah x86_64 and x86_32 ------------- Commit messages: - JDK-8257641 Changes: https://git.openjdk.java.net/jdk/pull/1600/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1600&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257641 Stats: 11 lines in 2 files changed: 9 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1600.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1600/head:pull/1600 PR: https://git.openjdk.java.net/jdk/pull/1600
From stuefe at openjdk.java.net Thu Dec 3 15:22:57 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 3 Dec 2020 15:22:57 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 08:48:13 GMT, Anton Kozlov wrote: >> The CDS changes in filemap.cpp look reasonable to me. Today this code is used on Windows only, but we are thinking of using it for all platforms (sometime in JDK 17). Do you think `os::protect_memory(base, size, os::MEM_PROT_RWX)` will work on all platforms?
> >> Do you think os::protect_memory(base, size, os::MEM_PROT_RWX) will work on all platforms? > > It looks so. Not sure about AIX, it ends up with ::mprotect, at least theoretically should be fine. On Linux and macOS without hardening it should be also OK. Without this patch, we don't support macOS hardened mode at all. But I have some private hacks for CDS with hardening, they reuse some of windows code for reading the archive content instead of mapping, so the future looks even more convenient. Hi Anton, Unfortunately I am not sure anymore that a separate API for reserving code is practical. See https://github.com/openjdk/jdk/pull/1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. Large paged memory gets allocated via os::reserve_memory_special(). Today we already split the API space into two groups: os::reserve_memory() and friends, and os::(reserve|release)_memory_special(). Adding an "executable" API group to that would multiply the number of APIs by two. I am afraid we are stuck with the exec flag on reserve and commit. If you are interested, there was a lively discussion under https://github.com/openjdk/jdk/pull/1161 (https://bugs.openjdk.java.net/browse/JDK-8243315). Among other things it was discussed whether we should get rid of multi-page regions (mixing various page sizes). See Linux::reserve_memory_special_hugetlb_mixed. This would simplify coding. Since I feel bad now for causing you work, I give up any opposition to extending the APIs with the exec parameter. So I had a closer look at your original change again: https://github.com/openjdk/jdk/pull/294/commits/114d9cffd62cab42790b65091648fe75345c4533 I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. I do not know how MacOS memory overcommit works in detail. 
But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important delaying memory commit really is on MacOS. If it's not important, then we could just do this: reserve_memory : - not executable: mmap MAP_NORESERVE, PROT_NONE - executable: mmap MAP_JIT *without* MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so it's committed and accessible right away) commit_memory - not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE - executable: (return, nothing to do) uncommit_memory - not executable: mmap MAP_NORESERVE, PROT_NONE - executable: (return, nothing to do, since you indicate that this memory does not get returned to the OS immediately) Furthermore, about uncommit: I wonder whether madvise(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. > Two functions like reserve_memory and reserve_memory_aligned look excessive. ReservedSpace for some reason tries to use the unaligned version first and when it fails (how should it know the result should be aligned?), falls back to reserve_memory_aligned. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L227. It will be more straightforward to ask for alignment from the start when it's required. I'm going to make alignment a parameter for reserve_memory as well, with the default value to be "no specific alignment", and to remove reserve_memory_aligned. It will simplify the implementation of reserve_executable_memory with alignment argument, and I hope to propose the suggested refactoring separately from this PR.
> This makes sense, but is outside the scope of this RFE. In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under https://github.com/openjdk/jdk/pull/1161). Page size is irrelevant for reserve_memory - which reserves by default just with os::vm_page_size() - but for reserve_memory_special we should specify both. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/294
From shade at openjdk.java.net Thu Dec 3 15:25:57 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 3 Dec 2020 15:25:57 GMT Subject: RFR: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false In-Reply-To: References: Message-ID: <_-Jn06hZ_PDSEomM9BbFJemMght0a8AlBGI6njqhCmA=.0c2781a3-ffb0-4d4f-b348-a5d160aa31eb@github.com> On Thu, 3 Dec 2020 15:05:33 GMT, Zhengyu Gu wrote: > Since Shenandoah GC safepoints are scheduled by control thread, so that, if querying comes from control thread, the answer should be false. > > is_at_shenandoah_safepoint() is still not reliable, even after JDK-8253778, we may consider to scratch it. > > - [x] hotspot_gc_shenandoah x86_64 and x86_32 This looks good, consider fixing a few nits below. src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 458: > 456: public: > 457: ShenandoahControlThread* control_thread() const { return _control_thread; } > 458: Maybe it is easier to `friend` the class that wants it, instead of exposing the control thread for everyone? src/hotspot/share/gc/shenandoah/shenandoahUtils.hpp line 153: > 151: // Shenandoah GC specific safepoints are scheduled by control thread, > 152: // so that, querying from control thread can not happen during those > 153: // safepoints. Consider this wording: // Shenandoah GC specific safepoints are scheduled by the control thread. // So if we enter here from the control thread, then we are definitely not // at a Shenandoah safepoint, but at something else.
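For what it's worth, the intended behavior can be sketched as a tiny standalone model (simplified, invented names; not the actual HotSpot sources):

```cpp
#include <cassert>

// Standalone model of the rule under review: Shenandoah safepoints are
// scheduled *by* the control thread, so a query made from the control
// thread itself can never observe a Shenandoah safepoint - it can only
// be at some other safepoint, or at none at all.
enum ThreadKind { JAVA_THREAD, VM_THREAD, SHENANDOAH_CONTROL_THREAD };

struct VMState {
    bool at_safepoint = false;   // some VM safepoint is in progress
    bool shenandoah_op = false;  // ... and it executes a Shenandoah GC op
};

bool is_at_shenandoah_safepoint(const VMState& vm, ThreadKind current) {
    // If we enter here from the control thread, we are definitely not at
    // a Shenandoah safepoint, but at something else.
    if (current == SHENANDOAH_CONTROL_THREAD) return false;
    return vm.at_safepoint && vm.shenandoah_op;
}
```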
------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1600 From zgu at openjdk.java.net Thu Dec 3 15:43:17 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 15:43:17 GMT Subject: RFR: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false [v2] In-Reply-To: <_-Jn06hZ_PDSEomM9BbFJemMght0a8AlBGI6njqhCmA=.0c2781a3-ffb0-4d4f-b348-a5d160aa31eb@github.com> References: <_-Jn06hZ_PDSEomM9BbFJemMght0a8AlBGI6njqhCmA=.0c2781a3-ffb0-4d4f-b348-a5d160aa31eb@github.com> Message-ID: <-x-Yed3UzC5izhFcls6RWg0ly7uXkvdXge50ULFXEWM=.a79de644-ec88-48ba-9f70-dab84fefc091@github.com> On Thu, 3 Dec 2020 15:22:33 GMT, Aleksey Shipilev wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Update > > src/hotspot/share/gc/shenandoah/shenandoahUtils.hpp line 153: > >> 151: // Shenandoah GC specific safepoints are scheduled by control thread, >> 152: // so that, querying from control thread can not happen during those >> 153: // safepoints. > > Consider this wording: > > // Shenandoah GC specific safepoints are scheduled by control thread. > // So if we are enter here from control thread, then we are definitely not > // at Shenandoah safepoint, but at something else. Updated accordingly. Thanks, Aleksey. ------------- PR: https://git.openjdk.java.net/jdk/pull/1600 From zgu at openjdk.java.net Thu Dec 3 15:43:16 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 15:43:16 GMT Subject: RFR: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false [v2] In-Reply-To: References: Message-ID: <6it_g3FY8o8oe5P7Ldkpdj0s_uSOp5gMni0nykZ9NJE=.63766ae1-c142-415e-8710-25d8b55f7ae2@github.com> > Since Shenandoah GC safepoints are scheduled by control thread, so that, if querying comes from control thread, the answer should be false. 
> > is_at_shenandoah_safepoint() is still not reliable, even after JDK-8253778, we may consider to scratch it. > > - [x] hotspot_gc_shenandoah x86_64 and x86_32 Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Update ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1600/files - new: https://git.openjdk.java.net/jdk/pull/1600/files/f87cc538..c394ceea Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1600&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1600&range=00-01 Stats: 8 lines in 2 files changed: 2 ins; 3 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1600.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1600/head:pull/1600 PR: https://git.openjdk.java.net/jdk/pull/1600 From shade at openjdk.java.net Thu Dec 3 15:50:59 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 3 Dec 2020 15:50:59 GMT Subject: RFR: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false [v2] In-Reply-To: <6it_g3FY8o8oe5P7Ldkpdj0s_uSOp5gMni0nykZ9NJE=.63766ae1-c142-415e-8710-25d8b55f7ae2@github.com> References: <6it_g3FY8o8oe5P7Ldkpdj0s_uSOp5gMni0nykZ9NJE=.63766ae1-c142-415e-8710-25d8b55f7ae2@github.com> Message-ID: On Thu, 3 Dec 2020 15:43:16 GMT, Zhengyu Gu wrote: >> Since Shenandoah GC safepoints are scheduled by control thread, so that, if querying comes from control thread, the answer should be false. >> >> is_at_shenandoah_safepoint() is still not reliable, even after JDK-8253778, we may consider to scratch it. >> >> - [x] hotspot_gc_shenandoah x86_64 and x86_32 > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Update Looks good. ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1600 From akozlov at openjdk.java.net Thu Dec 3 17:23:55 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Dec 2020 17:23:55 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 15:19:51 GMT, Thomas Stuefe wrote: >>> Do you think os::protect_memory(base, size, os::MEM_PROT_RWX) will work on all platforms? >> >> It looks so. Not sure about AIX, it ends up with ::mprotect, at least theoretically should be fine. On Linux and macOS without hardening it should be also OK. Without this patch, we don't support macOS hardened mode at all. But I have some private hacks for CDS with hardening, they reuse some of windows code for reading the archive content instead of mapping, so the future looks even more convenient. > > Hi Anton, > > Unfortunately I am not sure anymore that a separate API for reserving code is practical. See https://github.com/openjdk/jdk/pull/1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. Large paged memory gets allocated via os::reserve_memory_special(). > > Today we already split the API space into two groups: os::reserve_memory() and friends, and os::(reserve|release)_memory_special(). Adding an "executable" API group to that would multiply the number of APIs by two. I am afraid we are stuck with the exec flag on reserve and commit. > > If you are interested, there was a lively discussion under https://github.com/openjdk/jdk/pull/1161 (https://bugs.openjdk.java.net/browse/JDK-8243315). Among other things it was discussed whether we should get rid of multi-page regions (mixing various page sizes). See Linux::reserve_memory_special_hugetlb_mixed. This would simplify coding. > > Since I feel bad now for causing you work, I give up any opposition to extending the APIs with the exec parameter. 
So I had a closer look at your original change again: > > https://github.com/openjdk/jdk/pull/294/commits/114d9cffd62cab42790b65091648fe75345c4533 > > I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. I do not know how MacOS memory overcommit works in detail. But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important on MacOS delaying memory commit really is. If its not important, then we could just do this: > > reserve_memory : > - not executable: mmap MAP_NORESERVE, PROT_NONE > - executable: mmap MAP_JIT *without* MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) > > commit_memory > - not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE > - executable: (return, nothing to do) > > uncommit_memory > - not executable: mmap MAP_NORESERVE, PROT_NONE > - executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) > > Furthermore, about uncommit: I wonder whether madvice(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. > >> Two functions like reserve_memory and reserve_memory_aligned look excessive. ReservedSpace for some reason tries to use the unaligned version first and when it fails (how should it know the result should be aligned?), fallbacks to reserve_memory_alignment. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L227. 
It will be more straightforward to ask for alignment from the start when it's required. I'm going to make alignment a parameter for reserve_memory as well, with the default value to be "no specific alignment", and to remove reserve_memory_aligned. It will simplify the implementation of reserve_executable_memory with alignment argument, and I hope to propose the suggested refactoring separately from this PR. >> > > This makes sense, but is outside the scope of this RFE. > > In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under https://github.com/openjdk/jdk/pull/1161). But page size is irrelevant for reserve_memory - which reserves per default just with os::vm_page_size() - but for reserve_memory_special we should specify both. > > Cheers, Thomas > Unfortunately I am not sure anymore that a separate API for reserving code is practical. See #1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. Large paged memory gets allocated via os::reserve_memory_special(). I think os::reserve_memory_special is substantially different, it does not allow commit/uncommit. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L170 basically defines it this way. I'm mostly concerned with interface for executable memory with commit. > Today we already split the API space into two groups: os::reserve_memory() and friends, and os::(reserve|release)_memory_special(). Adding an "executable" API group to that would multiply the number of APIs by two. I am afraid we are stuck with the exec flag on reserve and commit. It adds another API. And it's interesting that the only user of executable_memory are ReservedSpace and VirtualSpace. Removing executable argument from the rest of cases seems beneficial. I find it in somewhat more strict comparing with a boolean flag. But it's not really required for MAP_JIT. 
From an interface implementation point of view, it's convenient to have a single interface with as many parameters as possible, even excessive and unused. It will enable tricky cases to be handled inside the implementation. The executable_memory API is an artificial separation in this case. It will be necessary if some combinations of parameters are impossible to implement, but it's not our case, so we can live without it. > I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. I'm not sure what the aim of the simplification below is. Now access to uncommitted memory will cause a trap, just like on other platforms. > I do not know how MacOS memory overcommit works in detail. But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important on MacOS delaying memory commit really is. If its not important, then we could just do this: > > reserve_memory : > - not executable: mmap MAP_NORESERVE, PROT_NONE > - executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) > > commit_memory > - not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE > - executable: (return, nothing to do) > > uncommit_memory > - not executable: mmap MAP_NORESERVE, PROT_NONE > - executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) > > Furthermore, about uncommit: I wonder whether madvice(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all.
madvise(FREE) is not sufficient unfortunately. For executable memory, it's the best we can do. But we should not use it for regular memory. > In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under #1161). But page size is irrelevant for reserve_memory - which reserves per default just with os::vm_page_size() - but for reserve_memory_special we should specify both. If we talk about reverting to 114d9cf, there is no change beyond an extra boolean argument, right? ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From zgu at openjdk.java.net Thu Dec 3 17:56:06 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 17:56:06 GMT Subject: RFR: 8257701: Shenandoah: objArrayKlass metadata is not marked during chunked array processing Message-ID: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> Usually, marking code calls Klass::oop_oop_iterate(), where it marks object klass metadata. Shenandoah introduced chunked array processing a while ago to break up marking of a large array into chunks, then call oop_iterate_range() to mark each individual chunk. Unfortunately, oop_iterate_range() does not iterate over object klass metadata, so we end up missing the mark of object array klass metadata. Thanks for @lmao reporting the bug. 
- [x] hotspot_gc_shenandoah ------------- Commit messages: - JDK-8257701 Changes: https://git.openjdk.java.net/jdk/pull/1602/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1602&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257701 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1602.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1602/head:pull/1602 PR: https://git.openjdk.java.net/jdk/pull/1602 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 3 18:05:59 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 3 Dec 2020 18:05:59 GMT Subject: RFR: 8256155: 2M large pages for code when LargePageSizeInBytes is set to 1G for heap [v3] In-Reply-To: References: <-xtX9qSJHuD-qfp52XPToKhkl1HypRmNFHCJaupaync=.99285cd9-69a6-42cd-84b4-3c87fefc2cc5@github.com> <8f-BJdFip5yf0Rv4uw-qcXVk2uM3Lb6Hrq9VPR6UzF4=.04966477-8834-4fb9-aa77-8da86f104176@github.com> Message-ID: On Sun, 29 Nov 2020 08:17:09 GMT, Thomas Stuefe wrote: >> I honestly don't even know why we have UseSHM. Seems redundant, and since it uses SystemV shared memory which has a different semantics from mmap, it is subtly broken in a number of places (eg https://bugs.openjdk.java.net/browse/JDK-8257040 or https://bugs.openjdk.java.net/browse/JDK-8257041). > > One thing I stumbled upon while looking at this code is why the CodeHeap always wants to have at least 8 pages covering its range: > > // If large page support is enabled, align code heaps according to large > // page size to make sure that code cache is covered by large pages. > const size_t alignment = MAX2(page_size(false, 8), (size_t) os::vm_allocation_granularity()); > > which means that for a wish pagesize of 1G, the code cache would have to cover at least 8G. I am not even sure this is possible, isn't it limited to 4G? > > Anyway, they don't uncommit. 
And the comment in codecache.cpp indicates this is to be able to step-wise commit, but with huge pages the space is committed right from the start anyway. So I do not see what good these 8 pages do. If we allowed the CodeCache to use just one page, it could be 1G in size and use a single 1G page. > > Note that there are similar min_page_size requests in GC, but I did not look closer into them. > > Also, this does not take away the usefulness of this proposal. Working on addressing comments in the code on this PR. Should have a tested change pushed later today and replies to comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From stuefe at openjdk.java.net Thu Dec 3 18:13:01 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 3 Dec 2020 18:13:01 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 17:21:38 GMT, Anton Kozlov wrote: > > Unfortunately I am not sure anymore that a separate API for reserving code is practical. See #1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. Large paged memory gets allocated via os::reserve_memory_special(). > > I think os::reserve_memory_special is substantially different, it does not allow commit/uncommit. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L170 basically defines it this way. I'm mostly concerned with interface for executable memory with commit. That was not my point. So you introduce os::reserve_executable_memory() - a new API which allocates small paged executable memory. If you want large paged executable memory you still need to call os::reserve_memory_special, which still needs to retain its executable parameter. So this new API does not cover all use cases for executable. Unless you also add a new os::reserve_executable_memory_special(). 
This is what I meant with doubling the API numbers. > And it's interesting that the only user of executable_memory are ReservedSpace and VirtualSpace. Removing executable argument from the rest of cases seems beneficial. I find it in somewhat more strict comparing with a boolean flag. But it's not really required for MAP_JIT. I'd like to keep the os::xxx APIs indepent from their use cases in ReservedSpace. In other words, they should stay consistent in themselves. > > From interface implementation point of view, it's convenient to have a single interface with as many parameters as possible, even excessive and unused. It will enable tricky cases handling inside the implementation. executable_memory API is artificial separation in this case. It will be necessary if some combination of parameters are impossible to implement, but it's not our case, so we can live without it. > > > I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. > > I'm not sure, what is the aim of the simplification below? To remove the coding depending on executable-ness from commit and uncommit. > Now access to uncommitted memory will cause a trap, just like on other platforms. Sorry, you lost me there. Where would I get a trap? My point was, for executable=true: - on reserve, commit executable memory right away instead of on-demand committing later - on commit and uncommit, just do nothing On commit, the OS makes sure the memory is underlayed with swap space. The memory also counts toward the commit charge of the process. What effects this has in practice highly depends on the OS. Hence my question. For the code heap, it may not matter that much. And ignoring uncommit for executable memory was based on your observation that it did not show observable effects anyway for you, and that code heap does not shrink. > > > I do not know how MacOS memory overcommit works in detail. 
But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important on MacOS delaying memory commit really is. If it's not important, then we could just do this: > > reserve_memory : > > > > * not executable: mmap MAP_NORESERVE, PROT_NONE > > * executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so it's committed and accessible right away) > > > > commit_memory > > > > * not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE > > * executable: (return, nothing to do) > > > > uncommit_memory > > > > * not executable: mmap MAP_NORESERVE, PROT_NONE > > * executable: (return, nothing to do, since you indicate that this memory does not get returned to the OS immediately) > > > Furthermore, about uncommit: I wonder whether madvise(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. > > madvise(FREE) is not sufficient unfortunately. For executable memory, it's the best we can do. But we should not use it for regular memory. How so? What is the difference? > > > In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under #1161). But page size is irrelevant for reserve_memory - which reserves per default just with os::vm_page_size() - but for reserve_memory_special we should specify both. > > If we talk about reverting to [114d9cf](https://github.com/openjdk/jdk/commit/114d9cffd62cab42790b65091648fe75345c4533), there is no change beyond an extra boolean argument, right? Yes, I think so. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From shade at openjdk.java.net Thu Dec 3 18:17:59 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 3 Dec 2020 18:17:59 GMT Subject: RFR: 8257701: Shenandoah: objArrayKlass metadata is not marked with chunked arrays In-Reply-To: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> References: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> Message-ID: <0v12JqCXy5t2zHtAy5Ujdeo3PG9JouCM380M9HCpIAo=.8f921b2b-abc5-41f9-9563-98883e1bfb72@github.com> On Thu, 3 Dec 2020 17:51:05 GMT, Zhengyu Gu wrote: > Usually, marking code calls Klass::oop_oop_iterate(), where it marks object klass metadata. > > Shenandoah introduced chunked array processing a while ago to breakup marking a large array into chunks, then call oop_iterate_range() to mark individual chunk. Unfortunately, oop_iterate_range() does not iterate over object klass metadata, so we end up missing the mark of object array klass metadata. > > Thanks for @lmao (Liang Mao) reporting the bug. > > - [x] hotspot_gc_shenandoah Shenandoah change looks good. I wonder if G1 has the same bug! I think it does... Please submit a "potential bug" for it? ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1602 From zgu at openjdk.java.net Thu Dec 3 18:45:57 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 18:45:57 GMT Subject: Integrated: 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 15:05:33 GMT, Zhengyu Gu wrote: > Since Shenandoah GC safepoints are scheduled by control thread, so that, if querying comes from control thread, the answer should be false. > > is_at_shenandoah_safepoint() is still not reliable, even after JDK-8253778, we may consider to scratch it. 
> > - [x] hotspot_gc_shenandoah x86_64 and x86_32 This pull request has now been integrated. Changeset: e29ee5b8 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/e29ee5b8 Stats: 8 lines in 2 files changed: 7 ins; 0 del; 1 mod 8257641: Shenandoah: Query is_at_shenandoah_safepoint() from control thread should return false Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/1600 From zgu at openjdk.java.net Thu Dec 3 19:03:57 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 19:03:57 GMT Subject: RFR: 8257701: Shenandoah: objArrayKlass metadata is not marked with chunked arrays In-Reply-To: <0v12JqCXy5t2zHtAy5Ujdeo3PG9JouCM380M9HCpIAo=.8f921b2b-abc5-41f9-9563-98883e1bfb72@github.com> References: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> <0v12JqCXy5t2zHtAy5Ujdeo3PG9JouCM380M9HCpIAo=.8f921b2b-abc5-41f9-9563-98883e1bfb72@github.com> Message-ID: On Thu, 3 Dec 2020 18:14:56 GMT, Aleksey Shipilev wrote: > Shenandoah change looks good. I wonder if G1 has the same bug! I think it does... Please submit a "potential bug" for it? G1 does things a little different, for each array slice (chunk), it calls obj->oop_iterate(closure, MemRegion), in turn, maps to ObjArrayKlass::oop_oop_iterate_bounded(obj, closure, MemRegion), which does walk over metadata. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1602 From zgu at openjdk.java.net Thu Dec 3 20:01:57 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 20:01:57 GMT Subject: Integrated: 8257701: Shenandoah: objArrayKlass metadata is not marked with chunked arrays In-Reply-To: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> References: <63v3wpkKQgK4pqL14kUHqDfQ1-5SYOGG5Th5Ov17XZQ=.0615b1e6-5858-4601-a048-293686aaef42@github.com> Message-ID: On Thu, 3 Dec 2020 17:51:05 GMT, Zhengyu Gu wrote: > Usually, marking code calls Klass::oop_oop_iterate(), where it marks object klass metadata. > > Shenandoah introduced chunked array processing a while ago to breakup marking a large array into chunks, then call oop_iterate_range() to mark individual chunk. Unfortunately, oop_iterate_range() does not iterate over object klass metadata, so we end up missing the mark of object array klass metadata. > > Thanks for @lmao (Liang Mao) reporting the bug. > > - [x] hotspot_gc_shenandoah This pull request has now been integrated. Changeset: 7c7facc2 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/7c7facc2 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod 8257701: Shenandoah: objArrayKlass metadata is not marked with chunked arrays Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk/pull/1602 From akozlov at openjdk.java.net Thu Dec 3 20:28:55 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 3 Dec 2020 20:28:55 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 18:10:18 GMT, Thomas Stuefe wrote: >>> Unfortunately I am not sure anymore that a separate API for reserving code is practical. See #1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. 
Large paged memory gets allocated via os::reserve_memory_special(). >> >> I think os::reserve_memory_special is substantially different, it does not allow commit/uncommit. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L170 basically defines it this way. I'm mostly concerned with interface for executable memory with commit. >> >>> Today we already split the API space into two groups: os::reserve_memory() and friends, and os::(reserve|release)_memory_special(). Adding an "executable" API group to that would multiply the number of APIs by two. I am afraid we are stuck with the exec flag on reserve and commit. >> >> It adds another API. And it's interesting that the only user of executable_memory are ReservedSpace and VirtualSpace. Removing executable argument from the rest of cases seems beneficial. I find it in somewhat more strict comparing with a boolean flag. But it's not really required for MAP_JIT. >> >> From interface implementation point of view, it's convenient to have a single interface with as many parameters as possible, even excessive and unused. It will enable tricky cases handling inside the implementation. executable_memory API is artificial separation in this case. It will be necessary if some combination of parameters are impossible to implement, but it's not our case, so we can live without it. >> >>> I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. >> >> I'm not sure, what is the aim of the simplification below? Now access to uncommitted memory will cause a trap, just like on other platforms. >> >>> I do not know how MacOS memory overcommit works in detail. But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important on MacOS delaying memory commit really is. 
If its not important, then we could just do this: >>> >>> reserve_memory : >>> - not executable: mmap MAP_NORESERVE, PROT_NONE >>> - executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) >>> >>> commit_memory >>> - not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE >>> - executable: (return, nothing to do) >>> >>> uncommit_memory >>> - not executable: mmap MAP_NORESERVE, PROT_NONE >>> - executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) >>> >> >>> Furthermore, about uncommit: I wonder whether madvice(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. >> >> madvise(FREE) is not sufficient unfortunately. For executable memory, it's best we can do. But we should not use it for regular memory. >> >>> In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under #1161). But page size is irrelevant for reserve_memory - which reserves per default just with os::vm_page_size() - but for reserve_memory_special we should specify both. >> >> If we talk about reveting to 114d9cf, there is no change beyond extra boolean argument, right? > >> > Unfortunately I am not sure anymore that a separate API for reserving code is practical. See #1153 (https://bugs.openjdk.java.net/browse/JDK-8256155). People want to be able to use large paged memory for code. Large paged memory gets allocated via os::reserve_memory_special(). >> >> I think os::reserve_memory_special is substantially different, it does not allow commit/uncommit. 
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/virtualspace.cpp#L170 basically defines it this way. I'm mostly concerned with interface for executable memory with commit. > > That was not my point. > > So you introduce os::reserve_executable_memory() - a new API which allocates small paged executable memory. If you want large paged executable memory you still need to call os::reserve_memory_special, which still needs to retain its executable parameter. So this new API does not cover all use cases for executable. Unless you also add a new os::reserve_executable_memory_special(). This is what I meant with doubling the API numbers. > >> And it's interesting that the only user of executable_memory are ReservedSpace and VirtualSpace. Removing executable argument from the rest of cases seems beneficial. I find it in somewhat more strict comparing with a boolean flag. But it's not really required for MAP_JIT. > > I'd like to keep the os::xxx APIs indepent from their use cases in ReservedSpace. In other words, they should stay consistent in themselves. > >> >> From interface implementation point of view, it's convenient to have a single interface with as many parameters as possible, even excessive and unused. It will enable tricky cases handling inside the implementation. executable_memory API is artificial separation in this case. It will be necessary if some combination of parameters are impossible to implement, but it's not our case, so we can live without it. >> >> > I wonder whether we could simplify things, if we let go of the notion that the code heap gets only committed on demand. >> >> I'm not sure, what is the aim of the simplification below? > > To remove the coding depending on executable-ness from commit and uncommit. > >> Now access to uncommitted memory will cause a trap, just like on other platforms. > > Sorry, you lost me there. Where would I get a trap? 
> > My point was, for executable=true: > - on reserve, commit executable memory right away instead of on-demand committing later > - on commit and uncommit, just do nothing > > On commit, the OS makes sure the memory is underlayed with swap space. The memory also counts toward the commit charge of the process. What effects this has in practice highly depends on the OS. Hence my question. For the code heap, it may not matter that much. > > And ignoring uncommit for executable memory was based on your observation that it did not show observable effects anyway for you, and that code heap does not shrink. > >> >> > I do not know how MacOS memory overcommit works in detail. But on Linux, committing memory increases process footprint toward the commit charge limit, and may need swap space, but it does not increase RSS as long as the memory is not touched. I do not know how important on MacOS delaying memory commit really is. If its not important, then we could just do this: >> > reserve_memory : >> > >> > * not executable: mmap MAP_NORESERVE, PROT_NONE >> > * executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) >> > >> > commit_memory >> > >> > * not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE >> > * executable: (return, nothing to do) >> > >> > uncommit_memory >> > >> > * not executable: mmap MAP_NORESERVE, PROT_NONE >> > * executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) >> >> > Furthermore, about uncommit: I wonder whether madvice(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. 
>> >> madvise(FREE) is not sufficient unfortunately. For executable memory, it's best we can do. But we should not use it for regular memory. > > How so? What is the difference? > >> >> > In general, I think in the API we need a separation between page size and alignment (both have been confused in the past, see discussion under #1161). But page size is irrelevant for reserve_memory - which reserves per default just with os::vm_page_size() - but for reserve_memory_special we should specify both. >> >> If we talk about reveting to [114d9cf](https://github.com/openjdk/jdk/commit/114d9cffd62cab42790b65091648fe75345c4533), there is no change beyond extra boolean argument, right? > > Yes, I think so. > > > reserve_memory : > > > > > > * not executable: mmap MAP_NORESERVE, PROT_NONE > > > * executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) > > > > > > commit_memory > > > > > > * not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE > > > * executable: (return, nothing to do) > > > > > > uncommit_memory > > > > > > * not executable: mmap MAP_NORESERVE, PROT_NONE > > > * executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) > > > > > I'm not sure, what is the aim of the simplification [above]? > > To remove the coding depending on executable-ness from commit and uncommit. Sorry, how can e.g uncommit choose executable or not-executable case, if executable parameter is not provided? --- > > Now access to uncommitted memory will cause a trap, just like on other platforms. > > Sorry, you lost me there. Where would I get a trap? I mean, after a call to pd_uncommit_memory on linux the memory mprotected with PROT_NONE. Any access to that region will generate a signal. 
--- > > My point was, for executable=true: > > * on reserve, commit executable memory right away instead of on-demand committing later > > * on commit and uncommit, just do nothing As far as I understand you propose to remove lines 2010 - 2014 https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2010 but later you suggest (in a different context, but the statement is correct) > The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this The current implementation does mprotect(NONE). madvise(FREE) is not accounted immediately, but this hint is better than nothing. --- > > > Furthermore, about uncommit: I wonder whether madvise(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. > > > > > > madvise(FREE) is not sufficient unfortunately. For executable memory, it's the best we can do. But we should not use it for regular memory. > > How so? What is the difference? madvise is really a hint and doesn't have the exact effect of a real uncommit. A real, immediately accountable uncommit is the mmap call at https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2015. As I cannot do this for executable memory, I use madvise as an alternative to nothing. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From zgu at openjdk.java.net Thu Dec 3 22:00:13 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 3 Dec 2020 22:00:13 GMT Subject: RFR: 8255019: Shenandoah: Split STW and concurrent mark into separate classes [v18] In-Reply-To: References: Message-ID: > This is the first part of a refactoring that aims to isolate the three Shenandoah GC modes (concurrent, degenerated and full gc). > > Shenandoah started with two GC modes, concurrent and full gc, with minimal shared code, mainly in the mark phase. After introducing degenerated GC, it shared quite a large portion of code with concurrent GC, with the concept that degenerated GC can simply pick up remaining work of concurrent GC in STW mode. > > It was not a big problem at that time, since concurrent GC also processed roots STW. Since Shenandoah gradually moved root processing into the concurrent phase, the code started to diverge, which made it hard to reason about and maintain. > > First step, I would like to split STW and concurrent mark, so that: > 1) Code does not have to special-case STW and concurrent mark. > 2) STW mark does not need to rendezvous workers between root mark and the rest of mark > 3) STW mark does not need to activate SATB barrier and drain SATB buffers. > 4) STW mark does not need to remark some of roots. > > The patch mainly just shuffles code. It creates a base class ShenandoahMark, and moves shared code (from the current shenandoahConcurrentMark) into this base class. I did 'git mv shenandoahConcurrentMark.inline.hpp shenandoahMark.inline.hpp', but git does not seem to reflect that. > > A few changes: > 1) Moved task queue set from ShenandoahConcurrentMark to ShenandoahHeap. ShenandoahMark and its subclasses are stateless. Instead, mark states are maintained in task queue, mark bitmap and SATB buffers, so that they can be created on demand. 
> 2) Split ShenandoahConcurrentRootScanner template to ShenandoahConcurrentRootScanner and ShenandoahSTWRootScanner > 3) Split code inside op_final_mark code into finish_mark and prepare_evacuation helper functions. > 4) Made ShenandoahMarkCompact stack allocated (as well as ShenandoahConcurrentGC and ShenandoahDegeneratedGC in upcoming refactoring) Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'master' into JDK-8255019-sh-mark - Silent valgrind on potential memory leak - Merge branch 'master' into JDK-8255019-sh-mark - Removed ShenandoahConcurrentMark parameter from concurrent GC entry/op, etc. - Merge branch 'master' into JDK-8255019-sh-mark - Merge - Moved task queues to marking context - Merge - Merge branch 'master' into JDK-8255019-sh-mark - Merge branch 'master' into JDK-8255019-sh-mark - ... and 15 more: https://git.openjdk.java.net/jdk/compare/7c7facc2...c16fd77c ------------- Changes: https://git.openjdk.java.net/jdk/pull/1009/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=17 Stats: 1957 lines in 22 files changed: 1072 ins; 747 del; 138 mod Patch: https://git.openjdk.java.net/jdk/pull/1009.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1009/head:pull/1009 PR: https://git.openjdk.java.net/jdk/pull/1009 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 3 23:18:57 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 3 Dec 2020 23:18:57 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v3] In-Reply-To: <8f-BJdFip5yf0Rv4uw-qcXVk2uM3Lb6Hrq9VPR6UzF4=.04966477-8834-4fb9-aa77-8da86f104176@github.com> References: <-xtX9qSJHuD-qfp52XPToKhkl1HypRmNFHCJaupaync=.99285cd9-69a6-42cd-84b4-3c87fefc2cc5@github.com> 
<8f-BJdFip5yf0Rv4uw-qcXVk2uM3Lb6Hrq9VPR6UzF4=.04966477-8834-4fb9-aa77-8da86f104176@github.com> Message-ID: <-34e3icSZgu8A4-oTQcGCdLuQYWFFWNKe_btQifE7IY=.fa92e2f8-f2a7-4438-959c-ce544b644c05@github.com> On Thu, 19 Nov 2020 08:19:59 GMT, Stefan Johansson wrote: > > Hi Stefan, > > Thanks so much for your review. > > > Hi and welcome :) > > > I haven't started reviewing the code in detail but a first quick glance raised a couple of questions/comments: > > > > > > * Why do we have a special case for `exec` when selecting a large page size? > > > > > > To my knowledge 2M is the smallest large pages size supported by Linux at the moment. Hardcoding 2M pages was an attempt to simplify the reservation of code memory using LargePages. In most cases currently code memory is reserved in default page size of the system when using 1G LargePages because it does not require 1G or larger reservations. In modern Linux variants default page size seems to be 4k on x86_64. In other architectures it could be up to 64k. The purpose of the patch is to enable the use of smaller LargePages for reservations less than 1G when LargePages are enabled and 1G is set as LargePageSizeInBytes, so as not to fall back to 4k-64k pages for these reservations. > > Perhaps I should just select the page size <= bytes requested and remove 'exec' special case. > > Yes, I see no reason to keep that special case and we want to keep this code as general as possible. Looking at the code in `os::Linux::find_default_large_page_size()` it looks like S390 supports 1M large pages, so we cannot assume 2M. I suggest using a technique similar to the one used in `os::Linux::find_large_page_size` to find supported page sizes. If you scan `/sys/kernel/mm/hugepages` and populate `_page_sizes` using the information found we know we only get supported sizes. > > > > * If we need the special case, `exec_large_page_size()` should not be hard code to return 2m but rather use `os::default_large_page_size()`. 
> > > > > > os::default_large_page_size() will not necessarily be small enough for code memory reservations if the os::default_large_page_size() = 1G, in those cases we would get 4k on most linux x86_64 variants. My attempt is to ensure the smallest large_page_size availabe is used for code memory reservations. Perhaps my 2M hardcoding was a mistake and I should discover this size and select it based on the bytes being reserved. > > You are correct that the default size might indeed be 1G, so using something like I suggest above to figure out the available page sizes and then using an appropriate one given the size of the mapping sounds good. > Agreed. > Please also avoid force pushing changes to open PRs since it makes it harder to follow what changes between updates. It is fine for a PR to contain multiple commits and if you need to update with things from the main branch you should merge rather than rebase. > Thanks for letting me know. I'll work in that workflow from now on. In other open source communities I've worked in we used a gerrit (example: [gerrit.onap.org](https://gerrit.onap.org/)) workflow where each patch-set was tracked by gerrit using a change-id `Change-Id: I51625afb91451ee95c051c4edc8e7c30589f3831`. Using this workflow we would avoid merges and would use rebasing for simplicity and to avoid other workflow/CI issues and gerrit maintains change history and comments etc. As you say, it looks like merging and flattening into one commit is the way these pull requests get into openjdk/jdk master. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 3 23:24:56 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 3 Dec 2020 23:24:56 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v3] In-Reply-To: References: <-xtX9qSJHuD-qfp52XPToKhkl1HypRmNFHCJaupaync=.99285cd9-69a6-42cd-84b4-3c87fefc2cc5@github.com> <8f-BJdFip5yf0Rv4uw-qcXVk2uM3Lb6Hrq9VPR6UzF4=.04966477-8834-4fb9-aa77-8da86f104176@github.com> Message-ID: On Sun, 29 Nov 2020 08:17:09 GMT, Thomas Stuefe wrote: > One thing I stumbled upon while looking at this code is why the CodeHeap always wants to have at least 8 pages covering its range: > > ``` > // If large page support is enabled, align code heaps according to large > // page size to make sure that code cache is covered by large pages. > const size_t alignment = MAX2(page_size(false, 8), (size_t) os::vm_allocation_granularity()); > ``` > > which means that for a wish pagesize of 1G, the code cache would have to cover at least 8G. I am not even sure this is possible, isn't it limited to 4G? > > Anyway, they don't uncommit. And the comment in codecache.cpp indicates this is to be able to step-wise commit, but with huge pages the space is committed right from the start anyway. So I do not see what good these 8 pages do. If we allowed the CodeCache to use just one page, it could be 1G in size and use a single 1G page. > > Note that there are similar min_page_size requests in GC, but I did not look closer into them. > > Also, this does not take away the usefulness of this proposal. Interesting. I'll look at min_page_size requests in GC and codecache in relation to large pages and see what kind of optimization can be done in another JDK-Issue/PR. 
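[Editor's note: Thomas' concern about the minimum page count can be made concrete. A sketch of the selection logic, with assumed values and an illustrative name (`page_size_for_region` mirrors the idea, not the exact HotSpot function): requiring at least `min_pages` pages means a page size is only usable when `page_size * min_pages` fits in the region, so with only 1G pages available and `min_pages == 8`, any code cache smaller than 8G falls back to base pages — exactly the effect discussed above.

```cpp
#include <cassert>
#include <cstddef>

// Pick the largest supported page size that still divides the region
// into at least min_pages pages; fall back to the 4k base page size.
// sizes[] is assumed sorted largest-first.
static size_t page_size_for_region(size_t region_size, size_t min_pages,
                                   const size_t* sizes, int n) {
  for (int i = 0; i < n; i++) {
    if (sizes[i] * min_pages <= region_size) {
      return sizes[i];
    }
  }
  return 4 * 1024;  // base page fallback
}
```

With sizes {1G, 2M}: a 240M code cache with min_pages = 8 gets 2M pages (1G would need an 8G cache), which is why dropping the 8-page requirement would let a 1G cache use a single 1G page.]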
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 3 23:48:15 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 3 Dec 2020 23:48:15 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v4] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: - Adress Comments, Rework changes for PagesizeSet Signed-off-by: Marcus G K Williams - JDK-8257588: Make os::_page_sizes a bitmask #1522 - Merge branch 'master' into update_hlp - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams - Merge remote-tracking branch 'upstream/master' into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/57e54963..0901e70e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=02-03 Stats: 2740 lines in 107 files changed: 1814 ins; 632 del; 294 mod Patch: 
https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 3 23:53:56 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 3 Dec 2020 23:53:56 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v4] In-Reply-To: References: <-xtX9qSJHuD-qfp52XPToKhkl1HypRmNFHCJaupaync=.99285cd9-69a6-42cd-84b4-3c87fefc2cc5@github.com> <8f-BJdFip5yf0Rv4uw-qcXVk2uM3Lb6Hrq9VPR6UzF4=.04966477-8834-4fb9-aa77-8da86f104176@github.com> Message-ID: On Thu, 3 Dec 2020 23:22:17 GMT, Marcus G K Williams wrote: >> One thing I stumbled upon while looking at this code is why the CodeHeap always wants to have at least 8 pages covering its range: >> >> // If large page support is enabled, align code heaps according to large >> // page size to make sure that code cache is covered by large pages. >> const size_t alignment = MAX2(page_size(false, 8), (size_t) os::vm_allocation_granularity()); >> >> which means that for a wish pagesize of 1G, the code cache would have to cover at least 8G. I am not even sure this is possible, isn't it limited to 4G? >> >> Anyway, they don't uncommit. And the comment in codecache.cpp indicates this is to be able to step-wise commit, but with huge pages the space is committed right from the start anyway. So I do not see what good these 8 pages do. If we allowed the CodeCache to use just one page, it could be 1G in size and use a single 1G page. >> >> Note that there are similar min_page_size requests in GC, but I did not look closer into them. >> >> Also, this does not take away the usefulness of this proposal. 
> >> One thing I stumbled upon while looking at this code is why the CodeHeap always wants to have at least 8 pages covering its range: >> >> ``` >> // If large page support is enabled, align code heaps according to large >> // page size to make sure that code cache is covered by large pages. >> const size_t alignment = MAX2(page_size(false, 8), (size_t) os::vm_allocation_granularity()); >> ``` >> >> which means that for a wish pagesize of 1G, the code cache would have to cover at least 8G. I am not even sure this is possible, isn't it limited to 4G? >> >> Anyway, they don't uncommit. And the comment in codecache.cpp indicates this is to be able to step-wise commit, but with huge pages the space is committed right from the start anyway. So I do not see what good these 8 pages do. If we allowed the CodeCache to use just one page, it could be 1G in size and use a single 1G page. >> >> Note that there are similar min_page_size requests in GC, but I did not look closer into them. >> >> Also, this does not take away the usefulness of this proposal. > > Interesting. I'll look at min_page_size requests in GC and codecache in relation to large pages and see what kind of optimization can be done in another JDK-Issue/PR. Recent push is dependent on and includes #1522 - when it is updated, I will update here. Removed 2M/exec specific code. Re-wrote to take advantage of #1522. Attempted to address other comments. Please let me know if I've missed something. My apologies this took so long to update, we had a long holiday weekend. Thanks again for all of the review @kstefanj and @tstuefe! 
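[Editor's note: the selection rule the PR describes — take the largest large page size in `_page_sizes` that is not bigger than the reservation — can be sketched in a few lines. `select_large_page_size` is an illustrative name; the real patch works against HotSpot's page-size bookkeeping from #1522.

```cpp
#include <cassert>
#include <cstddef>

// From the supported large page sizes (sorted largest-first), return the
// largest one that fits within the requested reservation, or 0 if no
// large page fits and the caller must fall back to small pages.
static size_t select_large_page_size(size_t bytes, const size_t* sizes, int n) {
  for (int i = 0; i < n; i++) {
    if (sizes[i] <= bytes) {
      return sizes[i];
    }
  }
  return 0;
}
```

So with {1G, 2M} configured, a 48M code heap reservation gets 2M pages instead of falling all the way back to 4k, which is the use case motivating the change.]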
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Fri Dec 4 00:01:02 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Fri, 4 Dec 2020 00:01:02 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v4] In-Reply-To: References: Message-ID: <4RB8I8eA6hYo_ORAUdzd9EtIMAhIi_3G1689-5DKndE=.811d784a-a81d-4ca2-8c87-5540cca5d585@github.com> On Thu, 3 Dec 2020 23:48:15 GMT, Marcus G K Williams wrote: >> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using >> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). >> >> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). >> >> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. > > Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: > > - Adress Comments, Rework changes for PagesizeSet > > Signed-off-by: Marcus G K Williams > - JDK-8257588: Make os::_page_sizes a bitmask #1522 > - Merge branch 'master' into update_hlp > - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp > - Add 2M LargePages to _page_sizes > > Use 2m pages for large page requests > less than 1g on linux when 1G are default > pages > > - Add os::Linux::large_page_size_2m() that > returns 2m as size > - Add os::Linux::select_large_page_size() to return > correct large page size for size_t bytes > - Add 2m size to _page_sizes array > - Update reserve_memory_special methods > to set/use large_page_size based on bytes reserved > - Update large page not reserved warnings > to include large_page_size attempted > - Update TestLargePageUseForAuxMemory.java > to expect 2m large pages in some instances > > Signed-off-by: Marcus G K Williams > - Merge remote-tracking branch 'upstream/master' into update_hlp > - Add 2M LargePages to _page_sizes > > Use 2m pages for large page requests > less than 1g on linux when 1G are default > pages > > - Add os::Linux::large_page_size_2m() that > returns 2m as size > - Add os::Linux::select_large_page_size() to return > correct large page size for size_t bytes > - Add 2m size to _page_sizes array > - Update reserve_memory_special methods > to set/use large_page_size based on bytes reserved > - Update large page not reserved warnings > to include large_page_size attempted > - Update TestLargePageUseForAuxMemory.java > to expect 2m large pages in some instances > > Signed-off-by: Marcus G K Williams src/hotspot/os/linux/os_linux.cpp line 3900: > 3898: err); \ > 3899: } while (0) > 3900: Remove this remnant of UseSHM changes. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Fri Dec 4 00:07:12 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Fri, 4 Dec 2020 00:07:12 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Remove remnant UseSHM change Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/0901e70e..b5bd144d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=03-04 Stats: 13 lines in 1 file changed: 0 ins; 10 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From jiefu at openjdk.java.net Fri Dec 4 03:57:13 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 4 Dec 2020 03:57:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v4] In-Reply-To: References: Message-ID: > Hi all, > > Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. > For example, this assert [1] fired on our testing boxes. > > It can be reproduced by the following two steps on Linux-64: > 1) ulimit -v 8388608 > 2) java -XX:MinHeapSize=5g -version > The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. > > One more important fact is that this bug can be more common on Linux-32 systems. > Since the virtual memory is limited to 3800M [3] on Linux-32, it can be always reproduced when MinHeapSize > 1900M. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. 
> Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Don't check the exit value - Update the test - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8257230 - Merge branch 'master' into JDK-8257230 - Refinement & jtreg test - Merge branch 'master' into JDK-8257230 - Merge branch 'master' into JDK-8257230 - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1492/files - new: https://git.openjdk.java.net/jdk/pull/1492/files/92208d48..2c6ff74c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=02-03 Stats: 1035 lines in 91 files changed: 566 ins; 345 del; 124 mod Patch: https://git.openjdk.java.net/jdk/pull/1492.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1492/head:pull/1492 PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Fri Dec 4 04:11:57 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 4 Dec 2020 04:11:57 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> References: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> Message-ID: On Thu, 
3 Dec 2020 13:47:23 GMT, Stefan Johansson wrote: >> Jie Fu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8257230 >> - Refinement & jtreg test >> - Merge branch 'master' into JDK-8257230 >> - Merge branch 'master' into JDK-8257230 >> - 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes > > Took a closer look at the test now, some comment below. Hi @kstefanj , Thanks for your review. The fix has been updated according to your comments. I don't check the exit value because I found different platforms may return different values (Zero on MacOS, but non-zero on Linux). Maybe it should return 0 on Linux too, which I think is more reasonable. And I'd like to file another bug to fix it in the future. Thanks. Best regards, Jie ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From thomas.stuefe at gmail.com Fri Dec 4 07:31:44 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 4 Dec 2020 08:31:44 +0100 Subject: linux, large page initialization code question Message-ID: Hi, looking at the large-page initialization code, in os::Linux::find_default_large_page_size(). I see the following segment: ``` // If we can't determine the value (e.g. /proc is not mounted, or the text // format has been changed), we'll use the largest page size supported by // the processor. #ifndef ZERO large_page_size = AARCH64_ONLY(2 * M) AMD64_ONLY(2 * M) ARM32_ONLY(2 * M) IA32_ONLY(4 * M) IA64_ONLY(256 * M) PPC_ONLY(4 * M) S390_ONLY(1 * M); #endif // ZERO ``` This seems so strange: - can we simply assume a huge page size if the proc file system gives no indication for it? 
- planning on the proc file system format changing is probably unnecessary; it's well defined and will probably never change in an incompatible way. - and if /proc is not mounted, a lot of other things would not work, no? Is that even possible? I never saw a linux box without /proc - and why disabled for ZERO? What is the story behind this? The original change is lost to me (the earliest I looked into was jdk8), and the #ifndef zero change came in with "6890308: integrate zero assembler hotspot changes" without any explanation I could find. Bottom line, do we still need it? I think just assuming a large page size if the system is not configured for it is wrong. Thanks, Thomas From fweimer at redhat.com Fri Dec 4 08:30:24 2020 From: fweimer at redhat.com (Florian Weimer) Date: Fri, 04 Dec 2020 09:30:24 +0100 Subject: linux, large page initialization code question In-Reply-To: ("Thomas =?utf-8?Q?St=C3=BCfe=22's?= message of "Fri, 4 Dec 2020 08:31:44 +0100") References: Message-ID: <87y2ieq8gv.fsf@oldenburg2.str.redhat.com> * Thomas St?fe: > - and if /proc is not mounted, a lot of other things would not work, no? Is > that even possible? I never saw a linux box without /proc It used to work on uniprocessor CPUs only, due to this glibc bug (if you want to call it that): Before the fix, the JVM would crash pretty quickly on multi-processor systems because necessary barriers were missing. With the glibc fix (which went into glibc 2.26), it seemed to work, based on my testing at the time. Whether that makes sense is a different question, of course. 
Thanks, Florian -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill From tschatzl at openjdk.java.net Fri Dec 4 08:42:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 4 Dec 2020 08:42:58 GMT Subject: Integrated: 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 08:52:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this little change to strengthen the requirements for calling G1HeapVerifier::verify(VerifyOption)? > > In particular, instead of the failed attempt to abort verification if we are not at a safepoint, assert that we are at a safepoint. > > Testing: tier1-5 with no failures > > Thanks, > Thomas This pull request has now been integrated. Changeset: ca402671 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/ca402671 Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod 8257509: Strengthen requirements to call G1HeapVerifier::verify(VerifyOption) Reviewed-by: sjohanss, ayang ------------- PR: https://git.openjdk.java.net/jdk/pull/1590 From sjohanss at openjdk.java.net Fri Dec 4 09:17:55 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 4 Dec 2020 09:17:55 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: References: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> Message-ID: On Fri, 4 Dec 2020 04:09:14 GMT, Jie Fu wrote: >> Took a closer look at the test now, some comment below. > > Hi @kstefanj , > > Thanks for your review. > > The fix has been updated according to your comments. 
> > I don't check the exit value because I found different platforms may return different values (Zero on MacOS, but non-zero on Linux). > Maybe, It should return 0 on Linux too, which I think is more reasonable. > And I'd like to file another bug to fix it in the future. > > Thanks. > Best regards, > Jie It should be 0 on Linux and after the addition of `@requires os.family == "linux"` it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From kbarrett at openjdk.java.net Fri Dec 4 09:42:05 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 4 Dec 2020 09:42:05 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase Message-ID: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Please review this reimplementation of WeakProcessorPhase. It is changed to a scoped enum at namespace scope, and uses the recently added EnumIterator facility to provide iteration, rather than a bespoke iterator class. This is a step toward eliminating it entirely. I've split it out into a separate PR to make the review of the follow-up work a bit easier. As part of this the file weakProcessorPhases.hpp is renamed to weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a rename and (majorly) edit, instead treating it as a remove and add a new file. 
Testing: mach5 tier1 ------------- Commit messages: - simplify phases and use enum class Changes: https://git.openjdk.java.net/jdk/pull/1620/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1620&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257676 Stats: 219 lines in 8 files changed: 38 ins; 171 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/1620.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1620/head:pull/1620 PR: https://git.openjdk.java.net/jdk/pull/1620 From jiefu at openjdk.java.net Fri Dec 4 09:42:56 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 4 Dec 2020 09:42:56 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: References: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> Message-ID: On Fri, 4 Dec 2020 09:14:53 GMT, Stefan Johansson wrote: > It should be 0 on Linux and after the addition of `@requires os.family == "linux"` it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable. Hi @kstefanj , The test is used to check whether the assert is triggered. Before the fix, it failed. After the fix, it passed. As I mentioned above, there seems to be another bug on Linux. It does return 0 on MacOS. And I also think it should return 0 on Linux. I'll file another bug to fix it. What do you think? Thanks. 
Best regards, Jie ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From sjohanss at openjdk.java.net Fri Dec 4 10:16:18 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 4 Dec 2020 10:16:18 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: References: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> Message-ID: <-lpdukPLhMHF_rrpGiUpg81Gaa8S_u9IycRmwd30I3I=.32943dab-d48f-488b-aa3f-c77c166f4063@github.com> On Fri, 4 Dec 2020 09:40:16 GMT, Jie Fu wrote: >> It should be 0 on Linux and after the addition of `@requires os.family == "linux"` it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable. > >> It should be 0 on Linux and after the addition of `@requires os.family == "linux"` it should only be run on Linux. Doing some manual runs show that the JVM can't start with an ulimit as low as in the test. If the startup in the test is not successful I don't think the test has any value, so we need to find values that make it reliable. > > Hi @kstefanj , > > The test is used to check whether the assert is triggered. > Before the fix, it failed. > After the fix, it passed. > > As I mentioned above, there seems to be another bug on Linux. > It does return 0 on MacOS. > And I also think it should return 0 on Linux. > I'll file another bug to fix it. > > What do you think? > > Thanks. > Best regards, > Jie Yes it might check that the assert doesn't trigger, but if the test is not robust enough to always manage to execute `java -version` we might start to see other failures in that test. In some sense the test is just lucky that it fails in a way that a hs_err-file is not created. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From stuefe at openjdk.java.net Fri Dec 4 12:18:15 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 4 Dec 2020 12:18:15 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 20:26:23 GMT, Anton Kozlov wrote: > > > > > > madvise(FREE) is not sufficient unfortunately. For executable memory, it's best we can do. But we should not use it for regular memory. > > > > > > How so? What is the difference? > > madvise is really a hint and doesn't have exact effect as a real uncommit. A real, immediately accountable uncommit is a > > https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2015 > > . As I cannot do this for executable memory, I use madvise as an alternative to nothing. I may have not been clear, sorry. My point was: For uncommit, we seem to have the option of either: 1) mmap(MAP_FIXED|MAP_NORESERVE, PROT_NONE) 2) madvise(MADV_FREE) + mprotect(PROT_NONE) You do (1) for !exec, (2) for exec. Why? Either (2) works - reclaims memory for the OS. Then it can be used in all cases, exec or !exec. The commit would be a simple mprotect(PROT_RW). For uncommit, we would need no exec parameter. Or (2) does not work, as you claim. Then why bother at all? Interestingly, your initial proposal would have resulted in the following sequence of calls when committing executable memory: anon_mmap() -> mmap(MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS | MAP_JIT, PROT_NONE) commit_memory() -> mprotect(PROT_READ|PROT_WRITE|PROT_EXEC) Note how on commit_memory() we never clear the MAP_NORESERVE flag. And still commit works? And does not trap on access? Because if that works this is an indication that MAP_NORESERVE has no meaning on MacOS. 
I found nothing about MAP_NORESERVE in the kernel source you posted [1], in the MacOS manpage for mmap [2] nor the OpenBSD mmap manpage [3]. MAP_NORESERVE is a non-Posix extension, so I wonder if it gets even honored on MacOS or if they just provided the flag to avoid build errors. If MAP_NORESERVE has no meaning, we do not need to call mmap() for committing and uncommitting; mprotect, maybe combined with madvise(MADV_FREE), should suffice. Thanks, Thomas [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 [2] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html [3] https://man.openbsd.org/mmap.2 ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From jiefu at openjdk.java.net Fri Dec 4 12:31:29 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 4 Dec 2020 12:31:29 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: > Hi all, > > Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. > For example, this assert [1] fired on our testing boxes. > > It can be reproduced by the following two steps on Linux-64: > 1) ulimit -v 8388608 > 2) java -XX:MinHeapSize=5g -version > The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. > > One more important fact is that this bug can be more common on Linux-32 systems. > Since the virtual memory is limited to 3800M [3] on Linux-32, it can be always reproduced when MinHeapSize > 1900M. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks. 
> Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Check the exit status ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1492/files - new: https://git.openjdk.java.net/jdk/pull/1492/files/2c6ff74c..5017f63c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1492&range=03-04 Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/1492.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1492/head:pull/1492 PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Fri Dec 4 12:35:12 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 4 Dec 2020 12:35:12 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v3] In-Reply-To: <-lpdukPLhMHF_rrpGiUpg81Gaa8S_u9IycRmwd30I3I=.32943dab-d48f-488b-aa3f-c77c166f4063@github.com> References: <8Sh9g66n72t2BiS27DB0ucD-Do2zZU_Uo2Wxn97lows=.084f8bc4-f4e4-4ca2-a3ed-7320feeae03f@github.com> <-lpdukPLhMHF_rrpGiUpg81Gaa8S_u9IycRmwd30I3I=.32943dab-d48f-488b-aa3f-c77c166f4063@github.com> Message-ID: On Fri, 4 Dec 2020 10:13:19 GMT, Stefan Johansson wrote: > Yes it might check that the assert doesn't trigger, but if the test is not robust enough to always manage to execute `java -version` we might start to see other failures in that test. In some sense the test is just lucky that it fails in a way that a hs_err-file is not created. Hi @kstefanj , After some experiments, I finally got a configuration which can return 0 on Linux. 
Could you please review it again? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From sjohanss at openjdk.java.net Fri Dec 4 12:40:19 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 4 Dec 2020 12:40:19 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 12:31:29 GMT, Jie Fu wrote: >> Hi all, >> >> Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. >> For example, this assert [1] fired on our testing boxes. >> >> It can be reproduced by the following two steps on Linux-64: >> 1) ulimit -v 8388608 >> 2) java -XX:MinHeapSize=5g -version >> The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. >> >> One more important fact is that this bug can be more common on Linux-32 systems. >> Since the virtual memory is limited to 3800M [3] on Linux-32, it can be always reproduced when MinHeapSize > 1900M. >> >> Testing: >> - tier1 ~ tier3 on Linux/x64 >> >> Thanks. >> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 >> [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Check the exit status This looks better. I'll run it through our testing env to make sure it passes there as well. ------------- Marked as reviewed by sjohanss (Reviewer).
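The ulimit-based reproduction and the exit-status check being discussed can be sketched outside of jtreg with a small shell helper. This is a sketch only: `run_limited` is a hypothetical name, and the `java` invocation shown in the comment assumes a JDK on the PATH.

```shell
#!/usr/bin/env bash
# Run a command under a virtual-memory cap (in KB) inside a subshell, so
# the limit does not leak into the caller, then report its exit status.
run_limited() {
  local kb=$1; shift
  ( ulimit -v "$kb" 2>/dev/null; "$@" )
  echo "exit status: $?"
}

# The reproducer from the bug report would then be (assuming a JDK on PATH):
#   run_limited 8388608 java -XX:MinHeapSize=5g -version
# With the fix, the JVM is expected to exit cleanly instead of hitting the
# assert and dumping an hs_err file.

run_limited 1048576 true
```

The subshell is the important part: the `ulimit` cap applies only to the wrapped command, so a test harness can keep spawning further processes unrestricted.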
PR: https://git.openjdk.java.net/jdk/pull/1492 From akozlov at openjdk.java.net Fri Dec 4 15:33:14 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 4 Dec 2020 15:33:14 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 12:15:33 GMT, Thomas Stuefe wrote: >>> > > reserve_memory : >>> > > >>> > > * not executable: mmap MAP_NORESERVE, PROT_NONE >>> > > * executable: mmap MAP_JIT _without_ MAP_NORESERVE, PROT_READ|PROT_WRITE|PROT_EXEC (so its committed and accessible right away) >>> > > >>> > > commit_memory >>> > > >>> > > * not executable: mmap without MAP_NORESERVE, PROT_READ|PROT_WRITE >>> > > * executable: (return, nothing to do) >>> > > >>> > > uncommit_memory >>> > > >>> > > * not executable: mmap MAP_NORESERVE, PROT_NONE >>> > > * executable: (return, nothing to do, since you indicate that this is that memory does not get returned to the OS immediately) >>> > > >>> > I'm not sure, what is the aim of the simplification [above]? >>> >>> To remove the coding depending on executable-ness from commit and uncommit. >> >> Sorry, how can e.g uncommit choose executable or not-executable case, if executable parameter is not provided? >> >> --- >> >>> > Now access to uncommitted memory will cause a trap, just like on other platforms. >>> >>> Sorry, you lost me there. Where would I get a trap? >> >> I mean, after a call to pd_uncommit_memory on linux the memory mprotected with PROT_NONE. Any access to that region will generate a signal. 
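Anton's point — that after a Linux-style uncommit any touch of the range traps — can be demonstrated with a self-contained sketch (plain POSIX, not HotSpot code; the function name is mine):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

static sigjmp_buf jump_env;

static void on_segv(int sig) {
    (void)sig;
    siglongjmp(jump_env, 1); /* escape from the faulting access */
}

/* Returns 1 if touching the range after "uncommit" trapped, 0 if it did
 * not, -1 on setup failure. */
int demo_uncommit_traps(void) {
    size_t len = 4096;

    /* Reserve and commit one page read/write, then touch it. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 42;

    /* Uncommit Linux-style: replace the mapping with a PROT_NONE,
     * MAP_NORESERVE one. The old content is discarded and the pages are
     * returned to the OS. */
    if (mmap(p, len, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
             -1, 0) == MAP_FAILED)
        return -1;

    /* Any access to the uncommitted range now generates a signal. */
    signal(SIGSEGV, on_segv);
    int trapped;
    if (sigsetjmp(jump_env, 1) == 0) {
        volatile char *vp = p;
        char c = vp[0]; /* faults: the page is PROT_NONE */
        (void)c;
        trapped = 0;
    } else {
        trapped = 1;
    }
    signal(SIGSEGV, SIG_DFL);
    munmap(p, len);
    return trapped;
}
```

On Linux the read faults and the function returns 1; the SIGSEGV/siglongjmp dance exists only so the demo can observe the trap instead of dying. The MAP_FIXED remap in the middle is exactly the call that cannot carry MAP_JIT on macOS.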
>> >> --- >> >>> >>> My point was, for executable=true: >>> >>> * on reserve, commit executable memory right away instead of on-demand committing later >>> * on commit and uncommit, just do nothing >> >> As far as I understand you propose to remove lines 2010 - 2014 https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2010 >> >> but later you suggest (in different context, but the statement is correct) >> >>> The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this >> >> The current implementation does mprotect(NONE). madvice(FREE) is not accounted immediately, but this hint is better than nothing. >> >> --- >> >>> > > Furthermore, about uncommit: I wonder whether madvice(MADV_FREE) would alone be already sufficient to release the memory. I have no Mac and cannot test this. The range would still be accessible though, but combining that with mprotect(PROT_NONE) should take care of this. Then we could just in general avoid the mmap(MAP_NORESERVE|MAP_FIXED) call. Then we do not need the exec parameter for uncommit at all. >>> > >>> > >>> > madvise(FREE) is not sufficient unfortunately. For executable memory, it's best we can do. But we should not use it for regular memory. >>> >>> How so? What is the difference? >> >> madvise is really a hint and doesn't have exact effect as a real uncommit. A real, immediately accountable uncommit is a https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2015. As I cannot do this for executable memory, I use madvise as an alternative to nothing. > >> > > >> > > madvise(FREE) is not sufficient unfortunately. For executable memory, it's best we can do. But we should not use it for regular memory. >> > >> > >> > How so? What is the difference? >> >> madvise is really a hint and doesn't have exact effect as a real uncommit. 
A real, immediately accountable uncommit is a >> >> https://github.com/openjdk/jdk/blob/114d9cffd62cab42790b65091648fe75345c4533/src/hotspot/os/bsd/os_bsd.cpp#L2015 >> >> . As I cannot do this for executable memory, I use madvise as an alternative to nothing. > > I may have not been clear, sorry. My point was: > > For uncommit, we seem to have the option of either: > > 1) mmap(MAP_FIXED|MAP_NORESERVE, PROT_NONE) > 2) madvise(MADV_FREE) + mprotect(PROT_NONE) > > You do (1) for !exec, (2) for exec. Why? > > Either (2) works - reclaims memory for the OS. Then it can be used in all cases, exec or !exec. The commit would be a simple mprotect(PROT_RW). For uncommit, we would need no exec parameter. > > Or (2) does not work, as you claim. Then why bother at all? > > Interestingly, your initial proposal would have resulted in the following sequence of calls when committing executable memory: > anon_mmap() -> mmap(MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS | MAP_JIT, PROT_NONE) > commit_memory() -> mprotect(PROT_READ|PROT_WRITE|PROT_EXEC) > Note how on commit_memory() we never clear the MAP_NORESERVE flag. And still commit works? And does not trap on access? Because if that works this is an indication that MAP_NORESERVE has no meaning on MacOS. > > I found nothing about MAP_NORESERVE in the kernel source you posted [1], in the MacOS manpage for mmap [2] nor the OpenBSD mmap manpage [3]. MAP_NORESERVE is a non-Posix extension, so I wonder if it gets even honored on MacOS or if they just provided the flag to avoid build errors. > > If MAP_NORESERVE has no meaning, we do not need to call mmap() for committing and uncommitting; mprotect, maybe combined with madvise(MADV_FREE), should suffice. > > Thanks, Thomas > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html > [3] https://man.openbsd.org/mmap.2 > 1. 
mmap(MAP_FIXED|MAP_NORESERVE, PROT_NONE) > 2. madvise(MADV_FREE) + mprotect(PROT_NONE) > > Or (2) does not work, as you claim. Then why bother at all? Right, (2) is an actual state. It does not work like we'd want to (at least not immediately accounted in RSS). But the OS provides this interface and claims to react in some way. > MADV_FREE Indicates that the application will not need the information contained in this address range, so the pages may be reused right away. The address range will remain valid. This is used with madvise() system call. It enables the OS release the memory sometime later, for example. In contrast, doing nothing will keep the memory that is garbage. > I wonder if [MAP_NORESERVE] gets even honored on MacOS Right, it's defined and not used, the only occurrence is https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/bsd/sys/mman.h#L116 The use of MAP_NORESRVE slipped in from the original BSD pd_commit_memory code. I can delete it, if it bothers. > If MAP_NORESERVE has no meaning, we do not need to call mmap() for committing and uncommitting; mprotect, maybe combined with madvise(MADV_FREE), should suffice. I could not follow, sorry. Why it's so? Thanks, Anton ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Fri Dec 4 16:15:14 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 4 Dec 2020 16:15:14 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 15:30:09 GMT, Anton Kozlov wrote: > > 1. mmap(MAP_FIXED|MAP_NORESERVE, PROT_NONE) > > 2. madvise(MADV_FREE) + mprotect(PROT_NONE) > > > > Or (2) does not work, as you claim. Then why bother at all? > > Right, (2) is an actual state. It does not work like we'd want to (at least not immediately accounted in RSS). But the OS provides this interface and claims to react in some way. 
> > > ``` > > MADV_FREE Indicates that the application will not need the information contained in this address range, so the pages may be reused right away. The address range will remain valid. This is used with madvise() system call. > > ``` > > It enables the OS release the memory sometime later, for example. In contrast, doing nothing will keep the memory that is garbage. > > > I wonder if [MAP_NORESERVE] gets even honored on MacOS > > Right, it's defined and not used, the only occurrence is https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/bsd/sys/mman.h#L116 > > The use of MAP_NORESRVE slipped in from the original BSD pd_commit_memory code. I can delete it, if it bothers. > > > If MAP_NORESERVE has no meaning, we do not need to call mmap() for committing and uncommitting; mprotect, maybe combined with madvise(MADV_FREE), should suffice. > > I could not follow, sorry. Why it's so? > Your original chain of thought, if I understand you correctly, was like this: - We want to provide MAP_JIT on a mapping. Apple tells us to. - On reserve, we add it to the initial mmap call. Easy. - But hotspot later - commit/uncommit - replaces that mapping again. With subsequent mmap calls. On those, MAP_FIXED is specified since the original mapping gets replaced. - But on those secondary mmap calls we cannot add MAP_JIT. Because it cannot be combined with MAP_FIXED. - Therefore we switch to: on commit, do mprotect(RW). On uncommit, madvise(MADV_FREE) + mprotect(NONE). Am I right so far? My thought is this: In existing code, the technique used to commit and uncommit memory is to switch the MAP_NORESERVE flag on the mapping, combined with a change in protection. E.g. mmap(MAP_NORESERVE, PROT_NONE) to uncommit. No documentation says this releases memory. But from experience we see it work on Linux - the kernel takes the hint that the pages are not needed anymore. Changing the MAP_NORESERVE flag is the only reason we have those subsequent mmap calls. 
Those that need MAP_FIXED. And conflict with MAP_JIT. If MAP_NORESERVE is a noop on Mac, the only other effect those mmap calls have is the protection change: in uncommit, the mmap(MAP_NORESERVE|MAP_FIXED, PROT_NONE) would be equivalent to a mprotect(PROT_NONE). Since today's os::uncommit_memory() works on MacOS (at least I assume it does??), then that PROT_NONE protection change must be the reason for it working, since the MAP_NORESERVE is a noop. So, if MAP_NORESERVE is a noop, there is no reason to have secondary calls to mmap(MAP_FIXED), which conflict with MAP_JIT. Then, the code you proposed for the exec=true path should work equally well for the standard path. That would make the coding a lot easier. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From mgronlun at openjdk.java.net Fri Dec 4 18:08:59 2020 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Fri, 4 Dec 2020 18:08:59 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) Message-ID: Greetings, please help review this enhancement to let JFR sample object allocations by default. A description is provided in the JIRA issue.
Thanks Markus ------------- Commit messages: - defensive initialization check - Whitespace errors - JFR Event Throttling Changes: https://git.openjdk.java.net/jdk/pull/1624/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257602 Stats: 2607 lines in 43 files changed: 2346 ins; 238 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From cjplummer at openjdk.java.net Fri Dec 4 20:18:12 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Fri, 4 Dec 2020 20:18:12 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 12:55:04 GMT, Per Liden wrote: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. > > However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." 
>> >> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). > > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. > > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. 
These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java line 194: > 192: debuggee.resume(); > 193: checkDebugeeAnswer_instances(className, baseInstances); > 194: debuggee.suspend(); Before the changes in this PR, what was triggering the (expected) collection of the objects? test/hotspot/jtreg/vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java line 85: > 83: array.disableCollection(); > 84: } catch (ObjectCollectedException e) { > 85: continue; Maybe add a comment: "Since the VM is not suspended, the object may have been collected before disableCollection() could be called on it. Just ignore and continue doing allocations until we run out of memory." 
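The underlying hazard — an object held only through a weak reference may be collected at any moment, while a strong reference pins it — can be sketched in plain Java, with `WeakReference` standing in for the JDWP back end's `JNIGlobalWeakRef` (illustrative names, not code from the PR):

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Stand-in for the JDWP back end's references: an object reachable only
// through a WeakReference plays the role of a JNIGlobalWeakRef; taking a
// strong reference plays the role of the "pin" done on VirtualMachine.suspend().
class WeakPinDemo {
    static boolean pinnedSurvivesGc() {
        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj);

        Object pin = weak.get(); // "strengthen": a strong ref keeps the object alive
        obj = null;              // drop the original reference

        System.gc();             // GC may run at any time...
        boolean alive = weak.get() != null; // ...but the pinned object survives

        Reference.reachabilityFence(pin);   // keep 'pin' live through the check
        return alive;            // after this, the object may be collected again
    }

    public static void main(String[] args) {
        System.out.println(pinnedSurvivesGc()
                ? "pinned object survived GC"
                : "pinned object was collected");
    }
}
```

Without the `pin` reference, `weak.get()` after a GC would usually return null — that is the window in which the tests see `ObjectCollectedException` between `newInstance()` and `disableCollection()`. `Reference.reachabilityFence` is needed because a local variable alone does not guarantee its referent stays reachable.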
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From cjplummer at openjdk.java.net Fri Dec 4 21:04:15 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Fri, 4 Dec 2020 21:04:15 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 12:55:04 GMT, Per Liden wrote: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. > > However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >> >> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). 
> > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. > > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. 
> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 632: > 630: if (weakRef == NULL) { > 631: EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef"); > 632: } I'm not so sure I agree that having a fatal error here is the right thing to do. The only other user of `weakenNode()` is `ObjectReference.disableCollection()`. It returns an error to the debugger if `weakenNode()` returns `NULL`. However, I'm not so sure that's a good thing to do here either, since it means the `VM.resume()` will need to fail. Possibly the error should just be ignored, and we live with the ref staying strong. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From akozlov at openjdk.java.net Fri Dec 4 21:22:15 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 4 Dec 2020 21:22:15 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: Message-ID: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> On Fri, 4 Dec 2020 16:12:24 GMT, Thomas Stuefe wrote: > > > 1. mmap(MAP_FIXED|MAP_NORESERVE, PROT_NONE) > > > 2. madvise(MADV_FREE) + mprotect(PROT_NONE) > > > > > > Or (2) does not work, as you claim. Then why bother at all? > > > > > > Right, (2) is an actual state. It does not work like we'd want to (at least not immediately accounted in RSS). But the OS provides this interface and claims to react in some way. > > > ``` > > > MADV_FREE Indicates that the application will not need the information contained in this address range, so the pages may be reused right away. The address range will remain valid. 
This is used with madvise() system call. > > > ``` > > > > > > It enables the OS release the memory sometime later, for example. In contrast, doing nothing will keep the memory that is garbage. > > > I wonder if [MAP_NORESERVE] gets even honored on MacOS > > > > > > Right, it's defined and not used, the only occurrence is https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/bsd/sys/mman.h#L116 > > The use of MAP_NORESRVE slipped in from the original BSD pd_commit_memory code. I can delete it, if it bothers. > > > If MAP_NORESERVE has no meaning, we do not need to call mmap() for committing and uncommitting; mprotect, maybe combined with madvise(MADV_FREE), should suffice. > > > > > > I could not follow, sorry. Why it's so? > > Your original chain of thought, if I understand you correctly, was like this: > > * We want to provide MAP_JIT on a mapping. Apple tells us to. > * On reserve, we add it to the initial mmap call. Easy. > * But hotspot later - commit/uncommit - replaces that mapping again. With subsequent mmap calls. On those, MAP_FIXED is specified since the original mapping gets replaced. > * But on those secondary mmap calls we cannot add MAP_JIT. Because it cannot be combined with MAP_FIXED. > * Therefore we switch to: on commit, do mprotect(RW). On uncommit, madvise(MADV_FREE) + mprotect(NONE). > > Am I right so far? That's correct. > > My thought is this: > > In existing code, the technique used to commit and uncommit memory is to switch the MAP_NORESERVE flag on the mapping, combined with a change in protection. I read this as MAP_NORESERVE is supposed to be an attribute of the mapping. I don't think this is true. On Linux it specifies whether the system can overcommit when satisfying this mapping (https://man7.org/linux/man-pages/man2/mmap.2.html) By default overcommit is enabled and MAP_NORESERVE is also a "noop" on Linux, i.e. it specifies the mode that is enabled for all mappings anyway.
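If MAP_NORESERVE really is a no-op, the mprotect-only commit/uncommit path discussed in this thread could look roughly like the following standalone sketch (plain POSIX; MAP_JIT itself is macOS-only and omitted so the sketch runs anywhere, and the function name is mine):

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* One mmap for the lifetime of the region (on macOS this is where MAP_JIT
 * would be passed), with commit/uncommit done purely by changing
 * protection - never by remapping with MAP_FIXED. Returns 0 on success. */
int demo_mprotect_commit(void) {
    size_t len = 1 << 20;

    /* reserve: inaccessible, nothing committed */
    char *base = mmap(NULL, len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* commit: only a protection change */
    if (mprotect(base, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memset(base, 0xAB, len);

    /* uncommit: hint that the pages may be reclaimed, then lock them out.
     * MADV_FREE is the variant discussed in the thread; the hint's failure
     * is ignored, since it is advisory only. */
#ifdef MADV_FREE
    (void)madvise(base, len, MADV_FREE);
#else
    (void)madvise(base, len, MADV_DONTNEED);
#endif
    if (mprotect(base, len, PROT_NONE) != 0)
        return -1;

    /* recommit is again just an mprotect */
    if (mprotect(base, len, PROT_READ | PROT_WRITE) != 0)
        return -1;

    return munmap(base, len);
}
```

Nothing here remaps with MAP_FIXED, so the original (possibly MAP_JIT) mapping survives for the region's whole lifetime; whether the madvise hint is enough to shrink RSS promptly is exactly the open question in the thread.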
This also means that you can drop the MAP_NORESERVE flag from uncommit and it will work anyway, on Linux and macOS. What definition of MAP_NORESERVE do you use? > E.g. mmap(MAP_NORESERVE, PROT_NONE) to uncommit. No documentation says this releases memory. But from experience we see it work on Linux - the kernel takes the hint that the pages are not needed anymore. It's not a hint. From https://man7.org/linux/man-pages/man2/mmap.2.html: > If the memory region specified by addr and len overlaps pages of any existing mapping(s), then the overlapped part of the existing mapping(s) *will be discarded.* So one of uncommit's tasks is to release the memory. > If MAP_NORESERVE is a noop on Mac, the only other effect those mmap calls have is the protection change: in uncommit, the mmap(MAP_NORESERVE|MAP_FIXED, PROT_NONE) would be equivalent to a mprotect(PROT_NONE). mmap(MAP_NORESERVE, ...) is equal to mmap(...), but it does not mean that mmap(MAP_NORESERVE) is a noop. Also, after you do mprotect(NONE), you can do mprotect(RWX) and get back the original content. But after you do mmap(FIXED) the memory content is discarded, you cannot reverse this operation. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From cjplummer at openjdk.java.net Fri Dec 4 21:26:12 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Fri, 4 Dec 2020 21:26:12 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <-2Yx99rM6jO7OHIzIlaHfdeojwgwwl7QthEqINqxiu4=.ae8922de-ea23-47c6-adff-3618cecd7eaf@github.com> On Fri, 4 Dec 2020 21:01:13 GMT, Chris Plummer wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`.
However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. 
>> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. 
> src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 632: > >> 630: if (weakRef == NULL) { >> 631: EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef"); >> 632: } > > I'm not so sure I agree that having a fatal error here is the right thing to do. The only other user of `weakenNode()` is `ObjectReference.disableCollection()`. It returns an error to the debugger if `weakenNode()` returns `NULL`. However, I'm not so sure that's a good thing to do here either, since it means the `VM.resume()` will need to fail. Possibly the error should just be ignored, and we live with the ref staying strong. Another option is to save away the weakref in the node when strengthening. This would benefit `ObjectReference.disableCollection()` also, since it would no longer need to deal with a potential OOM. However, I'm not so sure it's actually worth doing. Trying to keep the debug session alive while having allocation errors is probably a fool's errand. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From akozlov at openjdk.java.net Fri Dec 4 22:29:25 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 4 Dec 2020 22:29:25 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v7] In-Reply-To: References: Message-ID: <4mw_qwllDU7qLgqcm7Z_kxyGICpv18HZ_LrbidneSw4=.891574d8-45a6-4ecd-9dc9-be2070bdc3e6@github.com> > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. > > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps.
For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow the OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as the codecache does not shrink (if I haven't missed anything, by the implementation and in principle).
>
> Tested:
> * local tier1
> * jdk-submit
> * codesign[2] with hardened runtime and allow-jit but without
>   allow-unsigned-executable-memory entitlements[3] produce a working bundle.
>
> (adding GC group as suggested by @dholmes-ora)
>
> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227
> [2]
>
>     codesign \
>         --sign - \
>         --options runtime \
>         --entitlements ents.plist \
>         --timestamp \
>         $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib
>
> [3]
>
>     <?xml version="1.0" encoding="UTF-8"?>
>     <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
>     <plist version="1.0">
>     <dict>
>         <key>com.apple.security.cs.allow-jit</key>
>         <true/>
>         <key>com.apple.security.cs.disable-library-validation</key>
>         <true/>
>         <key>com.apple.security.cs.allow-dyld-environment-variables</key>
>         <true/>
>     </dict>
>     </plist>

Anton Kozlov has updated the pull request incrementally with three additional commits since the last revision:

 - Fix style
 - JDK-8234930 v4: Use MAP_JIT when allocating pages for code cache on macOS
 - Revert "Separate executable_memory interface"
   This reverts commit 49253d8fe8963ce069f10783dcea5327079ba848.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/49253d8f..b3eb5b01 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=05-06 Stats: 363 lines in 29 files changed: 46 ins; 175 del; 142 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From jiefu at openjdk.java.net Sat Dec 5 04:43:13 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 5 Dec 2020 04:43:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> References: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> Message-ID: On Mon, 30 Nov 2020 13:42:56 GMT, Thomas Schatzl wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Check the exit status > > I think the change is good, but please add a test for this. > > E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended. Hi @tschatzl , Are you OK with the latest change? Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From stuefe at openjdk.java.net Sat Dec 5 05:38:14 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 5 Dec 2020 05:38:14 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> References: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> Message-ID: On Fri, 4 Dec 2020 21:19:53 GMT, Anton Kozlov wrote:

> > Your original chain of thought, if I understand you correctly, was like this:
> >
> > * We want to provide MAP_JIT on a mapping. Apple tells us to.
> > * On reserve, we add it to the initial mmap call. Easy.
> > * But hotspot later - commit/uncommit - replaces that mapping again. With subsequent mmap calls. On those, MAP_FIXED is specified since the original mapping gets replaced.
> > * But on those secondary mmap calls we cannot add MAP_JIT. Because it cannot be combined with MAP_FIXED.
> > * Therefore we switch to: on commit, do mprotect(RW). On uncommit, madvise(MADV_FREE) + mprotect(NONE).
> >
> > Am I right so far?
>
> That's correct.
>
> > My thought is this:
> > In existing code, the technique used to commit and uncommit memory is to switch the MAP_NORESERVE flag on the mapping, combined with a change in protection.
>
> I read this as MAP_NORESERVE is supposed to be an attribute of the mapping. I don't think this is true. On Linux it specifies whether the system can overcommit in satisfying this mapping (https://man7.org/linux/man-pages/man2/mmap.2.html). By default overcommit is enabled and MAP_NORESERVE is also a "noop" on Linux, i.e. it specifies the mode that is enabled for all mappings anyway.

That's not true. On Linux, if you mmap without specifying MAP_NORESERVE, the mmap size will increase your process commit charge. What that means depends on the overcommit setting.
The default setting is a heuristic one where you are allowed some overcharging but there is a limit. My experience is that this mode gives us about 150% of the overcharge limit; after that you get allocation failures.

> This also means that you can drop the MAP_NORESERVE flag from uncommit and it will work anyway, on Linux and macOS.
>
> What definition of MAP_NORESERVE do you use?
>
> > E.g. mmap(MAP_NORESERVE, PROT_NONE) to uncommit. No documentation says this releases memory. But from experience we see it work on Linux - the kernel takes the hint that the pages are not needed anymore.
>
> It's not a hint. From https://man7.org/linux/man-pages/man2/mmap.2.html:
>
> > If the memory region specified by addr and len
> > overlaps pages of any existing mapping(s), then the overlapped
> > part of the existing mapping(s) _will be discarded._
>
> So one of uncommit's tasks is to release the memory.
>
> > If MAP_NORESERVE is a noop on Mac, the only other effect those mmap calls have is the protection change: in uncommit, the mmap(MAP_NORESERVE|MAP_FIXED, PROT_NONE) would be equivalent to a mprotect(PROT_NONE).
>
> mmap(MAP_NORESERVE, ...) is equal to mmap(...), but it does not mean that mmap(MAP_NORESERVE) is noop.
>
> Also, after you do mprotect(NONE), you can do mprotect(RWX) and get back the original content. But after you do mmap(FIXED) the memory content is discarded, you cannot reverse this operation.

Oh, you are right! I had this completely wrong. The reclaim effect is not caused by the fact that we specify MAP_NORESERVE. That's incidental. It comes from us simply replacing the old mapping with a blank new one and thereby discarding the old mapping? So MAP_NORESERVE may not matter, but mmap is still needed. Strictly speaking, probably not even PROT_NONE would matter to have a reclaim effect. You could just map the new mapping with full rights and the memory would still be reclaimed and stay that way until the next time you touch it.
Then commit could be a no-op (this is how things work on AIX).

Okay, I think I get closer to understanding the problem:

- on reserve we create mapping 1 with MAP_JIT
- on uncommit, we replace the mapping with mapping 2, discarding mapping 1. Mapping 2 does not need MAP_JIT. If we never recommit, that's fine.
- but then, on re-commit, we replace mapping 2 with mapping 3. Mapping 3 again needs MAP_JIT. But since this needs MAP_FIXED, MAP_JIT will not work.

In other words, once a mapping with MAP_JIT is established, we must never replace it, since any subsequent mmap calls would need to be established with MAP_FIXED. Sigh. My only remaining question is: is there really an observable difference between replacing the mapping with mmap and calling madvise(MADV_FREE)? And if there is, does it matter in practice? I wonder if the perceived difference between madvise(MADV_FREE) and mmap() is just a display problem. Seems the kernel is lazy about reclaiming that memory - but that is fine and makes sense performance-wise, it just throws off statistics. By that logic, using mmap() to second-guess what the kernel does could be inferior to using madvise. I found this: https://stackoverflow.com/questions/7718964/how-can-i-force-macos-to-release-madv-freed-pages. One remark recommends MADV_FREE_REUSABLE to deal with the display problem; could that be a solution (still aiming for using madvise() for all, not just executable, mappings, thereby removing the need to pass exec to commit/uncommit).
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Sat Dec 5 10:56:13 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 5 Dec 2020 10:56:13 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> Message-ID: <0Rl1rgfPK8tQJ9KPwMTTTqlN_GjyxjIwBSUXtHIvUyo=.11610ef7-95f8-4d8d-872a-f38960d320ff@github.com> On Sat, 5 Dec 2020 05:34:57 GMT, Thomas Stuefe wrote:

> I found this: https://stackoverflow.com/questions/7718964/how-can-i-force-macos-to-release-madv-freed-pages. One remark recommends MADV_FREE_REUSABLE to deal with the display problem; could that be a solution

I'd found MADV_FREE_REUSABLE as well. One problem is that it's barely documented. The only description from the vendor I could find was

```
#define MADV_FREE               5    /* pages unneeded, discard contents */
#define MADV_ZERO_WIRED_PAGES   6    /* zero the wired pages that have not been unwired before the entry is deleted */
#define MADV_FREE_REUSABLE      7    /* pages can be reused (by anyone) */
#define MADV_FREE_REUSE         8    /* caller wants to reuse those pages */
```

The other problem: it cannot substitute mmap completely, see below.

> My only remaining question is: is there really an observable difference between replacing the mapping with mmap and calling madvise(MADV_FREE)? And if there is, does it matter in practice?

Yes, there is. For a sample program with uncommit implemented in different ways, mmap is the only way to reduce the occupied memory size in Activity Monitor (the system GUI application users will likely look at).
* no uncommit

      ./test noop
      do not uncommit
      Physical footprint:         512.3M
      Physical footprint (peak):  512.3M
      VM_ALLOCATE  109951000-111951000 [128.0M 128.0M 128.0M     0K] rw-/rwx SM=COW
      VM_ALLOCATE  111951000-119951000 [128.0M 128.0M 128.0M     0K] rw-/rwx SM=COW
      VM_ALLOCATE  119951000-121951000 [128.0M 128.0M 128.0M     0K] rw-/rwx SM=COW
      VM_ALLOCATE  121951000-129951000 [128.0M 128.0M 128.0M     0K] rw-/rwx SM=COW
                    VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
      REGION TYPE      SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
      ===========   ======= ========    =====  ======= ========   ======    =====  =======
      VM_ALLOCATE    512.0M   512.0M   512.0M       0K       0K       0K       0K        4

* MADV_FREE reduces Dirty size, but does not affect Rss and Physical footprint

      ./test madv_free
      madvise
      Physical footprint:         512.3M
      Physical footprint (peak):  512.3M
      VM_ALLOCATE  108269000-110269000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  110269000-118269000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  118269000-120269000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  120269000-128269000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
                    VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
      REGION TYPE      SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
      ===========   ======= ========    =====  ======= ========   ======    =====  =======
      VM_ALLOCATE    512.0M   512.0M       0K       0K       0K       0K       0K        4

* MADV_FREE_REUSABLE reduces Physical footprint (whatever it is)

      ./test madv_free_reuse
      madvise reuse
      Physical footprint:         292K
      Physical footprint (peak):  512.3M
      VM_ALLOCATE  10d568000-115568000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  115568000-11d568000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  11d568000-125568000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
      VM_ALLOCATE  125568000-12d568000 [128.0M 128.0M     0K     0K] rw-/rwx SM=COW
                    VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
      REGION TYPE      SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
      ===========   ======= ========    =====  ======= ========   ======    =====  =======
      VM_ALLOCATE    512.0M   512.0M       0K       0K       0K       0K       0K        4

There is a problem that Activity Monitor apparently looks at Rss and reports no change:

![2020-12-05-130210_956x42_scrot](https://user-images.githubusercontent.com/919084/101240204-d51d6a80-36fe-11eb-806b-30c06ad90a81.png)

* and mmap reduces Rss as well

      ./test mmap
      new mmap
      Physical footprint:         292K
      Physical footprint (peak):  512.3M
      VM_ALLOCATE  10df01000-12df01000 [512.0M     0K     0K     0K] ---/rwx SM=NUL
                    VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
      REGION TYPE      SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
      ===========   ======= ========    =====  ======= ========   ======    =====  =======
      VM_ALLOCATE    512.0M       0K       0K       0K       0K       0K       0K

Activity Monitor (results are OK now):

![2020-12-05-131429_973x36_scrot](https://user-images.githubusercontent.com/919084/101240215-ea929480-36fe-11eb-858a-cef8a0f3327e.png)

So, when possible, we should do new mmap for uncommit.

---

Source code for the test:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main(int argc, char *argv[]) {
        char r;
        int pagesize = 4096;
        int size = 512 * 1024 * 1024;
        char *a = mmap(NULL, size, PROT_NONE, MAP_ANON | MAP_PRIVATE, -1, 0);
        if (a == MAP_FAILED) {
            perror("mmap()");
            return 1;
        }
        if (mprotect(a, size, PROT_READ | PROT_WRITE)) {
            perror("mprotect");
            return 1;
        }
        for (int i = 0; i < size; i += 4096) {
            a[i] = 1;
        }
        if (!strcmp(argv[1], "madv_free")) {
            printf("madvise\n");
            if (mprotect(a, size, PROT_NONE)) {
                perror("mprotect");
                return 1;
            }
            if (madvise(a, size, MADV_FREE)) {
                perror("madvise");
                return 1;
            }
        } else if (!strcmp(argv[1], "madv_free_reuse")) {
            printf("madvise reuse\n");
            if (mprotect(a, size, PROT_NONE)) {
                perror("mprotect");
                return 1;
            }
            if (madvise(a, size, MADV_FREE_REUSABLE)) {
                perror("madvise");
                return 1;
            }
        } else if (!strcmp(argv[1], "mmap")) {
            printf("new mmap\n");
            if (MAP_FAILED == mmap(a, size, PROT_NONE, MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0)) {
                perror("mmap2");
                return 1;
            }
        } else {
            printf("do not uncommit\n");
        }
        fflush(stdout);
        char cmd[128];
        snprintf(cmd, sizeof(cmd), "vmmap %d | awk '/Phys/ || /VM_ALLOCATE/'", getpid());
        system(cmd);
        read(0, &r, 1);
        return 0;
    }

------------- PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Sat Dec 5 13:26:14 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 5 Dec 2020 13:26:14 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: <0Rl1rgfPK8tQJ9KPwMTTTqlN_GjyxjIwBSUXtHIvUyo=.11610ef7-95f8-4d8d-872a-f38960d320ff@github.com> References: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> <0Rl1rgfPK8tQJ9KPwMTTTqlN_GjyxjIwBSUXtHIvUyo=.11610ef7-95f8-4d8d-872a-f38960d320ff@github.com> Message-ID: On Sat, 5 Dec 2020 10:52:03 GMT, Anton Kozlov wrote:

> > I found this: https://stackoverflow.com/questions/7718964/how-can-i-force-macos-to-release-madv-freed-pages. One remark recommends MADV_FREE_REUSABLE to deal with the display problem; could that be a solution
>
> I'd found MADV_FREE_REUSABLE as well. One problem is that it's barely documented. The only description from the vendor I could find was
>
> ```
> #define MADV_FREE               5    /* pages unneeded, discard contents */
> #define MADV_ZERO_WIRED_PAGES   6    /* zero the wired pages that have not been unwired before the entry is deleted */
> #define MADV_FREE_REUSABLE      7    /* pages can be reused (by anyone) */
> #define MADV_FREE_REUSE         8    /* caller wants to reuse those pages */
> ```
>
> The other problem: it cannot substitute mmap completely, see below.
>
> > My only remaining question is: is there really an observable difference between replacing the mapping with mmap and calling madvise(MADV_FREE)? And if there is, does it matter in practice?
>
> Yes, there is. For a sample program with uncommit implemented in different ways, mmap is the only way to reduce the occupied memory size in Activity Monitor (the system GUI application users will likely look at).

Okay, I see. Thanks for these tests, they are valuable.
My one remaining doubt would be if the numbers were different in the face of memory pressure. But I don't like to block this PR anymore, I caused enough work and discussions. So I am fine with the general thrust of the change: - add exec to reserve and uncommit - with the contract being that the exec parameter handed in with commit and uncommit has to match the one used with reserve. Maybe we can have future improvements with these interfaces and reduce the complexity again (e.g. having an opaque handle structure holding mapping creation information). Is the current version review-worthy? Thanks a lot for your patience, ..Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Sat Dec 5 14:55:12 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Sat, 5 Dec 2020 14:55:12 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: References: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> <0Rl1rgfPK8tQJ9KPwMTTTqlN_GjyxjIwBSUXtHIvUyo=.11610ef7-95f8-4d8d-872a-f38960d320ff@github.com> Message-ID: <3tNI7G1GOXjH1xIJQoGswrg3DC63zq6FE3_wSnhAd4Y=.952df04c-71d9-48a0-aff2-7c2d64dbfeda@github.com> On Sat, 5 Dec 2020 13:23:26 GMT, Thomas Stuefe wrote: > So I am fine with the general thrust of the change: > * add exec to reserve and uncommit > * with the contract being that the exec parameter handed in with commit and uncommit has to match the one used with reserve. The latest version implements this approach. It's ready for review. 
Thanks, Anton ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From mgronlun at openjdk.java.net Sun Dec 6 13:37:03 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 6 Dec 2020 13:37:03 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v2] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: General ObjectAllocationSample event definition ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/dba878aa..196d254d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=00-01 Stats: 23 lines in 2 files changed: 0 ins; 18 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From sjohanss at openjdk.java.net Sun Dec 6 16:02:13 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Sun, 6 Dec 2020 16:02:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: <42WTAHqNoLjc1ycTfLeDZr9pSjwx17sNYcYW_6y4gNQ=.900a2927-ee29-4583-9761-4c69080793a8@github.com> Message-ID: On Sat, 5 Dec 2020 04:40:15 GMT, Jie Fu wrote: >> I think the change is good, but please add a test for this. >> >> E.g. vmTestbase/nsk/jvmti/Allocate/alloc001/alloc001.java shows how to run a command with an ulimit prepended. > > Hi @tschatzl , > > Are you OK with the latest change? > Thanks.
Didn't see any problems with the test in the testing environment, so I'm good. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From cgracie at openjdk.java.net Sun Dec 6 17:45:19 2020 From: cgracie at openjdk.java.net (Charlie Gracie) Date: Sun, 6 Dec 2020 17:45:19 GMT Subject: RFR: 8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures Message-ID: Bursts of short lived Humongous object allocations can cause GCs to be initiated with 0 free regions. When these GCs happen they take significantly longer to complete. No objects are evacuated so there is a large amount of time spent in reversing self forwarded pointers and the only memory recovered is from the short lived humongous objects. My proposal is to add a check to the slow allocation path which will force a GC to happen if the number of free regions drops below the amount that would be required to complete the GC if it happened at that moment. The threshold will be based on the survival rates from Eden and survivor spaces along with the space required for Tenure space evacuations. The goal is to resolve the issue with bursts of short lived humongous objects without impacting other workloads negatively. I would appreciate reviews and any feedback that you might have. Thanks. 
Here are the links to the threads on the mailing list where I initially discussed the issue and my idea to resolve it: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-November/032189.html https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-December/032677.html ------------- Commit messages: - Improve G1GC accounting for humongous objects Changes: https://git.openjdk.java.net/jdk/pull/1650/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1650&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257774 Stats: 129 lines in 7 files changed: 96 ins; 10 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/1650.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1650/head:pull/1650 PR: https://git.openjdk.java.net/jdk/pull/1650 From mgronlun at openjdk.java.net Sun Dec 6 21:03:05 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 6 Dec 2020 21:03:05 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v3] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue.
> > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: ObjectAllocationSample event definition with weight ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/196d254d..6918f0c8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=01-02 Stats: 30 lines in 6 files changed: 3 ins; 11 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From zgu at openjdk.java.net Mon Dec 7 01:12:18 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 01:12:18 GMT Subject: RFR: 8257793: Shenandoah: SATB barrier should only filter out already strongly marked oops Message-ID: The SATB barrier intercepts oops for later marking, and those oops will be marked as strongly reachable. So it can only filter out oops that are already strongly marked, not oops that are only weakly marked.
- [x] hotspot_gc_shenandoah - [x] nightly pipeline ------------- Commit messages: - Merge branch 'master' into JDK-8257793-satb-filter - JDK-8257793 Changes: https://git.openjdk.java.net/jdk/pull/1655/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1655&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257793 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1655.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1655/head:pull/1655 PR: https://git.openjdk.java.net/jdk/pull/1655 From david.holmes at oracle.com Mon Dec 7 04:59:04 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 14:59:04 +1000 Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <74bdc286-64be-1da1-e5df-a894f7912aff@oracle.com> Hi Per, On 3/12/2020 11:19 pm, Per Liden wrote: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. > > However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended."
>> >> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). I think we can quite reasonably infer that "suspending a VM" means calling VirtualMachine.suspend to suspend all the threads of the target VM. > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. You can imagine though that 25 years ago it was not an unreasonable assumption that GC only runs in response to (failed) allocation requests from running application threads - i.e. that it is a synchronous response to application code execution. Hence all threads suspended implies no allocation and thus no GC. (Someone can correct me if I'm wrong, but way way back didn't running in JDI debug mode force use of SerialGC?) I'm somewhat surprised that it has taken this long to discover that our GCs are no longer operating in a way that JDI requires them to. > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`.
That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. I assume that the GC folk would be horrified if I were to suggest a global flag to enable/disable GC? ;-) Doing what is suggested sounds reasonable, from a functional perspective, to get the desired effect of not collecting any objects of interest. But I do have to wonder how many objects we are typically dealing with and what the performance impact of this might be if we have to iterate through all the objects? Thanks, David ----- > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't.
> > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > ------------- > > Commit messages: > - 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException > > Changes: https://git.openjdk.java.net/jdk/pull/1595/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8255987 > Stats: 161 lines in 8 files changed: 132 ins; 0 del; 29 mod > Patch: https://git.openjdk.java.net/jdk/pull/1595.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/1595/head:pull/1595 > > PR: https://git.openjdk.java.net/jdk/pull/1595 > From dholmes at openjdk.java.net Mon Dec 7 06:07:14 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 06:07:14 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 12:55:04 GMT, Per Liden wrote: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed.
> > However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >> >> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). > > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. > > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects.
These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GCs, both in mach5 and locally. Overall seems okay. Some comments on tests as I think the existing test logic is quite confused in places. Thanks, David src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 586: > 584: jobject strongRef; > 585: > 586: strongRef = strengthenNode(env, node); This can just be one line. src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 629: > 627: jweak weakRef; > 628: > 629: weakRef = weakenNode(env, node); Again this can be a single line. src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 1560: > 1558: * garbage collected while the VM is suspended. > 1559: */ > 1560: commonRef_pinAll(); Can we have multiple VM.suspend calls? The suspendAllCount seems to suggest that.
In which case shouldn't we only pin on the 0->1 transition, and only unpin on the 1->0 transition? ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From dholmes at openjdk.java.net Mon Dec 7 06:07:17 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 06:07:17 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <0dKZv-rhWELcC1ig9H32zb_irP_ksIehm-dlY3njJP4=.94a95183-12f5-4443-8cc4-223a66111699@github.com> On Fri, 4 Dec 2020 20:12:11 GMT, Chris Plummer wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." 
>>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. 
>> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java line 194: > >> 192: debuggee.resume(); >> 193: checkDebugeeAnswer_instances(className, baseInstances); >> 194: debuggee.suspend(); > > Before the changes in this PR, what was triggering the (expected) collection of the objects? These changes aren't making sense to me - but then this test is not making much sense to me either. The testArrayType logic is quite different to testClassType and now seems invalid. It suspends the VM, then calls disableCollection on all the object refs of interest, then later calls enableCollection and then resumes the VM. The calls to disableCollection/enableCollection seem pointless if GC is disabled while the VM is suspended. I suspect this was added because VM suspension was not in fact stopping the GC. The testClassType test is doing what? I can't tell what it expects to be checking with checkDebugeeAnswer_instances, but there's no VM suspension (presently) and no disableCollection calls. ??? 
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From dholmes at openjdk.java.net Mon Dec 7 06:07:15 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 06:07:15 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: <-2Yx99rM6jO7OHIzIlaHfdeojwgwwl7QthEqINqxiu4=.ae8922de-ea23-47c6-adff-3618cecd7eaf@github.com> References: <-2Yx99rM6jO7OHIzIlaHfdeojwgwwl7QthEqINqxiu4=.ae8922de-ea23-47c6-adff-3618cecd7eaf@github.com> Message-ID: <_mg_tfWiVdggsrpKEDFeFAR1-22yUsy-tTe0foBDdC4=.191dc88a-df48-4127-805d-62fba59d2750@github.com> On Fri, 4 Dec 2020 21:22:53 GMT, Chris Plummer wrote: >> src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 632: >> >>> 630: if (weakRef == NULL) { >>> 631: EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef"); >>> 632: } >> >> I'm not so sure I agree that having a fatal error here is the right thing to do. The only other user of `weakenNode()` is `ObjectReference.disableCollection()`. It returns an error to the debugger if `weakenNode()` returns `NULL`. However, I'm not so sure that's a good thing to do here either, since it means the `VM.resume()` will need to fail. Possibly the error should just be ignored, and we live with the ref staying strong. > Another option is to save away the weakref in the node when strengthening. This would benefit `ObjectReference.disableCollection()` also, since it would no longer need to deal with a potential OOM. However, I'm not so sure it's actually worth doing. Trying to keep the debug session alive while having allocation errors is probably a fool's errand. I agree a fatal error here seems excessive. Simply maintaining the strong ref seems reasonable.
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From cjplummer at openjdk.java.net Mon Dec 7 06:30:15 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Mon, 7 Dec 2020 06:30:15 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 05:18:12 GMT, David Holmes wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). 
>> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. 
>> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 1560: > >> 1558: * garbage collected while the VM is suspended. >> 1559: */ >> 1560: commonRef_pinAll(); > > Can we have multiple VM.suspend calls? The suspendAllCount seems to suggest that. In which case shouldn't we only pin on the 0->1 transition, and only unpin on the 1->0 transition? That was something I pointed out in the pre-review, and it has been addressed in `commonRef_pinAll/unpinAll`: `568 if (gdata->pinAllCount == 1) {` `618 if (gdata->pinAllCount == 0) {` ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From david.holmes at oracle.com Mon Dec 7 07:04:47 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 17:04:47 +1000 Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <0eb7161b-278d-53f9-5072-9dd39273c31f@oracle.com> On 7/12/2020 4:30 pm, Chris Plummer wrote: > On Mon, 7 Dec 2020 05:18:12 GMT, David Holmes wrote: >>> 1558: * garbage collected while the VM is suspended. >>> 1559: */ >>> 1560: commonRef_pinAll(); >> >> Can we have multiple VM.suspend calls? The suspendAllCount seems to suggest that. In which case shouldn't we only pin on the 0->1 transition, and only unpin on the 1->0 transition? > > That was something I pointed out in the pre-review, and it has been addressed in `commonRef_pinAll/unpinAll`: > > `568 if (gdata->pinAllCount == 1) {` > `618 if (gdata->pinAllCount == 0) {` Okay. 
I would not have handled it at that level; I would have had pinAll/unpinAll operate unconditionally, with the calls to those methods being conditional on the suspendAllCount. David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1595 > From shade at openjdk.java.net Mon Dec 7 07:18:14 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 7 Dec 2020 07:18:14 GMT Subject: RFR: 8257793: Shenandoah: SATB barrier should only filter out already strongly marked oops In-Reply-To: References: Message-ID: <7fauCvSrD1qRn-H__nBpUlftaeTwJMVm89cf3aSO_M4=.3179a304-f0ef-4cc3-949e-48cc9fda4dd0@github.com> On Mon, 7 Dec 2020 01:07:25 GMT, Zhengyu Gu wrote: > The SATB barrier intercepts oops for later marking, and those oops will be marked as strongly reachable. Therefore, it can only filter out oops that are already strongly marked, not oops that are only weakly marked. > > - [x] hotspot_gc_shenandoah > - [x] nightly pipeline Awww. That looks obvious in hindsight. Looks good! ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1655 From cjplummer at openjdk.java.net Mon Dec 7 07:44:14 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Mon, 7 Dec 2020 07:44:14 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 06:27:20 GMT, Chris Plummer wrote: >> src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 1560: >> >>> 1558: * garbage collected while the VM is suspended. >>> 1559: */ >>> 1560: commonRef_pinAll(); >> >> Can we have multiple VM.suspend calls? The suspendAllCount seems to suggest that. In which case shouldn't we only pin on the 0->1 transition, and only unpin on the 1->0 transition? > > That was something I pointed out in the pre-review, and it has been addressed in `commonRef_pinAll/unpinAll`: > > `568 if (gdata->pinAllCount == 1) {` > `618 if (gdata->pinAllCount == 0) {` > Okay.
I would not have handled it at that level; I would have had pinAll/unpinAll operate unconditionally, with the calls to those methods being conditional on the suspendAllCount. > >David Well, that's assuming `pinAll()` will only ever be used by `suspendAll()`. One could imagine a future use, such as if `VirtualMachine.disableCollection()` were ever to be added. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From tschatzl at openjdk.java.net Mon Dec 7 09:24:15 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 7 Dec 2020 09:24:15 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 12:31:29 GMT, Jie Fu wrote: >> Hi all, >> >> Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. >> For example, this assert [1] fired on our testing boxes. >> >> It can be reproduced by the following two steps on Linux-64: >> 1) ulimit -v 8388608 >> 2) java -XX:MinHeapSize=5g -version >> The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. >> >> One more important fact is that this bug can be more common on Linux-32 systems. >> Since the virtual memory is limited to 3800M [3] on Linux-32, it can always be reproduced when MinHeapSize > 1900M. >> >> Testing: >> - tier1 ~ tier3 on Linux/x64 >> >> Thanks. >> Best regards, >> Jie >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 >> [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Check the exit status Marked as reviewed by tschatzl (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Mon Dec 7 09:30:13 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 09:30:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Check the exit status > > This looks better. I'll run it through our testing env to make sure it passes there as well. Thanks @kstefanj and @tschatzl for your review and help. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Mon Dec 7 09:30:17 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 09:30:17 GMT Subject: Integrated: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes In-Reply-To: References: Message-ID: <4Vg0448OBEokoCLHwbq4_z96jtvgemeBIxaboZJUyv0=.2689c0b6-12aa-41cf-99ed-1b6110fae075@github.com> On Sat, 28 Nov 2020 13:08:38 GMT, Jie Fu wrote: > Hi all, > > Ergonomics for InitialHeapSize can be broken if the memory resource is limited by the administrator. > For example, this assert [1] fired on our testing boxes. > > It can be reproduced by the following two steps on Linux-64: > 1) ulimit -v 8388608 > 2) java -XX:MinHeapSize=5g -version > The reason was that limit_by_allocatable_memory() [2] returns a value less than MinHeapSize. > > One more important fact is that this bug can be more common on Linux-32 systems. > Since the virtual memory is limited to 3800M [3] on Linux-32, it can always be reproduced when MinHeapSize > 1900M. > > Testing: > - tier1 ~ tier3 on Linux/x64 > > Thanks.
> Best regards, > Jie > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shared/gcArguments.cpp#L96 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L1907 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/posix/os_posix.cpp#L567 This pull request has now been integrated. Changeset: 7620124e Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/7620124e Stats: 62 lines in 2 files changed: 60 ins; 2 del; 0 mod 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes Reviewed-by: tschatzl, sjohanss ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From stuefe at openjdk.java.net Mon Dec 7 10:40:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 7 Dec 2020 10:40:13 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 00:07:12 GMT, Marcus G K Williams wrote: >> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using >> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). >> >> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). >> >> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
> > Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: > > Remove remnant UseSHM change > > Signed-off-by: Marcus G K Williams Hi Marcus, I generally like this patch. I will do a more thorough review later. But could this please wait until after JDK16 has been forked off, since I would like this to spend some more time cooking on our more exotic Linuxes. Cheers, Thomas src/hotspot/os/linux/os_linux.cpp line 3743: > 3741: // The kernel is using kB, hotspot uses bytes > 3742: if (page_size * K > (size_t)Linux::page_size()) { > 3743: if (!os::page_sizes().is_set(page_size * K)) { `is_set` is not needed, just call `add`.
However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. 
>> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. 
> > test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java line 194: > >> 192: debuggee.resume(); >> 193: checkDebugeeAnswer_instances(className, baseInstances); >> 194: debuggee.suspend(); > > Before the changes in this PR, what was triggering the (expected) collection of the objects? @plummercj Nothing was explicitly triggering collection of these objects. However, the test is explicitly checking the number of objects "reachable for the purposes of garbage collection" in `checkDebugeeAnswer_instances()`. The test sets up a breakpoint (with SUSPEND_ALL), which suspends the VM. Then it creates a number of new instances and expects these to be weakly reachable. However, with this change, suspending the VM will make all objects "reachable for the purposes of garbage collection". So, to let the test continue to create objects which are weakly reachable, we need to first resume the VM, create the new instances, and then suspend it again. @dholmes-ora I have no idea why these tests are so different. The VM suspend is implicit in the breakpoint in this test, which is set up using SUSPEND_ALL. > test/hotspot/jtreg/vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java line 85: > >> 83: array.disableCollection(); >> 84: } catch (ObjectCollectedException e) { >> 85: continue; > > Maybe add a comment: "Since the VM is not suspended, the object may have been collected before disableCollection() could be called on it. Just ignore and continue doing allocations until we run out of memory." Sounds good, will fix.
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Mon Dec 7 11:00:15 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 7 Dec 2020 11:00:15 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: <_mg_tfWiVdggsrpKEDFeFAR1-22yUsy-tTe0foBDdC4=.191dc88a-df48-4127-805d-62fba59d2750@github.com> References: <-2Yx99rM6jO7OHIzIlaHfdeojwgwwl7QthEqINqxiu4=.ae8922de-ea23-47c6-adff-3618cecd7eaf@github.com> <_mg_tfWiVdggsrpKEDFeFAR1-22yUsy-tTe0foBDdC4=.191dc88a-df48-4127-805d-62fba59d2750@github.com> Message-ID: On Mon, 7 Dec 2020 05:14:36 GMT, David Holmes wrote: >> Another option is to save away the weakref in the node when strengthening. This would benefit `ObjectReference.disableCollection()` also, since it would no longer need to deal with a potential OOM. However, I'm not so sure it's actually worth doing. Trying to keep the debug session alive while having allocation errors is probably a fool's errand. > > I agree a fatal error here seems excessive. Simply maintaining the strong ref seems reasonable. I was trying to mimic what we already do in `strengthenNode()`, i.e. it's a fatal error if we can't create a JNI ref. Here:

    strongRef = JNI_FUNC_PTR(env,NewGlobalRef)(env, node->ref);

    /*
     * NewGlobalRef on a weak ref will return NULL if the weak
     * reference has been collected or if out of memory.
     * It never throws OOM.
     * We need to distinguish those two occurrences.
     */
    if ((strongRef == NULL) && !isSameObject(env, node->ref, NULL)) {
        EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewGlobalRef");
    }

So it seems appropriate to do the same thing if we fail to create a JNI weak ref. Also, as @plummercj mentioned, if we can't create a JNI ref, continuing the debug session seems rather pointless as we're about to go down anyway (the next allocation in the JVM will be fatal).
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Mon Dec 7 11:12:20 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 7 Dec 2020 11:12:20 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 05:10:34 GMT, David Holmes wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). 
>> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. 
>> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 586: > >> 584: jobject strongRef; >> 585: >> 586: strongRef = strengthenNode(env, node); > > This can just be one line. I was actually trying carefully to follow the coding style currently used in this file/library. If you have a quick look at this file you'll see the pattern above in multiple places, whereas combined declaration+assignment style isn't used. So while I personally agree about this style question, I also think following the style already present in a file has precedence over introducing a new style. Don't you agree? ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Mon Dec 7 11:22:30 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 7 Dec 2020 11:22:30 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. 
> > However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >> >> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). > > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. > > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. 
These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. 
Per Liden has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1595/files - new: https://git.openjdk.java.net/jdk/pull/1595/files/5b32d271..8fe1e52d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=00-01 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1595.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1595/head:pull/1595 PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Mon Dec 7 11:22:31 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 7 Dec 2020 11:22:31 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 07:41:46 GMT, Chris Plummer wrote: >> That was something I pointed out in the pre-review, and it has been addressed in `commonRef_pinAll/unpinAll`: >> >> `568 if (gdata->pinAllCount == 1) {` >> `618 if (gdata->pinAllCount == 0) {` > >> Okay. I would not have handled it at that level, but would have had > pinAll/unpinAll operate unconditionally, but the calls to those methods > being conditional based on the suspendAllCount. >> >>David > > Well, that's assuming `pinAll()` will only ever be used by `suspendAll()`. One could imagine a future use, such as if `VirtualMachine.disableCollection()` were ever to be added. I was also thinking `pinAll()/unpinAll()` should stand on their own, and not implicitly depend/rely on `suspendAllCount`. As @plummercj says, one could imagine we want to use these functions in other contexts in the future. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From rkennke at openjdk.java.net Mon Dec 7 11:47:20 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 7 Dec 2020 11:47:20 GMT Subject: RFR: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB Message-ID: The weak-LRB code is currently subject to a race. Consider this sequence of events between a Java thread and GC threads: During conc-weak-root-in-progress: - Java: Load referent out of Reference, it is unreachable but not-yet-cleared - GC: Clears referent - GC: Concurrently turn off conc-weak-root-in-progress - Java: Checks conc-weak-root-in-progress, sees that it's false, continues to use/evac it -> successfully resurrected unreachable object. This must not happen. AFAICT, this also affects conc-class-unloading and weak-roots. Proposed fix is to check for evac-in-progress instead. This should be acceptable because this is not a very common path and not very performance-sensitive. - [x] hotspot_gc_shenandoah ------------- Commit messages: - Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB Changes: https://git.openjdk.java.net/jdk/pull/1662/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1662&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257817 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1662.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1662/head:pull/1662 PR: https://git.openjdk.java.net/jdk/pull/1662 From david.holmes at oracle.com Mon Dec 7 11:52:47 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 21:52:47 +1000 Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <33061a53-2f7e-8b8c-6688-bc43aff2fb42@oracle.com> On 7/12/2020 5:44 pm, Chris Plummer wrote: > On Mon, 7 Dec 2020 06:27:20 GMT, Chris Plummer wrote: > >>> 
src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 1560: >>> >>>> 1558: * garbage collected while the VM is suspended. >>>> 1559: */ >>>> 1560: commonRef_pinAll(); >>> >>> Can we have multiple VM.suspend calls? The suspendAllCount seems to suggest that. In which case shouldn't we only pin on the 0->1 transition, and only unpin on the 1->0 transition? >> >> That was something I pointed out in the pre-review, and it has been addressed in `commonRef_pinAll/unpinAll`: >> >> `568 if (gdata->pinAllCount == 1) {` >> `618 if (gdata->pinAllCount == 0) {` > >> Okay. I would not have handled it at that level, but would have had > pinAll/unpinAll operate unconditionally, but the calls to those methods > being conditional based on the suspendAllCount. >> >> David > > Well, that's assuming `pinAll()` will only ever be used by `suspendAll()`. One could imagine a future use, such as if `VirtualMachine.disableCollection()` were ever to be added. Not really. I consider pinAll should pin-all as the name implies. The question of when to pin should be handled by the caller of pinAll. If there were ever to be a second reason to pinAll then you would have to decide what semantics that has: does it maintain a count, or is it like thread suspension. David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1595 > From david.holmes at oracle.com Mon Dec 7 12:04:06 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 22:04:06 +1000 Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: <32ef94ec-30e8-43a1-5f32-6d43a0dcde07@oracle.com> On 7/12/2020 7:30 pm, Jie Fu wrote: > On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote: > >>> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Check the exit status >> >> This looks better. 
I'll run it through our testing env to make sure it passes there as well. > > Thanks @kstefanj and @tschatzl for your review and help. We are seeing the new test crash on Aarch64 due to native OOM. https://bugs.openjdk.java.net/browse/JDK-8257820 Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1492 > From david.holmes at oracle.com Mon Dec 7 12:11:59 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 22:11:59 +1000 Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: <430200ec-5fc2-e76e-06bb-9248990aa19f@oracle.com> On 7/12/2020 9:12 pm, Per Liden wrote: > On Mon, 7 Dec 2020 05:10:34 GMT, David Holmes wrote: >>> 584: jobject strongRef; >>> 585: >>> 586: strongRef = strengthenNode(env, node); >> >> This can just be one line. > > I was actually trying carefully to follow the coding style currently used in this file/library. If you have a quick look at this file you'll see the pattern above in multiple places, whereas combined declaration+assignment style isn't used. So while I personally agree about this style question, I also think following the style already present in a file has precedence over introducing a new style. Don't you agree? This file uses an archaic C-style, so while I agree it would be inappropriate to over modernise the new code, this particular example stuck out because even in archaic C there is no reason to split this onto two lines. I didn't go looking to see if this mimicked existing code. :) Keep it or change it as you see fit. 
Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1595 > From jiefu at openjdk.java.net Mon Dec 7 12:15:14 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 12:15:14 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 09:27:02 GMT, Jie Fu wrote: >> This looks better. I'll run it through our testing env to make sure it passes there as well. > > Thanks @kstefanj and @tschatzl for your review and help. > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ > > On 7/12/2020 7:30 pm, Jie Fu wrote: > > > On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote: > > > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > > > Check the exit status > > > > > > > > > This looks better. I'll run it through our testing env to make sure it passes there as well. > > > > > > Thanks @kstefanj and @tschatzl for your review and help. > > We are seeing the new test crash on Aarch64 due to native OOM. > > https://bugs.openjdk.java.net/browse/JDK-8257820 > > Cheers, > David Is it always reproducible? I'll try to find an aarch64 machine to reproduce it. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From sjohanss at openjdk.java.net Mon Dec 7 12:20:13 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Mon, 7 Dec 2020 12:20:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 12:12:35 GMT, Jie Fu wrote: >> Thanks @kstefanj and @tschatzl for your review and help. 
> >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ >> >> On 7/12/2020 7:30 pm, Jie Fu wrote: >> >> > On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote: >> > > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> > > > Check the exit status >> > > >> > > >> > > This looks better. I'll run it through our testing env to make sure it passes there as well. >> > >> > >> > Thanks @kstefanj and @tschatzl for your review and help. >> >> We are seeing the new test crash on Aarch64 due to native OOM. >> >> https://bugs.openjdk.java.net/browse/JDK-8257820 >> >> Cheers, >> David > > Is it always reproducible? > I'll try to find an aarch64 machine to reproduce it. > Thanks. I had hoped my testing would have caught this, but might be that the test is too brittle after all. I saw it pass on aarch64, so not happening every time. I would be good with removing the test as a fix. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From david.holmes at oracle.com Mon Dec 7 12:22:14 2020 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Dec 2020 22:22:14 +1000 Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: On 7/12/2020 10:15 pm, Jie Fu wrote: > On Mon, 7 Dec 2020 09:27:02 GMT, Jie Fu wrote: > >>> This looks better. I'll run it through our testing env to make sure it passes there as well. >> >> Thanks @kstefanj and @tschatzl for your review and help. 
> >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-runtime-dev](mailto:hotspot-runtime-dev at openjdk.java.net):_ >> >> On 7/12/2020 7:30 pm, Jie Fu wrote: >> >>> On Fri, 4 Dec 2020 12:37:42 GMT, Stefan Johansson wrote: >>>>> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >>>>> Check the exit status >>>> >>>> >>>> This looks better. I'll run it through our testing env to make sure it passes there as well. >>> >>> >>> Thanks @kstefanj and @tschatzl for your review and help. >> >> We are seeing the new test crash on Aarch64 due to native OOM. >> >> https://bugs.openjdk.java.net/browse/JDK-8257820 >> >> Cheers, >> David > > Is it always reproducible? No. It passed when this commit was integrated, but then failed in later test runs. David > I'll try to find an aarch64 machine to reproduce it. > Thanks. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/1492 > From jiefu at openjdk.java.net Mon Dec 7 12:27:13 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 12:27:13 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: References: Message-ID: <2zbxEKMoFBOqompGsQ1hV3CXHFquMXBlorQi8De3sZY=.6b2da533-8d51-4516-b3da-9f7fa6db426d@github.com> On Mon, 7 Dec 2020 12:17:53 GMT, Stefan Johansson wrote: > I had hoped my testing would have caught this, but might be that the test is too brittle after all. I saw it pass on aarch64, so not happening every time. > > I would be good with removing the test as a fix. OK. I'm fine to remove it and will do it soon. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From jiefu at openjdk.java.net Mon Dec 7 12:31:14 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 12:31:14 GMT Subject: RFR: 8257230: assert(InitialHeapSize >= MinHeapSize) failed: Ergonomics decided on incompatible initial and minimum heap sizes [v5] In-Reply-To: <2zbxEKMoFBOqompGsQ1hV3CXHFquMXBlorQi8De3sZY=.6b2da533-8d51-4516-b3da-9f7fa6db426d@github.com> References: <2zbxEKMoFBOqompGsQ1hV3CXHFquMXBlorQi8De3sZY=.6b2da533-8d51-4516-b3da-9f7fa6db426d@github.com> Message-ID: <9XNx3VgWz0DrcCM6v6vZgziL4t-JtW_-_8wOXQmtO9g=.2608c79a-23de-4064-80ec-a661d0f6170f@github.com> On Mon, 7 Dec 2020 12:24:52 GMT, Jie Fu wrote: > > I had hoped my testing would have caught this, but might be that the test is too brittle after all. I saw it pass on aarch64, so not happening every time. > > I would be good with removing the test as a fix. > > OK. > I'm fine to remove it and will do it soon. It has been assigned to @tschatzl. Thanks for fixing it. ------------- PR: https://git.openjdk.java.net/jdk/pull/1492 From tschatzl at openjdk.java.net Mon Dec 7 12:34:17 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 7 Dec 2020 12:34:17 GMT Subject: RFR: 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle Message-ID: Hi all, can I have reviews for this change that removes the gc/ergonomics/TestMinHeapSize.java test? We found that it is unstable, and with reasonable effort very likely can't be made to work: we have no idea how much (C heap) memory the rest of the VM will use, or will in the future, so there will likely always be a constant need to update it. Even during review there have already been some adaptations to make it work, until it "worked", but then it intermittently started failing in CI anyway. Testing: this is a trivial removal of a complete test/file. 
Thanks, Thomas ------------- Commit messages: - Initial commit, remove test Changes: https://git.openjdk.java.net/jdk/pull/1664/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1664&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257820 Stats: 59 lines in 1 file changed: 0 ins; 59 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1664.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1664/head:pull/1664 PR: https://git.openjdk.java.net/jdk/pull/1664 From jiefu at openjdk.java.net Mon Dec 7 12:43:14 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 7 Dec 2020 12:43:14 GMT Subject: RFR: 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 12:29:21 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that removes the gc/ergonomics/TestMinHeapSize.java test? > > We found that it is unstable, and with reasonable effort very likely can't be made to work: we have no idea how much (C heap) memory the rest of the VM will use, or will in the future, so there will likely always be a constant need to update it. Even during review there have already been some adaptations to make it work, until it "worked", but then it intermittently started failing in CI anyway. > > Testing: this is a trivial removal of a complete test/file. > > Thanks, > Thomas Thanks for fixing it. ------------- Marked as reviewed by jiefu (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/1664 From kbarrett at openjdk.java.net Mon Dec 7 12:43:14 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Dec 2020 12:43:14 GMT Subject: RFR: 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle In-Reply-To: References: Message-ID: <9NZnjcqnP7tyCAWZ6n6R6hsPvv5f5H6JhsiE5Bvx6sI=.b4f8ea14-2666-41aa-8cb4-adff0b3789cc@github.com> On Mon, 7 Dec 2020 12:29:21 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that removes the gc/ergonomics/TestMinHeapSize.java test? > > We found that it is unstable, and with reasonable effort very likely can't be made to work: we have no idea how much (C heap) memory the rest of the VM will use, or will in the future, so there will likely always be a constant need to update it. Even during review there have already been some adaptations to make it work, until it "worked", but then it intermittently started failing in CI anyway. > > Testing: this is a trivial removal of a complete test/file. > > Thanks, > Thomas Looks good, and trivial. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1664 From tschatzl at openjdk.java.net Mon Dec 7 12:48:10 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 7 Dec 2020 12:48:10 GMT Subject: RFR: 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle In-Reply-To: <9NZnjcqnP7tyCAWZ6n6R6hsPvv5f5H6JhsiE5Bvx6sI=.b4f8ea14-2666-41aa-8cb4-adff0b3789cc@github.com> References: <9NZnjcqnP7tyCAWZ6n6R6hsPvv5f5H6JhsiE5Bvx6sI=.b4f8ea14-2666-41aa-8cb4-adff0b3789cc@github.com> Message-ID: On Mon, 7 Dec 2020 12:41:13 GMT, Kim Barrett wrote: >> Hi all, >> >> can I have reviews for this change that removes the gc/ergonomics/TestMinHeapSize.java test? 
>> >> We found that it is unstable, and with reasonable effort very likely can't be made to work: we have no idea how much (C heap) memory the rest of the VM will use, or will in the future, so there will likely always be a constant need to update it. Even during review there have already been some adaptations to make it work, until it "worked", but then it intermittently started failing in CI anyway. >> >> Testing: this is a trivial removal of a complete test/file. >> >> Thanks, >> Thomas > > Looks good, and trivial. thanks @kimbarrett, @DamonFool for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/1664 From tschatzl at openjdk.java.net Mon Dec 7 12:48:12 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 7 Dec 2020 12:48:12 GMT Subject: Integrated: 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 12:29:21 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that removes the gc/ergonomics/TestMinHeapSize.java test? > > We found that it is unstable, and with reasonable effort very likely can't be made to work: we have no idea how much (C heap) memory the rest of the VM will use, or will in the future, so there will likely always be a constant need to update it. Even during review there have already been some adaptations to make it work, until it "worked", but then it intermittently started failing in CI anyway. > > Testing: this is a trivial removal of a complete test/file. > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: e08b9ed0 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk/commit/e08b9ed0 Stats: 59 lines in 1 file changed: 0 ins; 59 del; 0 mod 8257820: Remove gc/ergonomics/TestMinHeapSize.java as it is too brittle Reviewed-by: jiefu, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/1664 From pliden at openjdk.java.net Mon Dec 7 13:10:14 2020 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 7 Dec 2020 13:10:14 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 06:04:36 GMT, David Holmes wrote: >> Per Liden has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comment > > Overall seems okay. Some comments on tests as I think the existing test logic is quite confused in places. > > Thanks, > David > Not really. I consider pinAll should pin-all as the name implies. The question of when to pin should be handled by the caller of pinAll. If there were ever to be a second reason to pinAll then you would have to decide what semantics that has: does it maintain a count, or is it like thread suspension. I would say that would not be in the spirit of how the rest of this library is designed, with regards to nesting of calls. For example, `pin()/unpin()`, `suspend()/resume()`, `createNode()/deleteNode()`, etc. All these functions support nesting, so they might just up/down a counter, instead of doing exactly what their name implies. The new `pinAll()/unpinAll()` follow the same model, which, to me, feels like the natural thing to do here. 
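[Editorial note: the nesting model being discussed can be sketched as a reference count around a strengthen/weaken pair. This is a simplified, hypothetical sketch of the idea only; the real `commonRef_pinAll()`/`commonRef_unpinAll()` additionally take a lock and walk the reference table.]

```c
#include <assert.h>

/* Counted pin/unpin sketch: only the outermost pin strengthens the
 * references, and only the matching outermost unpin weakens them again,
 * so nested suspend()/resume() pairs compose naturally. */
static int pinAllCount = 0;

static void pinAll(void) {
    pinAllCount++;
    if (pinAllCount == 1) {
        /* first pin: convert weak JNI refs to strong ones */
    }
}

static void unpinAll(void) {
    assert(pinAllCount > 0);   /* must be paired with a pinAll() */
    pinAllCount--;
    if (pinAllCount == 0) {
        /* last unpin: convert the refs back to weak */
    }
}
```

The `== 1` and `== 0` checks mirror the `gdata->pinAllCount` tests quoted earlier in the thread.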
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From zgu at openjdk.java.net Mon Dec 7 13:21:13 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 13:21:13 GMT Subject: Integrated: 8257793: Shenandoah: SATB barrier should only filter out already strongly marked oops In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 01:07:25 GMT, Zhengyu Gu wrote: > SATB barrier intercepts oops for later marking, and those oops will be marked as strongly reachable. Therefore, it can only filter out oops that are already strongly marked, not those that are only weakly marked. > > - [x] hotspot_gc_shenandoah > - [x] nightly pipeline This pull request has now been integrated. Changeset: ecd7e476 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/ecd7e476 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8257793: Shenandoah: SATB barrier should only filter out already strongly marked oops Reviewed-by: shade, rkennke ------------- PR: https://git.openjdk.java.net/jdk/pull/1655 From zgu at openjdk.java.net Mon Dec 7 14:28:09 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 14:28:09 GMT Subject: RFR: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 11:37:36 GMT, Roman Kennke wrote: > The weak-LRB code is currently subject to a race. Consider this sequence of events between a Java thread and GC threads: > During conc-weak-root-in-progress: > - Java: Load referent out of Reference, it is unreachable but not-yet-cleared > - GC: Clears referent > - GC: Concurrently turn off conc-weak-root-in-progress > - Java: Checks conc-weak-root-in-progress, sees that it's false, continues to use/evac it -> successfully resurrected unreachable object. This must not happen. > > AFAICT, this also affects conc-class-unloading and weak-roots. > > Proposed fix is to check for evac-in-progress instead. 
This should be acceptable because this is not a very common path and not very performance-sensitive. > > - [x] hotspot_gc_shenandoah I wonder if it is a more general problem. Given that concurrent weak root processing is a mandatory phase, we should just remove the is_concurrent_weak_root_in_progress flag. ------------- PR: https://git.openjdk.java.net/jdk/pull/1662 From rkennke at openjdk.java.net Mon Dec 7 15:27:12 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 7 Dec 2020 15:27:12 GMT Subject: Withdrawn: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB In-Reply-To: References: Message-ID: <0QIDxMRlr8ESjPUpjXWH9sJuN981S3m4NQvmdbLz4Co=.520848ee-2998-4bb2-bdfb-63a5df18594d@github.com> On Mon, 7 Dec 2020 11:37:36 GMT, Roman Kennke wrote: > The weak-LRB code is currently subject to a race. Consider this sequence of events between a Java thread and GC threads: > During conc-weak-root-in-progress: > - Java: Load referent out of Reference, it is unreachable but not-yet-cleared > - GC: Clears referent > - GC: Concurrently turn off conc-weak-root-in-progress > - Java: Checks conc-weak-root-in-progress, sees that it's false, continues to use/evac it -> successfully resurrected unreachable object. This must not happen. > > AFAICT, this also affects conc-class-unloading and weak-roots. > > Proposed fix is to check for evac-in-progress instead. This should be acceptable because this is not a very common path and not very performance-sensitive. > > - [x] hotspot_gc_shenandoah This pull request has been closed without being integrated. 
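[Editorial note: the interleaving in the quoted report can be replayed deterministically in one thread. This is an illustrative model only (the names are invented, not Shenandoah code); it shows why checking the in-progress flag after the referent has already been loaded is too late.]

```c
#include <stdbool.h>

/* Deterministic replay of the race: the Java thread loads the referent
 * first and checks the flag last, so the GC's clear and flag flip in
 * between go unnoticed and the stale value is kept alive. */
static bool conc_weak_root_in_progress = true;
static bool referent_cleared = false;
static int  referent_value = 42;       /* unreachable, not yet cleared */

static int replay(bool *resurrected) {
    int loaded = referent_value;        /* Java: load referent   */
    referent_cleared = true;            /* GC: clear referent    */
    conc_weak_root_in_progress = false; /* GC: turn off the flag */
    /* Java: the flag now reads false, so the stale load is kept */
    *resurrected = referent_cleared && !conc_weak_root_in_progress;
    return loaded;
}
```

The model makes plain why the eventual fix gates the flag flip on a handshake rather than flipping it concurrently.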
------------- PR: https://git.openjdk.java.net/jdk/pull/1662 From zgu at openjdk.java.net Mon Dec 7 18:36:21 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 18:36:21 GMT Subject: RFR: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB Message-ID: After concurrent weak root processing, it should perform a handshake first to ensure there are no dirty loads (loads not yet processed by barriers) in Java threads, before it resets the conc-weak-root-in-progress flag. - [x] hotspot_gc_shenandoah ------------- Commit messages: - JDK-8257817 Changes: https://git.openjdk.java.net/jdk/pull/1673/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1673&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257817 Stats: 7 lines in 1 file changed: 4 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1673.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1673/head:pull/1673 PR: https://git.openjdk.java.net/jdk/pull/1673 From rkennke at openjdk.java.net Mon Dec 7 19:13:12 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 7 Dec 2020 19:13:12 GMT Subject: RFR: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 18:31:00 GMT, Zhengyu Gu wrote: > After concurrent weak root processing, it should perform a handshake first to ensure there are no dirty loads (loads not yet processed by barriers) in Java threads, before it resets the conc-weak-root-in-progress flag. > > - [x] hotspot_gc_shenandoah Looks good to me! Might want to see in a follow-up if binding conc-roots-flag to should_do_class_unloading() is sane. ------------- Marked as reviewed by rkennke (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1673 From zgu at openjdk.java.net Mon Dec 7 19:18:18 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 19:18:18 GMT Subject: RFR: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB In-Reply-To: References: Message-ID: <8yMqrSyWyTVgWkBuT4AabSrSJx1tVfgnFrLTapk8XZg=.fdf9ada6-1888-4a84-8b3d-d0977a075de8@github.com> On Mon, 7 Dec 2020 19:10:04 GMT, Roman Kennke wrote: > Looks good to me! > Might want to see in a follow-up if binding conc-roots-flag to should_do_class_unloading() is sane. Thanks for reviewing. There is already a RFE (JDK-8255837) to clean can/should_do_xxx up, as part of ongoing refactoring. ------------- PR: https://git.openjdk.java.net/jdk/pull/1673 From zgu at openjdk.java.net Mon Dec 7 19:22:14 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 7 Dec 2020 19:22:14 GMT Subject: Integrated: 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 18:31:00 GMT, Zhengyu Gu wrote: > After concurrent weak root processing, it should perform handshake first to ensure there are no dirty loads (loads have yet processed by barriers) in Java thread, before it resets conc-weak-root-in-progress flag. > > - [x] hotspot_gc_shenandoah This pull request has now been integrated. 
Changeset: 395b6bde Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/395b6bde Stats: 7 lines in 1 file changed: 4 ins; 3 del; 0 mod 8257817: Shenandoah: Don't race with conc-weak-in-progress flag in weak-LRB Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/jdk/pull/1673 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 19:48:22 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 19:48:22 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v6] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Thomas S. 
Feedback Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - Remove remnant UseSHM change Signed-off-by: Marcus G K Williams - Adress Comments, Rework changes for PagesizeSet Signed-off-by: Marcus G K Williams - JDK-8257588: Make os::_page_sizes a bitmask #1522 - Merge branch 'master' into update_hlp - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams - Merge remote-tracking branch 'upstream/master' into update_hlp - Add 2M LargePages to _page_sizes Use 2m pages for large page requests less than 1g on linux when 1G are default pages - Add os::Linux::large_page_size_2m() that returns 2m as size - Add os::Linux::select_large_page_size() to return correct large page size for size_t bytes - Add 2m size to _page_sizes array - Update reserve_memory_special methods to set/use large_page_size based on bytes reserved - Update large page not reserved warnings to include large_page_size attempted - Update TestLargePageUseForAuxMemory.java to expect 2m large pages in some instances Signed-off-by: Marcus G K Williams ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=05 Stats: 71 lines in 4 files changed: 53 ins; 0 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: 
https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 19:54:25 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 19:54:25 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v7] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
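The selection rule described above -- pick the largest configured large-page size that still fits the requested reservation -- can be sketched as follows. This is a simplified Java stand-in, not the HotSpot C++ implementation; the method name echoes select_large_page_size() but everything else is invented, and <= is used where the text says "smaller than":

```java
public class LargePageSelector {
    // Returns the largest entry of pageSizes that is <= bytes, or 0 if none
    // fits; on a miss the caller falls back to the base page size (e.g. 4k).
    public static long selectLargePageSize(long[] pageSizes, long bytes) {
        long best = 0;
        for (long ps : pageSizes) {
            if (ps <= bytes && ps > best) {
                best = ps;
            }
        }
        return best;
    }
}
```

With 2M and 1G huge pages configured, a 48M code heap reservation would select 2M pages instead of falling back to 4k pages, which is the behaviour the patch is after.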
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix merge mistakes Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/1a33482b..5cd6d6a8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=05-06 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 19:59:28 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 19:59:28 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v8] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
Marcus G K Williams has updated the pull request incrementally with three additional commits since the last revision: - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Add newline at end of TestLargePageUseForAuxMemory.java Signed-off-by: Marcus G K Williams - Fix merge mistakes Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/5cd6d6a8..073ffabe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 20:12:17 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 20:12:17 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 10:37:31 GMT, Thomas Stuefe wrote: > Hi Marcus, > > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. > > Cheers, Thomas Hi Thomas. I was pushing to get this patch in before JDK16 was forked. Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 20:16:15 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 20:16:15 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 20:09:37 GMT, Marcus G K Williams wrote: >> Hi Marcus, >> >> I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >> >> Cheers, Thomas > >> Hi Marcus, >> >> I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >> >> Cheers, Thomas > > Hi Thomas. I was pushing to get this patch in before JDK16 was forked. > > Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? > > As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. I've resolved comments and merged so that changes from pull #1522 are no longer in the diff. There are now only two files changed and a small amount of lines. I'd certainly appreciate any further detailed review or comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From cjplummer at openjdk.java.net Mon Dec 7 20:33:20 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Mon, 7 Dec 2020 20:33:20 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 10:44:56 GMT, Per Liden wrote: >> test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java line 194: >> >>> 192: debuggee.resume(); >>> 193: checkDebugeeAnswer_instances(className, baseInstances); >>> 194: debuggee.suspend(); >> >> Before the changes in this PR, what was triggering the (expected) collection of the objects? > > @plummercj Nothing was explicitly triggering collection of these objects. However, the test is explicitly checking the number of objects "reachable for the purposes of garbage collection" in `checkDebugeeAnswer_instances()`. The tests sets up a breakpoint (with SUSPEND_ALL), which suspends the VM. Then it creates a number of new instances and expects these to be weakly reachable. However, with this change, suspending the VM will make all objects "reachable for the purposes of garbage collection". So, to let the test continue to create objects which are weakly reachable we need to first resume the VM, create the new instances, and then suspend it again. > > @dholmes-ora I have no idea why these tests are so different. The VM suspend is implicit in the breakpoint in this test, which is set up using SUSPEND_ALL. Ok, I understand now. `ReferenceType.instances()` only counts objects "reachable for the purposes of garbage collection". This change in behavior does concern me a little bit. I think the expectation is that the instances created by `ClassType.newInstance()` will not show up in this count unless `disableCollection()` is called, even when under a "suspend all". 
Clearly that's the expectation of this test, so the question is whether or not it is a reasonable expectation. Note that `ClassType.newInstance()` says nothing about the state of the returned object w.r.t. GC. It makes no mention of the need to call `disableCollection()` before resuming the VM, so I guess this gives us some wiggle room. However, the argument against the object being strongly reachable comes from a user asking the question "who has the strong reference that makes it strongly reachable?". It's not obvious to the user why there is a strong reference, and why it seemingly goes away once `VM.resumeAll()` is called. I still think overall this is the right approach (barring a better approach being presented), but we may need to include some spec clarifications, and be prepared for some push back if this breaks anything. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From dholmes at openjdk.java.net Mon Dec 7 22:09:14 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 22:09:14 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 11:22:30 GMT, Per Liden wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed.
>> However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects.
These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > Per Liden has updated the pull request incrementally with one additional commit since the last revision: > > Add comment I still have some reservations about the logic in some of the tests now (i.e. using disableCollection whilst the VM is suspended and re-enabling also whilst suspended) but the logic was unclear in the first place. If necessary, follow-up cleanup issues could be filed here. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1595 From dholmes at openjdk.java.net Mon Dec 7 22:09:15 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 22:09:15 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: <-2Yx99rM6jO7OHIzIlaHfdeojwgwwl7QthEqINqxiu4=.ae8922de-ea23-47c6-adff-3618cecd7eaf@github.com> <_mg_tfWiVdggsrpKEDFeFAR1-22yUsy-tTe0foBDdC4=.191dc88a-df48-4127-805d-62fba59d2750@github.com> Message-ID: On Mon, 7 Dec 2020 10:57:08 GMT, Per Liden wrote: >> I agree a fatal error here seems excessive. Simply maintaining the strong ref seems reasonable. > > I was trying to mimic what we already do in `strengthenNode()`, i.e. it's a fatal error if we can't create a JNI ref. Here: > > strongRef = JNI_FUNC_PTR(env,NewGlobalRef)(env, node->ref); > /* > * NewGlobalRef on a weak ref will return NULL if the weak > * reference has been collected or if out of memory. > * It never throws OOM. > * We need to distinguish those two occurrences. > */ > if ((strongRef == NULL) && !isSameObject(env, node->ref, NULL)) { > EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewGlobalRef"); > } > > So it seems appropriate to do the same thing if we fail to create a JNI weak ref. Also, as @plummercj mentioned, if we can't create a JNI ref, continuing the debug session seems rather pointless as we're about to go down anyway (the next allocation in the JVM will be fatal). Okay. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From dholmes at openjdk.java.net Mon Dec 7 22:17:19 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Dec 2020 22:17:19 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 20:30:07 GMT, Chris Plummer wrote: >> @plummercj Nothing was explicitly triggering collection of these objects. 
However, the test is explicitly checking the number of objects "reachable for the purposes of garbage collection" in `checkDebugeeAnswer_instances()`. The tests sets up a breakpoint (with SUSPEND_ALL), which suspends the VM. Then it creates a number of new instances and expects these to be weakly reachable. However, with this change, suspending the VM will make all objects "reachable for the purposes of garbage collection". So, to let the test continue to create objects which are weakly reachable we need to first resume the VM, create the new instances, and then suspend it again. >> >> @dholmes-ora I have no idea why these tests are so different. The VM suspend is implicit in the breakpoint in this test, which is set up using SUSPEND_ALL. > > Ok, I understand now. `ReferenceType.instances()` only counts objects "reachable for the purposes of garbage collection". This change in behavior does concern me a little bit. I think the expectation is that the instances created by `ClassType.newInstance()` will not show up in this count unless `disableCollection()` is called, even when under a "suspend all". Clearly that's the expectation of this test, so the question is whether or not it is a reasonable expectation. > > Note that `ClassType.newInstance()` says nothing about the state of the returned object w.r.t. GC. It makes no mention of the need to call `disableCollection()` before resuming the VM, so I guess this gives us some wiggle room. However, the argument against the object being strongly reachable is comes from user asking the question "who has the strong reference that makes it strongly reachable?". It's not obvious to the user why there is a strong reference, and why it seemingly goes a way once `VM.resumeAll()` is called. > > I still think overall this is the right approach (baring a better approach being presented), but we may need to include some spec clarifications, and be prepared for some push back if this breaks anything. 
I don't follow your reasoning here Chris. All ObjectReferences can be GC'd at any time unless GC has been disallowed. So a reference created via newInstance is no different to any other reference. If it is currently reachable then instances() should return it. Are you treating "reachable for the purposes of garbage collection" as if it said "strongly reachable"? It doesn't, so I think you are reading too much into this. I think there is a lot of flexibility in this API in terms of what it may return regarding weak references. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From github.com+168222+mgkwill at openjdk.java.net Mon Dec 7 23:35:28 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 7 Dec 2020 23:35:28 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix space format, use Linux:: for local func.
Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/073ffabe..870e8a54 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=07-08 Stats: 8 lines in 1 file changed: 1 ins; 1 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From cjplummer at openjdk.java.net Mon Dec 7 23:37:16 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Mon, 7 Dec 2020 23:37:16 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> On Mon, 7 Dec 2020 22:14:28 GMT, David Holmes wrote: >> Ok, I understand now. `ReferenceType.instances()` only counts objects "reachable for the purposes of garbage collection". This change in behavior does concern me a little bit. I think the expectation is that the instances created by `ClassType.newInstance()` will not show up in this count unless `disableCollection()` is called, even when under a "suspend all". Clearly that's the expectation of this test, so the question is whether or not it is a reasonable expectation. >> >> Note that `ClassType.newInstance()` says nothing about the state of the returned object w.r.t. GC. It makes no mention of the need to call `disableCollection()` before resuming the VM, so I guess this gives us some wiggle room. However, the argument against the object being strongly reachable is comes from user asking the question "who has the strong reference that makes it strongly reachable?". It's not obvious to the user why there is a strong reference, and why it seemingly goes a way once `VM.resumeAll()` is called. 
>> >> I still think overall this is the right approach (baring a better approach being presented), but we may need to include some spec clarifications, and be prepared for some push back if this breaks anything. > > I don't follow your reasoning here Chris. All ObjectReferences can be GC'd at any time unless GC has been disallowed. So a reference create via newInstance is no different to any other reference. If it is currently reachable then instances() should return it. Are you treating "reachable for the purposes of garbage collection" as-if it said "strongly reachable"? It doesn't so I think you are reading too much into this. I think there is a lot of flexibility in this API in terms of what it may return regarding weak references. I read "reachable for the purposes of garbage collection" as not including objects reachable only via weak reference. So if the only reference to an object is a weak reference, which is normally what you have after calling `ClassType.newInstance()`, then the object is not considered reachable. At the very least, this is how `ReferenceType.instances()` is implemented, and is based on JVMTI [FollowReferences](https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#FollowReferences)(). So given that, the expectation would be that an object returned by `ClassType.newInstance()` would not be counted by `ReferenceType.instances()` unless something is done to add a strong reference to the object, such as calling `ObjectReference.disableCollection()`. Now with Per's changes a strong reference is also created when doing a VM.suspend(). The test doesn't expect this behavior, and it's understandable why.
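The reachability distinction being debated -- an object held only by a weak reference is not "reachable for the purposes of garbage collection", while a strong reference (the analogue of the new pinning, or of disableCollection()) keeps it reachable -- can be seen with plain java.lang.ref, outside JDI:

```java
import java.lang.ref.WeakReference;

public class ReachabilityDemo {
    public static void main(String[] args) {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);

        // While a strong reference exists, the weak reference is
        // guaranteed not to be cleared, no matter how often GC runs.
        System.gc();
        if (ref.get() != obj) {
            throw new AssertionError("cleared while strongly reachable");
        }

        // Drop the strong reference: the object is now only weakly
        // reachable and may be cleared at any time. No assertion here,
        // because the spec does not say when collection must happen.
        obj = null;
        System.gc();
        System.out.println("cleared after gc: " + (ref.get() == null));
    }
}
```

The instances002 test lives in the second half of this demo; with the pinning change, a suspended VM behaves like the first half instead.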
------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From stuefe at openjdk.java.net Tue Dec 8 08:39:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 8 Dec 2020 08:39:13 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 20:09:37 GMT, Marcus G K Williams wrote: > /test Just re-run the gh actions. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From stuefe at openjdk.java.net Tue Dec 8 08:52:13 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 8 Dec 2020 08:52:13 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 20:09:37 GMT, Marcus G K Williams wrote: > > Hi Marcus, > > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. > > Cheers, Thomas > > Hi Thomas. I was pushing to get this patch in before JDK16 was forked. > > Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? > > As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. It's a simple matter of cycles. Code freeze is Dec 10. I'm snowed in right now. Ideally I would have liked to run tests on ppc, s390 and aarch64 with multiple large page sizes enabled and used. A gtest for this scenario would also be good. Then, code-wise, there are some things we should straighten out. Not necessarily in your patch, but it should happen either before or after your patch is pushed.
For example: - we now have duplicate code for scanning the available huge pages - the new select_large_page_size() feels very similar to the existing os::page_size_for_region_xx() functions. I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, it's fine for me too. If problems arise on our platforms, we will deactivate this coding for non-Intel platforms before shipping jdk16. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From kbarrett at openjdk.java.net Tue Dec 8 10:13:18 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 8 Dec 2020 10:13:18 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests Message-ID: Please review this change that eliminates the use of Reference.isEnqueued by tests. There were three tests using it: vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java jdk/java/lang/ref/ReferenceEnqueue.java In each of them, some combination of using Reference.refersTo and ReferenceQueue.remove with a timeout was used to eliminate the use of Reference.isEnqueued. I also cleaned up ReferencesGC.java in various respects. It contained several bits of dead code, and the failure checks were made stronger. Testing: mach5 tier1 Locally (linux-x64) ran all three tests with each GC (including Shenandoah).
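The replacement pattern Kim describes -- checking Reference.refersTo and polling ReferenceQueue.remove with a timeout instead of spinning on Reference.isEnqueued -- looks roughly like this. This is a generic sketch, not the actual test code, and refersTo requires JDK 16 or later:

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class EnqueueWait {
    // Bounded wait until the referent has been cleared and the reference
    // enqueued, instead of busy-checking Reference.isEnqueued().
    public static boolean waitForEnqueue(WeakReference<?> ref, ReferenceQueue<?> queue)
            throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            System.gc();                                   // request a collection
            if (ref.refersTo(null) && queue.remove(100) != null) {
                return true;                               // cleared and enqueued
            }
        }
        return false;                                      // give up; let the caller fail
    }

    // Create a weakly referenced object, drop the strong reference, and
    // wait for the reference to be enqueued.
    public static boolean demo() throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj, queue);
        obj = null;                                        // drop the only strong reference
        return waitForEnqueue(ref, queue);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("enqueued: " + demo());
    }
}
```

The bounded retry loop with a timeout is what keeps such tests from hanging when a GC is slow to clear and enqueue the reference.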
------------- Commit messages: - update WeakReferenceGC test - update ReferenceQueue test - update ReferencesGC test Changes: https://git.openjdk.java.net/jdk/pull/1691/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1691&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257876 Stats: 102 lines in 3 files changed: 21 ins; 39 del; 42 mod Patch: https://git.openjdk.java.net/jdk/pull/1691.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1691/head:pull/1691 PR: https://git.openjdk.java.net/jdk/pull/1691 From sjohanss at openjdk.java.net Tue Dec 8 10:18:12 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 8 Dec 2020 10:18:12 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v5] In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 08:49:29 GMT, Thomas Stuefe wrote: >>> Hi Marcus, >>> >>> I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >>> >>> Cheers, Thomas >> >> Hi Thomas. I was pushing to get this patch in before JDK16 was forked. >> >> Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? >> >> As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. > >> > Hi Marcus, >> > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >> > Cheers, Thomas >> >> Hi Thomas. I was pushing to get this patch in before JDK16 was forked. >> >> Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? 
>> >> As an aside, I will stand behind any patch I get upstream, including maintaining it, discussing it, or fixing bugs. > > It's a simple matter of cycles. Code freeze is Dec 10. I'm snowed in right now. > > Ideally I would have liked to run tests on ppc, s390 and aarch64 with multiple large page sizes enabled and used. A gtest for this scenario would also be good. > > Then, code-wise, there are some things we should straighten out. Not necessarily in your patch, but it should happen either before or after your patch is pushed. For example: > - we now have duplicate code for scanning the available huge pages > - the new select_large_page_size() feels very similar to the existing os::page_size_for_region_xx() functions. > > I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, it's fine for me too. If problems arise on our platforms, we will deactivate this code for non-Intel platforms before shipping jdk16. > > Cheers, Thomas I will not be able to review this in time for the code freeze, so I also vote for not rushing it in. Don't consider this a blocker; if you get other reviews I'm fine with it getting pushed. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From shade at openjdk.java.net Tue Dec 8 10:51:21 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 8 Dec 2020 10:51:21 GMT Subject: RFR: 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java Message-ID: There used to be failures in Shenandoah CAS handling code that were caught by this test. Those were fixed in JDK-8255401. This change turns the test into a regression test for it. 
Additional testing: - [x] Affected test on `x86_64` fastdebug, release - [x] Affected test on `x86_32` fastdebug - [x] Affected test on `aarch64` fastdebug ------------- Commit messages: - Mention 8255401 in @bug - Make test pass in release - 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java Changes: https://git.openjdk.java.net/jdk/pull/1693/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1693&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8251944 Stats: 24 lines in 1 file changed: 22 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1693.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1693/head:pull/1693 PR: https://git.openjdk.java.net/jdk/pull/1693 From rkennke at openjdk.java.net Tue Dec 8 11:45:11 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 8 Dec 2020 11:45:11 GMT Subject: RFR: 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java In-Reply-To: References: Message-ID: <3KQK4cZ0s8UFo04-zFxvIp0g1xtSiR5iqdW_LrNM3qg=.b9142a2a-7f36-4f44-bbd4-7bbba000a9c6@github.com> On Tue, 8 Dec 2020 10:42:08 GMT, Aleksey Shipilev wrote: > There used to be failures in Shenandoah CAS handling code like that were caught by this test. Those were fixed in JDK-8255401. This change turns the test into regression test for it. > > Additional testing: > - [x] Affected test on `x86_64` fastdebug, release > - [x] Affected test on `x86_32` fastdebug > - [x] Affected test on `aarch64` fastdebug Looks good to me! Thanks! ------------- Marked as reviewed by rkennke (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1693 From pliden at openjdk.java.net Tue Dec 8 14:07:14 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 8 Dec 2020 14:07:14 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> References: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> Message-ID: On Mon, 7 Dec 2020 23:34:00 GMT, Chris Plummer wrote: >> I don't follow your reasoning here Chris. All ObjectReferences can be GC'd at any time unless GC has been disallowed. So a reference created via newInstance is no different to any other reference. If it is currently reachable then instances() should return it. Are you treating "reachable for the purposes of garbage collection" as if it said "strongly reachable"? It doesn't, so I think you are reading too much into this. I think there is a lot of flexibility in this API in terms of what it may return regarding weak references. > > I read "reachable for the purposes of garbage collection" as not including objects reachable only via weak reference. So if the only reference to an object is a weak reference, which is normally what you have after calling `ClassType.newInstance()`, then the object is not considered reachable. At the very least, this is how `ReferenceType.instances()` is implemented, and is based on JVMTI [FollowReferences](https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#FollowReferences)(). > > So given that, the expectation would be that an object returned by `ClassType.newInstance()` would not be counted by `ReferenceType.instances()` unless something is done to add a strong reference to the object, such as calling `ObjectReference.disableCollection()`. Now with Per's changes a strong reference is also created when doing a VM.suspend(). 
The test doesn't expect this behavior, and it's understandable why. I think we're still within what the spec says, given that the wording is so loose. But it's hard to tell if this change will be problematic for some use case. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From dongbohe at openjdk.java.net Tue Dec 8 15:31:26 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Tue, 8 Dec 2020 15:31:26 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v6] In-Reply-To: References: Message-ID: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8257145 Dongbo He has updated the pull request incrementally with one additional commit since the last revision: fix failure in test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1474/files - new: https://git.openjdk.java.net/jdk/pull/1474/files/17aab275..5aabcc31 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1474&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1474.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1474/head:pull/1474 PR: https://git.openjdk.java.net/jdk/pull/1474 From dongbohe at openjdk.java.net Tue Dec 8 15:41:13 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Tue, 8 Dec 2020 15:41:13 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v5] In-Reply-To: References: <-BgjGGuqHSGLVTkiYkLrcFK6hgrQQY-RsTrNGpE-vi4=.2b8a0a90-ac11-4f79-a944-e99853d378b0@github.com> Message-ID: On Thu, 3 Dec 2020 03:14:13 GMT, Dongbo He wrote: >> Looks good, thanks for fixing this. > > Thank you for your review, kstefanj. > > As we saw in the test, this change will cause `./test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java` to fail. 
I'm working on this case and will push it here for review when the work is done. The failed cases are: `test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java:100` `test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java:106` The error message is as follows: STDERR: java.lang.RuntimeException: Expect that Survivor direct allocation are similar to all mem consumed at gc.g1.plab.TestPLABPromotion.checkLiveObjectsPromotion(TestPLABPromotion.java:168) at gc.g1.plab.TestPLABPromotion.checkResults(TestPLABPromotion.java:140) at gc.g1.plab.TestPLABPromotion.main(TestPLABPromotion.java:102) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.base/java.lang.Thread.run(Thread.java:831) If we exclude direct allocation, OBJECT_SIZE/word_size should be bigger than PLAB_SIZE*WASTE_PCT. Worked correctly after this patch. ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From iwalulya at openjdk.java.net Tue Dec 8 16:25:27 2020 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 8 Dec 2020 16:25:27 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 23:35:28 GMT, Marcus G K Williams wrote: >> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. 
Code heap usually uses less than 1G, so currently the code precludes code heap from using >> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). >> >> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). >> >> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. > > Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: > > Fix space format, use Linux:: for local func. > > Signed-off-by: Marcus G K Williams Changes requested by iwalulya (Committer). src/hotspot/os/linux/os_linux.cpp line 3726: > 3724: } > 3725: > 3726: void os::Linux::register_large_page_sizes() { Please refactor to remove duplicated code with` find_large_page_size`, probably use `register_large_page_sizes` to eliminate the need for `find_large_page_size` src/hotspot/os/linux/os_linux.cpp line 4221: > 4219: } > 4220: > 4221: size_t os::Linux::select_large_page_size(size_t bytes) { As mentioned by @tstuefe , this is duplicating `size_t os::page_size_for_region(size_t region_size, size_t min_pages, bool must_be_aligned) ` ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From adityam at openjdk.java.net Tue Dec 8 17:13:09 2020 From: adityam at openjdk.java.net (Aditya Mandaleeka) Date: Tue, 8 Dec 2020 17:13:09 GMT Subject: RFR: 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 10:42:08 GMT, Aleksey Shipilev wrote: > There used to be failures in Shenandoah CAS handling code like that were caught by this test. Those were fixed in JDK-8255401. This change turns the test into regression test for it. 
> > Additional testing: > - [x] Affected test on `x86_64` fastdebug, release > - [x] Affected test on `x86_32` fastdebug > - [x] Affected test on `aarch64` fastdebug LGTM ------------- Marked as reviewed by adityam (Author). PR: https://git.openjdk.java.net/jdk/pull/1693 From mchung at openjdk.java.net Tue Dec 8 17:33:04 2020 From: mchung at openjdk.java.net (Mandy Chung) Date: Tue, 8 Dec 2020 17:33:04 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: <2TbNnDlF1nuEFWLddNG3wdj5EL0gg-1hzGwe2-emoQE=.e950f0f7-6be0-426d-8634-bc3c3175030a@github.com> On Tue, 8 Dec 2020 09:52:51 GMT, Kim Barrett wrote: > Please review this change that eliminates the use of Reference.isEnqueued by > tests. There were three tests using it: > > vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java > vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java > jdk/java/lang/ref/ReferenceEnqueue.java > > In each of them, some combination of using Reference.refersTo and > ReferenceQueue.remove with a timeout were used to eliminate the use of > Reference.isEnqueued. > > I also cleaned up ReferencesGC.java in various respects. It contained > several bits of dead code, and the failure checks were made stronger. > > Testing: > mach5 tier1 > Locally (linux-x64) ran all three tests with each GC (including Shenandoah). Marked as reviewed by mchung (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 18:03:25 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 18:03:25 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: <8qgrg_WguUKoqYco2cHwDJPBLvgaSkZNHnxy0zucubo=.f3d6cf56-2178-4b0d-bebe-e269001b5f44@github.com> On Tue, 8 Dec 2020 16:19:27 GMT, Ivan Walulya wrote: >> Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix space format, use Linux:: for local func. >> >> Signed-off-by: Marcus G K Williams > > src/hotspot/os/linux/os_linux.cpp line 4221: > >> 4219: } >> 4220: >> 4221: size_t os::Linux::select_large_page_size(size_t bytes) { > > As mentioned by @tstuefe , this is duplicating `size_t os::page_size_for_region(size_t region_size, size_t min_pages, bool must_be_aligned) ` In latest patch I removed os::Linux::select_large_page_size and use os::page_size_for_region instead. > src/hotspot/os/linux/os_linux.cpp line 3726: > >> 3724: } >> 3725: >> 3726: void os::Linux::register_large_page_sizes() { > > Please refactor to remove duplicated code with` find_large_page_size`, probably use `register_large_page_sizes` to eliminate the need for `find_large_page_size` In latest patch I removed Linux::find_large_page_size and use register_large_page_sizes. I tried to streamline Linux::setup_large_page_size. 
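As a rough illustration of the selection policy `os::page_size_for_region()` applies — and which the patch now reuses instead of a separate `select_large_page_size()` — the idea can be sketched in Java with made-up page sizes (this is not HotSpot code): walk the registered page sizes from largest to smallest and pick the first one that fits the required number of pages into the region.

```java
public class PageSizeForRegion {
    // Hypothetical stand-in for HotSpot's _page_sizes table,
    // sorted largest to smallest: 1G, 2M, 4K.
    static final long[] PAGE_SIZES = {1L << 30, 2L << 20, 4L << 10};

    // Pick the largest page size that fits at least minPages times
    // into the region; fall back to the smallest page size otherwise.
    static long pageSizeForRegion(long regionSize, long minPages) {
        for (long ps : PAGE_SIZES) {
            if (regionSize / ps >= minPages) {
                return ps;
            }
        }
        return PAGE_SIZES[PAGE_SIZES.length - 1];
    }

    public static void main(String[] args) {
        // A 48M code heap cannot hold a 1G page, so 2M is selected.
        System.out.println(pageSizeForRegion(48L << 20, 1)); // 2097152
        // A 2G heap region can use a 1G page.
        System.out.println(pageSizeForRegion(2L << 30, 1)); // 1073741824
    }
}
```

This is why, with `LargePageSizeInBytes=1G` and 2M pages also registered, the code heap no longer falls all the way back to 4K pages.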
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 18:03:24 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 18:03:24 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v10] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Ivan W. Requested Changes Removed os::Linux::select_large_page_size and use os::page_size_for_region instead Removed Linux::find_large_page_size and use register_large_page_sizes. 
Streamlined Linux::setup_large_page_size Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/870e8a54..0bfc0cbb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=08-09 Stats: 72 lines in 2 files changed: 12 ins; 51 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 18:14:08 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 18:14:08 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 16:22:35 GMT, Ivan Walulya wrote: >> Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix space format, use Linux:: for local func. >> >> Signed-off-by: Marcus G K Williams > > Changes requested by iwalulya (Committer). Hi Ivan (@walulyai). Thanks for the review! I've addressed you and Thomas suggestion about duplication. Let me know if this meets your expectation or if further changes are required. 
Thanks, Marcus ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From iwalulya at openjdk.java.net Tue Dec 8 18:48:10 2020 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 8 Dec 2020 18:48:10 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: <8qgrg_WguUKoqYco2cHwDJPBLvgaSkZNHnxy0zucubo=.f3d6cf56-2178-4b0d-bebe-e269001b5f44@github.com> References: <8qgrg_WguUKoqYco2cHwDJPBLvgaSkZNHnxy0zucubo=.f3d6cf56-2178-4b0d-bebe-e269001b5f44@github.com> Message-ID: On Tue, 8 Dec 2020 18:00:35 GMT, Marcus G K Williams wrote: >> src/hotspot/os/linux/os_linux.cpp line 3726: >> >>> 3724: } >>> 3725: >>> 3726: void os::Linux::register_large_page_sizes() { >> >> Please refactor to remove duplicated code with` find_large_page_size`, probably use `register_large_page_sizes` to eliminate the need for `find_large_page_size` > > In latest patch I removed Linux::find_large_page_size and use register_large_page_sizes. I tried to streamline Linux::setup_large_page_size. with those changes, you have created a bug on os::large_page_size(), I don't think _large_page_size is set (unless I missed it). ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 19:11:50 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 19:11:50 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: <8qgrg_WguUKoqYco2cHwDJPBLvgaSkZNHnxy0zucubo=.f3d6cf56-2178-4b0d-bebe-e269001b5f44@github.com> Message-ID: On Tue, 8 Dec 2020 18:44:54 GMT, Ivan Walulya wrote: >> In latest patch I removed Linux::find_large_page_size and use register_large_page_sizes. I tried to streamline Linux::setup_large_page_size. 
> > with those changes, you have created a bug on os::large_page_size(), I don't think _large_page_size is set (unless I missed it). You are correct. That escaped me, even though I was looking for where os::large_page_size() was set. :) ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 19:11:46 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 19:11:46 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v11] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix os::large_page_size() in last update Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/0bfc0cbb..85e75025 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=09-10 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From cjplummer at openjdk.java.net Tue Dec 8 19:25:41 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Tue, 8 Dec 2020 19:25:41 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> Message-ID: On Tue, 8 Dec 2020 14:04:33 GMT, Per Liden wrote: >> I read "reachable for the purposes of garbage collection" as not including objects reachable only via weak reference. So if the only reference to an object is a weak reference, which is normally what you have after calling `ClassType.newInstance()`, then the object is not considered reachable. At the very least, his is how `ReferenceType.instances()` is implemented, and is based on JVMTI [FollowReferences](https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#FollowReferences)(). >> >> So given that, the expectation would be that an object returned `ClassType.newInstance()` would not be counted by `ReferenceType.instances()` unless something is done to add a strong reference to the object, such as calling `ObjectReference.disableCollection()`. Now with Per's changes a strong reference is also created with doing a VM.suspend(). 
The test doesn't expect this behavior, and it's understandable why. > I think we're still within what the spec says, given that the wording is so loose. But it's hard to tell if this change will be problematic for some use case. I'm ok with making the change and then seeing if there is any fallout from it. My guess is there won't be. I do think there is a need to clean up the JDI and JDWP specs in a few areas w.r.t. object liveness. Another CR can be filed for that. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From cjplummer at openjdk.java.net Tue Dec 8 19:33:38 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Tue, 8 Dec 2020 19:33:38 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 22:05:04 GMT, David Holmes wrote: >> Per Liden has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comment > > I still have some reservations about the logic in some of the tests now (i.e. using disableCollection whilst the VM is suspended and re-enabling also whilst suspended) but the logic was unclear in the first place. If necessary, follow-up cleanup issues could be filed here. > > Thanks, > David A number of files need copyright updates. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Tue Dec 8 21:29:51 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 8 Dec 2020 21:29:51 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: References: Message-ID: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. 
> However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. > > However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >> >> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). > > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. 
> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GCs, both in mach5 and locally. 
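A toy model of the weak-to-strong "pin"/"unpin" idea can be written in plain Java (the class and helpers here are invented for illustration and stand in for the libjdwp internals, which are native code): while "suspended", a strong reference keeps each tracked referent alive even though the tracking table itself is weak; on "resume" the strong references are dropped and collection can proceed.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class PinUnpinModel {
    static final List<WeakReference<Object>> tracked = new ArrayList<>();
    static final Map<Object, Object> pins = new IdentityHashMap<>();

    // Models VirtualMachine.suspend(): give every live referent a strong ref.
    static void suspendVm() {
        for (WeakReference<Object> ref : tracked) {
            Object o = ref.get();
            if (o != null) pins.put(o, o);
        }
    }

    // Models VirtualMachine.resume(): drop the strong refs again.
    static void resumeVm() {
        pins.clear();
    }

    public static void main(String[] args) throws InterruptedException {
        Object obj = new Object();
        tracked.add(new WeakReference<>(obj));
        suspendVm();
        obj = null;   // the debugger-side handle alone no longer keeps it alive
        System.gc();
        System.out.println(tracked.get(0).refersTo(null)); // false: pinned
        resumeVm();
        for (int i = 0; i < 100 && !tracked.get(0).refersTo(null); i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println(tracked.get(0).refersTo(null)); // true: collected
    }
}
```

The same shape explains the instances002 adjustment above: while the VM stays suspended, nothing tracked can be collected, so a test that waits for collection under suspension can no longer succeed.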
Per Liden has updated the pull request incrementally with one additional commit since the last revision: Fix copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1595/files - new: https://git.openjdk.java.net/jdk/pull/1595/files/8fe1e52d..55cd2462 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1595&range=01-02 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/1595.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1595/head:pull/1595 PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Tue Dec 8 21:29:51 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 8 Dec 2020 21:29:51 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v2] In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 19:30:44 GMT, Chris Plummer wrote: >> I still have some reservations about the logic in some of the tests now (ie using disableCollection whilst the VM is suspended and reenabling also whilst suspended) but the logic was unclear in the first place. If necessary follow up cleanup issues could be filed here. >> >> Thanks, >> David > > A number of files need copyright updates. @plummercj Copyright fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Tue Dec 8 21:44:35 2020 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 8 Dec 2020 21:44:35 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: References: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> Message-ID: On Tue, 8 Dec 2020 19:22:41 GMT, Chris Plummer wrote: >> I think we're still within what the spec says, given that the wording is so loose. But it's hard to tell if this change will be problematic for some use case. 
> > I'm ok with making the change and then seeing if there is any fallout from it. My guess is there won't be. I do think there is a need to cleanup the JDI and JDWP specs in a few areas w.r.t. object liveness. Another CR can be filed for that. I filed https://bugs.openjdk.java.net/browse/JDK-8257921. Feel free to extend/improve the description. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 22:17:45 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 22:17:45 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 18:10:54 GMT, Marcus G K Williams wrote: >> Changes requested by iwalulya (Committer). > > Hi Ivan (@walulyai). Thanks for the review! > > I've addressed you and Thomas suggestion about duplication. Let me know if this meets your expectation or if further changes are required. > > Thanks, > Marcus > > > Hi Marcus, > > > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. > > > Cheers, Thomas > > > > > > Hi Thomas. I was pushing to get this patch in before JDK16 was forked. > > Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? > > As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. > > Its a simple matter of cycles. Code freeze is Dec 10. I'm snowed in right now. > > Ideally I would liked to have run tests on ppc, s390 and aarch64 with multiple large page sizes enabled and used. A gtest for this scenario would also be good. > > Then, code wise, there are some things we should straighten out. 
Not necessarily in your patch, but it should happen either before or after your patch is pushed. For example: > > * we now have duplicate code for scanning the available huge pages > * the new select_large_page_size() feels very similar to the existing os::page_size_for_region_xx() functions. > > I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, its fine for me too. If problems arise in our platforms, we will deactivate this coding for non-Intel platforms before shipping jdk16. > > Cheers, Thomas I've been looking at gtests (test/hotspot/gtest/runtime/test_os_linux.cpp and test/hotspot/gtest/memory/test_virtualspace.cpp) and correct me if I'm wrong but it seems like a gtest for this scenario (1G + 2m large pages or any variation thereof on different platforms) would require the build system to support page sizes (1G pages in this case) on the VM used to run gtests. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 22:17:44 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 22:17:44 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v12] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). 
> > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: - Merge branch 'master' into update_hlp - Fix os::large_page_size() in last update Signed-off-by: Marcus G K Williams - Ivan W. Requested Changes Removed os::Linux::select_large_page_size and use os::page_size_for_region instead Removed Linux::find_large_page_size and use register_large_page_sizes. Streamlined Linux::setup_large_page_size Signed-off-by: Marcus G K Williams - Fix space format, use Linux:: for local func. Signed-off-by: Marcus G K Williams - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Fix merge mistakes Signed-off-by: Marcus G K Williams - Add newline at end of TestLargePageUseForAuxMemory.java Signed-off-by: Marcus G K Williams - Fix merge mistakes Signed-off-by: Marcus G K Williams - Thomas S. Feedback Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - ... 
and 8 more: https://git.openjdk.java.net/jdk/compare/c47ab5f6...70bd9016 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=11 Stats: 63 lines in 2 files changed: 24 ins; 11 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 22:17:45 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 22:17:45 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: <25liSjMiaKbiCx42cTqv70gLl24lOVCrvUcVvMYhmw0=.e4d2cb3f-66a6-4b23-bea9-5a7f2b97a1a7@github.com> On Tue, 8 Dec 2020 19:24:20 GMT, Marcus G K Williams wrote: >> Hi Ivan (@walulyai). Thanks for the review! >> >> I've addressed you and Thomas suggestion about duplication. Let me know if this meets your expectation or if further changes are required. >> >> Thanks, >> Marcus > >> > > Hi Marcus, >> > > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >> > > Cheers, Thomas >> > >> > >> > Hi Thomas. I was pushing to get this patch in before JDK16 was forked. >> > Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? >> > As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. >> >> Its a simple matter of cycles. Code freeze is Dec 10. I'm snowed in right now. >> >> Ideally I would liked to have run tests on ppc, s390 and aarch64 with multiple large page sizes enabled and used. A gtest for this scenario would also be good. 
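The selection rule summarized in the v12 description above (take the largest configured large page size that does not exceed the requested reservation, otherwise fall back to the base page size) can be sketched roughly as follows. The class and method names are invented for illustration; the real implementation works on `os::_page_sizes` in HotSpot C++ code:

```java
import java.util.TreeSet;

// Rough model of the page-size selection described in the thread.
// Not the actual HotSpot code; only the selection rule itself.
class LargePageSelector {
    static final long BASE_PAGE_SIZE = 4 * 1024L; // assume 4k base pages

    // Largest configured large page size <= bytes, else the base page size.
    static long selectPageSize(TreeSet<Long> largePageSizes, long bytes) {
        Long best = largePageSizes.floor(bytes); // largest element <= bytes
        return (best != null) ? best : BASE_PAGE_SIZE;
    }
}
```

With 2m and 1g pages configured, a 48m reservation would select 2m pages and a 2g reservation would select 1g pages, while a reservation smaller than 2m falls back to base pages.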
>> >> Then, code wise, there are some things we should straighten out. Not necessarily in your patch, but it should happen either before or after your patch is pushed. For example: >> >> * we now have duplicate code for scanning the available huge pages >> * the new select_large_page_size() feels very similar to the existing os::page_size_for_region_xx() functions. >> >> I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, its fine for me too. If problems arise in our platforms, we will deactivate this coding for non-Intel platforms before shipping jdk16. >> >> Cheers, Thomas > > I've been looking at gtests (test/hotspot/gtest/runtime/test_os_linux.cpp and test/hotspot/gtest/memory/test_virtualspace.cpp) and correct me if I'm wrong but it seems like a gtest for this scenario (1G + 2m large pages or any variation thereof on different platforms) would require the build system to support page sizes (1G pages in this case) on the VM used to run gtests. Updated with a merge for changes from master. It appears that some failures were caused by previous merge. See https://bugs.openjdk.java.net/browse/JDK-8257855 ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From dholmes at openjdk.java.net Tue Dec 8 22:31:36 2020 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 8 Dec 2020 22:31:36 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> References: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> Message-ID: On Tue, 8 Dec 2020 21:29:51 GMT, Per Liden wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. 
However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >>> >>> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not.
>> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally.
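A tiny model of that pin/unpin idea (names invented for illustration; the real code lives in `libjdwp`'s C object reference table, not in Java) might look like this:

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: while the VM is "suspended", every tracked object
// also gets a strong reference so the GC cannot reclaim it; on "resume"
// only the weak reference remains. Not the actual JDWP implementation.
class ObjectTable {
    private final Map<Long, WeakReference<Object>> weakRefs = new HashMap<>();
    private final Map<Long, Object> pinned = new HashMap<>(); // strong refs while suspended
    private long nextId = 1;

    long register(Object o) {
        long id = nextId++;
        weakRefs.put(id, new WeakReference<>(o)); // normally weak: collectible
        return id;
    }

    // Models the effect of VirtualMachine.suspend(): strengthen live refs.
    void pinAll() {
        for (Map.Entry<Long, WeakReference<Object>> e : weakRefs.entrySet()) {
            Object o = e.getValue().get();
            if (o != null) {
                pinned.put(e.getKey(), o);
            }
        }
    }

    // Models the effect of VirtualMachine.resume(): drop the strong refs.
    void unpinAll() {
        pinned.clear();
    }

    boolean isPinned(long id) {
        return pinned.containsKey(id);
    }
}
```

While pinned, an object stays reachable regardless of GC activity; after unpinning, only the weak reference remains and the object may be collected again.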
> > Per Liden has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From cjplummer at openjdk.java.net Tue Dec 8 22:39:38 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Tue, 8 Dec 2020 22:39:38 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> References: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> Message-ID: <3r_aQU9A4Vu5QCcVxk4xpb2rGZoA7BjPGSpRz3OEg2c=.4165ab02-964f-4873-a8fc-7d36b95357a6@github.com> On Tue, 8 Dec 2020 21:29:51 GMT, Per Liden wrote: >> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." 
>>> >>> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`.
These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. >> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. >> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. > > Per Liden has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright Marked as reviewed by cjplummer (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From cjplummer at openjdk.java.net Tue Dec 8 22:39:38 2020 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Tue, 8 Dec 2020 22:39:38 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: References: <_W4lIt9BSy6C6rbh9fR97LKXaL2n6DOkJDCOoaoqzYw=.7c887352-cc1e-4b78-8436-e720ed8d656a@github.com> Message-ID: <8_3bIAI5yDF79HSbQVGGsbr3xcoBpQd9cJXONYT_X1w=.f577f007-13eb-4d73-af48-19143a5c4403@github.com> On Tue, 8 Dec 2020 21:42:11 GMT, Per Liden wrote: >> I'm ok with making the change and then seeing if there is any fallout from it. My guess is there won't be. I do think there is a need to clean up the JDI and JDWP specs in a few areas w.r.t. object liveness. Another CR can be filed for that.
> > I filed https://bugs.openjdk.java.net/browse/JDK-8257921. Feel free to extend/improve the description. Thanks. I'll add some suggestions to the CR based on some of our recent discussions. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 8 23:29:37 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 8 Dec 2020 23:29:37 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: <25liSjMiaKbiCx42cTqv70gLl24lOVCrvUcVvMYhmw0=.e4d2cb3f-66a6-4b23-bea9-5a7f2b97a1a7@github.com> References: <25liSjMiaKbiCx42cTqv70gLl24lOVCrvUcVvMYhmw0=.e4d2cb3f-66a6-4b23-bea9-5a7f2b97a1a7@github.com> Message-ID: On Tue, 8 Dec 2020 22:15:34 GMT, Marcus G K Williams wrote: >>> > > Hi Marcus, >>> > > I generally like this patch. I will do a more thorough review later. But could this wait please until after JDK16 has been forked off? Since I would like this to spend some more times cooking on our more exotic Linuxes. >>> > > Cheers, Thomas >>> > >>> > >>> > Hi Thomas. I was pushing to get this patch in before JDK16 was forked. >>> > Can we run exotic Linux tests now? Is there anything else keeping this from inclusion in JDK16? >>> > As an aside, I will stand behind any patch I get upstream, including maintain it, discuss it or fix bugs. >>> >>> Its a simple matter of cycles. Code freeze is Dec 10. I'm snowed in right now. >>> >>> Ideally I would liked to have run tests on ppc, s390 and aarch64 with multiple large page sizes enabled and used. A gtest for this scenario would also be good. >>> >>> Then, code wise, there are some things we should straighten out. Not necessarily in your patch, but it should happen either before or after your patch is pushed. 
For example: >>> >>> * we now have duplicate code for scanning the available huge pages >>> * the new select_large_page_size() feels very similar to the existing os::page_size_for_region_xx() functions. >>> >>> I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, its fine for me too. If problems arise in our platforms, we will deactivate this coding for non-Intel platforms before shipping jdk16. >>> >>> Cheers, Thomas >> >> I've been looking at gtests (test/hotspot/gtest/runtime/test_os_linux.cpp and test/hotspot/gtest/memory/test_virtualspace.cpp) and correct me if I'm wrong but it seems like a gtest for this scenario (1G + 2m large pages or any variation thereof on different platforms) would require the build system to support page sizes (1G pages in this case) on the VM used to run gtests. > > Updated with a merge for changes from master. > > It appears that some failures were caused by previous merge. See https://bugs.openjdk.java.net/browse/JDK-8257855 There also appears to be an issue on TestSegments.java. See https://github.com/openjdk/jdk/pull/1688 https://bugs.openjdk.java.net/browse/JDK-8257887 ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From shade at openjdk.java.net Wed Dec 9 06:47:50 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 9 Dec 2020 06:47:50 GMT Subject: RFR: 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java [v2] In-Reply-To: References: Message-ID: > There used to be failures in Shenandoah CAS handling code like that were caught by this test. Those were fixed in JDK-8255401. This change turns the test into regression test for it. > > Additional testing: > - [x] Affected test on `x86_64` fastdebug, release > - [x] Affected test on `x86_32` fastdebug > - [x] Affected test on `aarch64` fastdebug Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into JDK-8251944-shenandoah-test-unsafe - Mention 8255401 in @bug - Make test pass in release - 8251944: Add Shenandoah test config to compiler/gcbarriers/UnsafeIntrinsicsTest.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1693/files - new: https://git.openjdk.java.net/jdk/pull/1693/files/8ca12686..073e166c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1693&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1693&range=00-01 Stats: 3353 lines in 229 files changed: 2017 ins; 565 del; 771 mod Patch: https://git.openjdk.java.net/jdk/pull/1693.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1693/head:pull/1693 PR: https://git.openjdk.java.net/jdk/pull/1693 From pliden at openjdk.java.net Wed Dec 9 07:49:36 2020 From: pliden at openjdk.java.net (Per Liden) Date: Wed, 9 Dec 2020 07:49:36 GMT Subject: Integrated: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException In-Reply-To: References: Message-ID: On Thu, 3 Dec 2020 12:55:04 GMT, Per Liden wrote: > This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests. > > The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debuggee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstance()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed.
> > However, as pointed out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): > >> Going back to the spec, ObjectReference.disableCollection() says: >> >> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running" >> >> and >> >> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended." >> >> But nowhere does it say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says: >> >> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended." >> >> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection(). > > Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not. > > However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI object references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again. > > This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects.
These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above. > > Note that there are still a few tests that needed adjustments to guard against `ObjectCollectedException`. These are: > - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests. > - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching and ignoring `ObjectCollectedException` seems reasonable here. > - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations. > - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't. > > Testing: > - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally. This pull request has now been integrated.
Changeset: 79f1dfb8 Author: Per Liden URL: https://git.openjdk.java.net/jdk/commit/79f1dfb8 Stats: 168 lines in 8 files changed: 135 ins; 0 del; 33 mod 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException Reviewed-by: dholmes, cjplummer ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From pliden at openjdk.java.net Wed Dec 9 07:49:34 2020 From: pliden at openjdk.java.net (Per Liden) Date: Wed, 9 Dec 2020 07:49:34 GMT Subject: RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException [v3] In-Reply-To: <3r_aQU9A4Vu5QCcVxk4xpb2rGZoA7BjPGSpRz3OEg2c=.4165ab02-964f-4873-a8fc-7d36b95357a6@github.com> References: <-7O_R4ZOVdbm3fvNcMb3xRiIQ6i8fGpTeHYtkvFZnvY=.2e67bf61-3df2-400c-9b28-c9def5bf7d13@github.com> <3r_aQU9A4Vu5QCcVxk4xpb2rGZoA7BjPGSpRz3OEg2c=.4165ab02-964f-4873-a8fc-7d36b95357a6@github.com> Message-ID: <2RwHa1kbWXQiOkndL3Enlpr9Adn9Pr3qw5pd6TowN9g=.e68376c5-af4f-43a1-8dfa-63be68d55012@github.com> On Tue, 8 Dec 2020 22:37:13 GMT, Chris Plummer wrote: >> Per Liden has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright > > Marked as reviewed by cjplummer (Reviewer). Thanks for reviewing, @plummercj and @dholmes-ora! ------------- PR: https://git.openjdk.java.net/jdk/pull/1595 From stuefe at openjdk.java.net Wed Dec 9 09:10:36 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Dec 2020 09:10:36 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: <25liSjMiaKbiCx42cTqv70gLl24lOVCrvUcVvMYhmw0=.e4d2cb3f-66a6-4b23-bea9-5a7f2b97a1a7@github.com> Message-ID: <5Z3qMVB7F933CRQ_GxJzvep6zsSrnoh4pBhNM_oAkd8=.d7ee01f9-e4b7-452f-b837-e4805fd2011c@github.com> On Tue, 8 Dec 2020 23:25:46 GMT, Marcus G K Williams wrote: >> Updated with a merge for changes from master. >> >> It appears that some failures were caused by previous merge. 
See https://bugs.openjdk.java.net/browse/JDK-8257855 > > There also appears to be an issue on TestSegments.java. See https://github.com/openjdk/jdk/pull/1688 > > https://bugs.openjdk.java.net/browse/JDK-8257887 > > https://github.com/mgkwill/jdk/runs/1520624742#step:13:15 Hi Marcus, Sorry, I changed my opinion about JDK16. According to Ivan, one of your last commits broke os::large_page_size(). That is fine, things happen. But what makes me nervous is that it took a reviewer to find this, this should have popped up in tests right away (eg runtime/test_os.cpp, "os_pagesizes" test). So we have holes in test coverage or in the way the tests are executed. The GH actions are of course not enough, since they do not run with large pages to my knowledge. What would be needed, in my opinion: - one jtreg test to test that the VM comes up with `-XX:+UseLargePages -XX:LargePageSizeInBytes=1G` and allocates small-large-pages as expected. This is not only needed as a function proof but to prevent regressions when we reform the code (which will happen) - We should have a gtest run with large pages. I opened https://bugs.openjdk.java.net/browse/JDK-8257959 to track that. - This patch changes behavior insofar as that now we return memory to the caller with a page size which he may not expect. We _think_ this is fine, since committing/uncommitting this memory is disabled. But since this is used by GC and Compiler and potentially other consumers as well, this should be thoroughly tested. I think tier1...3 at least, plus gc stress tests? probably with LargePageSizeInBytes=1G specified for all those tests. Side note: gtests are not bound to the build. 
You can run them manually by launching the gtestlauncher: `./hotspot/variant-xxx/libjvm/gtest/gtestLauncher -jdk:./images/jdk` You can add VM options to it, e.g.: `./hotspot/variant-xxx/libjvm/gtest/gtestLauncher -jdk:./images/jdk -Xmx128m -XX:+UseLargePages -XX:LargePageSizeInBytes=1G` and you probably should run this at least manually for your patch. Note caveat: death tests will fail with LP, see JDK-8257229. Figuring all this out would be something we would assist you with. But you try to push this into JDK16, whose deadline is tomorrow. That puts us into an awkward position. The way it is now, we only could integrate this without having run many tests, without regression testing and without testing on non-Intel platforms. I do not think this is the right way. Note that if you think JDK16 is important you always can backport changes to older releases once the patch is in JDK17 and has been tested there. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From stefank at openjdk.java.net Wed Dec 9 09:13:36 2020 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 9 Dec 2020 09:13:36 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v9] In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 19:24:20 GMT, Marcus G K Williams wrote: > I leave the decision to the others (@stefank @kstefanj ?). If they are fine with rushing this patch in its current form, its fine for me too. If problems arise in our platforms, we will deactivate this coding for non-Intel platforms before shipping jdk16. I see that @tstuefe pinged me. I've talked to @walulyai and he's going to review this change and figure out if it makes it into JDK 16. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From tschatzl at openjdk.java.net Wed Dec 9 10:54:49 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 10:54:49 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap Message-ID: Hi all, can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. There are some points I would like to bring up in advance in this change that may be contentious: - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before.
That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not do so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations.
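As a rough illustration of the proposed split (a sketch only, not the actual HotSpot `VM_Operation` code; all names here are invented), the base "sync" operation would do nothing but bracket the work with the heap lock, leaving back-to-back-GC prevention to a GC-specific subclass:

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative model of the VM_GC_Sync_Operation idea, with a
// ReentrantLock standing in for HotSpot's Heap_lock.
abstract class VMGCSyncOperation {
    static final ReentrantLock HEAP_LOCK = new ReentrantLock();

    final void execute() {
        HEAP_LOCK.lock();        // "doit_prologue": serialize with other heap ops
        try {
            doit();
        } finally {
            HEAP_LOCK.unlock();  // "doit_epilogue"
        }
    }

    protected abstract void doit();
}

class VerifyOperation extends VMGCSyncOperation {
    boolean verified;

    @Override protected void doit() {
        // The heap is guaranteed parseable here: no other heap operation
        // can run while this thread holds HEAP_LOCK.
        verified = HEAP_LOCK.isHeldByCurrentThread();
    }
}
```

The point of the split is that `VM_Populate*` and `VM_Verify` only need this lock-around-`doit()` bracket, not the extra GC bookkeeping that `VM_GC_Operation` carries.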
Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) ------------- Commit messages: - Initial import Changes: https://git.openjdk.java.net/jdk/pull/1661/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1661&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8256641 Stats: 132 lines in 10 files changed: 84 ins; 30 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/1661.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1661/head:pull/1661 PR: https://git.openjdk.java.net/jdk/pull/1661 From kbarrett at openjdk.java.net Wed Dec 9 12:27:36 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Dec 2020 12:27:36 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: References: Message-ID: <1NlD3pEuvW66ocFsjP9la5yjKfsLN6hmtMee8-ZafPM=.d3db39ce-730e-46e6-9124-297ccdd2cc2f@github.com> On Mon, 7 Dec 2020 11:23:04 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. > > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. 
> > The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. > > There some points I would like to bring up in advance in this change that may be contentious: > - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. > - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. > - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization` from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) Changes requested by kbarrett (Reviewer). 
src/hotspot/share/gc/shared/gcVMOperations.cpp line 64: > 62: > 63: void VM_GC_Sync_Operation::doit_epilogue() { > 64: if (Universe::has_reference_pending_list()) { Why is the pending list handling moved here, rather than remaining in VM_GC_Operation::doit_epilogue? This doesn't have anything to do with syncing between operations, and seems odd for VM_Verify (for example) to do. src/hotspot/share/gc/shared/gcVMOperations.cpp line 61: > 59: } > 60: return _prologue_succeeded; > 61: } This invocation checking doesn't seem right at this level. That is, skip_operation and prologue_succeeded all seem to me to have nothing to do with syncing, instead belonging to the VM_GC_Operation level and should remain there. src/hotspot/share/gc/shared/gcVMOperations.hpp line 103: > 101: // Acquire the reference synchronization lock > 102: virtual bool doit_prologue(); > 103: // Do notifyAll (if needed) and release held lock s/notifyAll/notify_all/ ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From tschatzl at openjdk.java.net Wed Dec 9 12:31:34 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 12:31:34 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: <1NlD3pEuvW66ocFsjP9la5yjKfsLN6hmtMee8-ZafPM=.d3db39ce-730e-46e6-9124-297ccdd2cc2f@github.com> References: <1NlD3pEuvW66ocFsjP9la5yjKfsLN6hmtMee8-ZafPM=.d3db39ce-730e-46e6-9124-297ccdd2cc2f@github.com> Message-ID: <6nH7SZYCYVXMHQaONxFh2OAZzhby9pQmyfuHVx-m69c=.f77113b7-84d5-45e6-8fa8-9ad628534f43@github.com> On Wed, 9 Dec 2020 11:58:46 GMT, Kim Barrett wrote: >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
>> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. >> >> The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. >> >> There some points I would like to bring up in advance in this change that may be contentious: >> - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. >> - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. 
>> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization` from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. >> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > src/hotspot/share/gc/shared/gcVMOperations.cpp line 64: > >> 62: >> 63: void VM_GC_Sync_Operation::doit_epilogue() { >> 64: if (Universe::has_reference_pending_list()) { > > Why is the pending list handling moved here, rather than remaining in VM_GC_Operation::doit_epilogue? This doesn't have anything to do with syncing between operations, and seems odd for VM_Verify (for example) to do. Also answering the next question: these two items (i.e. including the `prologue_succeeded` stuff) have mostly been kept there to allow simple reuse in `VM_GC_Operation`. I'll remove those and (maybe) just break the inheritance chain. ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From tschatzl at openjdk.java.net Wed Dec 9 13:19:46 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 13:19:46 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
> > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. > > The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. > > There some points I would like to bring up in advance in this change that may be contentious: > - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. > - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. 
> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization` from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: kbarrett review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1661/files - new: https://git.openjdk.java.net/jdk/pull/1661/files/1b5b5a8d..213fbeed Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1661&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1661&range=00-01 Stats: 56 lines in 2 files changed: 15 ins; 20 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/1661.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1661/head:pull/1661 PR: https://git.openjdk.java.net/jdk/pull/1661 From tschatzl at openjdk.java.net Wed Dec 9 13:25:33 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 13:25:33 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: <2TbNnDlF1nuEFWLddNG3wdj5EL0gg-1hzGwe2-emoQE=.e950f0f7-6be0-426d-8634-bc3c3175030a@github.com> References: <2TbNnDlF1nuEFWLddNG3wdj5EL0gg-1hzGwe2-emoQE=.e950f0f7-6be0-426d-8634-bc3c3175030a@github.com> Message-ID: On Tue, 8 Dec 2020 17:30:11 GMT, Mandy Chung wrote: >> Please review this change that eliminates the use of Reference.isEnqueued by >> tests. 
There were three tests using it: >> >> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java >> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java >> jdk/java/lang/ref/ReferenceEnqueue.java >> >> In each of them, some combination of using Reference.refersTo and >> ReferenceQueue.remove with a timeout were used to eliminate the use of >> Reference.isEnqueued. >> >> I also cleaned up ReferencesGC.java in various respects. It contained >> several bits of dead code, and the failure checks were made stronger. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) ran all three tests with each GC (including Shenandoah). > > Marked as reviewed by mchung (Reviewer). I'm not able to put this in the appropriate place using the github UI: [pre-existing] The WeakReferenceGC.java description at the top says that the test calls System.gc() explicitly to trigger garbage collections at the end. It does not. Maybe this could be weasel-worded around like in the other cases in that text. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From tschatzl at openjdk.java.net Wed Dec 9 13:31:41 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 13:31:41 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 09:52:51 GMT, Kim Barrett wrote: > Please review this change that eliminates the use of Reference.isEnqueued by > tests. There were three tests using it: > > vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java > vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java > jdk/java/lang/ref/ReferenceEnqueue.java > > In each of them, some combination of using Reference.refersTo and > ReferenceQueue.remove with a timeout were used to eliminate the use of > Reference.isEnqueued. > > I also cleaned up ReferencesGC.java in various respects. It contained > several bits of dead code, and the failure checks were made stronger.
> > Testing: > mach5 tier1 > Locally (linux-x64) ran all three tests with each GC (including Shenandoah). Changes requested by tschatzl (Reviewer). test/jdk/java/lang/ref/ReferenceEnqueue.java line 58: > 56: for (int i = 0; i < iterations; i++) { > 57: System.gc(); > 58: enqueued = (queue.remove(100) == ref); The code does not catch `InterruptedException` like it does in the other files. test/hotspot/jtreg/vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java line 129: > 127: } > 128: > 129: int REMOVE = (int) (RANGE * RATIO); These two constants could be factored out as static finals to match the casing. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From tschatzl at openjdk.java.net Wed Dec 9 14:01:35 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Dec 2020 14:01:35 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:23:47 GMT, Thomas Schatzl wrote: >> Please review this change that eliminates the use of Reference.isEnqueued by >> tests. There were three tests using it: >> >> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java >> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java >> jdk/java/lang/ref/ReferenceEnqueue.java >> >> In each of them, some combination of using Reference.refersTo and >> ReferenceQueue.remove with a timeout were used to eliminate the use of >> Reference.isEnqueued. >> >> I also cleaned up ReferencesGC.java in various respects. It contained >> several bits of dead code, and the failure checks were made stronger. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) ran all three tests with each GC (including Shenandoah). > > test/jdk/java/lang/ref/ReferenceEnqueue.java line 58: > >> 56: for (int i = 0; i < iterations; i++) { >> 57: System.gc(); >> 58: enqueued = (queue.remove(100) == ref); > > The code does not catch `InterruptedException` like it does in the other files. 
I understand that the test code previously just forwarded the `InterruptedException` if it happened in the `Thread.sleep()` call too. So this may only be a pre-existing issue, so please ignore this comment. Not catching `InterruptedException` here only seems to be a cause for unnecessary failure. Then again, it probably does not happen a lot. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From github.com+168222+mgkwill at openjdk.java.net Wed Dec 9 15:30:45 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 9 Dec 2020 15:30:45 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v13] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Fix os::large_page_size() in last update Signed-off-by: Marcus G K Williams - Ivan W. Requested Changes Removed os::Linux::select_large_page_size and use os::page_size_for_region instead Removed Linux::find_large_page_size and use register_large_page_sizes.
Streamlined Linux::setup_large_page_size Signed-off-by: Marcus G K Williams - Fix space format, use Linux:: for local func. Signed-off-by: Marcus G K Williams - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Fix merge mistakes Signed-off-by: Marcus G K Williams - Add newline at end of TestLargePageUseForAuxMemory.java Signed-off-by: Marcus G K Williams - Fix merge mistakes Signed-off-by: Marcus G K Williams - Thomas S. Feedback Signed-off-by: Marcus G K Williams - ... and 9 more: https://git.openjdk.java.net/jdk/compare/6eff9315...8f1474a9 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=12 Stats: 63 lines in 2 files changed: 24 ins; 11 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From kbarrett at openjdk.java.net Wed Dec 9 15:58:38 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Dec 2020 15:58:38 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v2] In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:19:46 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
>> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. >> >> The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. >> >> There some points I would like to bring up in advance in this change that may be contentious: >> - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. >> - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. 
>> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization` from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. >> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review Code changes look good. A couple of places where comments could use some improvement. src/hotspot/share/gc/shared/gcVMOperations.hpp line 146: > 144: > 145: // Acquire the reference synchronization lock > 146: virtual bool doit_prologue(); This does a lot more than just acquiring the lock. It also handles the prevention of multiple gc requests. src/hotspot/share/gc/shared/gcVMOperations.hpp line 111: > 109: protected: > 110: uint _gc_count_before; // gc count before acquiring PLL > 111: uint _full_gc_count_before; // full gc count before acquiring PLL [pre-existing] "PLL" ? I think that might be obsolete terminology, referring to the "pending list lock"? I think should be Heap_lock now. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1661 From iwalulya at openjdk.java.net Wed Dec 9 17:23:34 2020 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Wed, 9 Dec 2020 17:23:34 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: On Fri, 4 Dec 2020 09:34:36 GMT, Kim Barrett wrote: > Please review this reimplementation of WeakProcessorPhase. 
It is changed to > a scoped enum at namespace scope, and uses the recently added EnumIterator > facility to provide iteration, rather than a bespoke iterator class. > > This is a step toward eliminating it entirely. I've split it out into a > separate PR to make the review of the follow-up work a bit easier. > > As part of this the file weakProcessorPhases.hpp is renamed to > weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a > rename and (majorly) edit, instead treating it as a remove and add a new > file. > > Testing: mach5 tier1 lgtm ------------- Marked as reviewed by iwalulya (Committer). PR: https://git.openjdk.java.net/jdk/pull/1620 From ayang at openjdk.java.net Wed Dec 9 18:26:41 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 9 Dec 2020 18:26:41 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: On Fri, 4 Dec 2020 09:34:36 GMT, Kim Barrett wrote: > Please review this reimplementation of WeakProcessorPhase. It is changed to > a scoped enum at namespace scope, and uses the recently added EnumIterator > facility to provide iteration, rather than a bespoke iterator class. > > This is a step toward eliminating it entirely. I've split it out into a > separate PR to make the review of the follow-up work a bit easier. > > As part of this the file weakProcessorPhases.hpp is renamed to > weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a > rename and (majorly) edit, instead treating it as a remove and add a new > file. > > Testing: mach5 tier1 Marked as reviewed by ayang (Author). 
src/hotspot/share/gc/shared/weakProcessor.inline.hpp line 88: > 86: CountingClosure cl(is_alive, keep_alive); > 87: WeakProcessorPhaseTimeTracker pt(_phase_times, phase, worker_id); > 88: int state_index = checked_cast<int>(phase_range.index(phase)); I feel `EnumRange<WeakProcessorPhase>().index(phase)` is better than `phase_range.index(phase)`, since we want to know the "global" index for this phase, not the "local" index within this particular range instance. The two are identical currently. However, if the for-loop is iterating over a subset of all `WeakProcessorPhase`, the two cases will give different results, I believe. ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From iklam at openjdk.java.net Wed Dec 9 18:26:48 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Dec 2020 18:26:48 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v2] In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:19:46 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. >> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing.
>> >> The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. >> >> There some points I would like to bring up in advance in this change that may be contentious: >> - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. >> - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. >> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization` from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. 
>> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review The CDS part looks good to me. I also scanned the GC code and it looks reasonable to me, but I don't understand all the details to give an official review. ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From zgu at openjdk.java.net Wed Dec 9 20:02:46 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 9 Dec 2020 20:02:46 GMT Subject: RFR: 8255019: Shenandoah: Split STW and concurrent mark into separate classes [v19] In-Reply-To: References: Message-ID: > This is the first part of refactoring, that aims to isolate three Shenandoah GC modes (concurrent, degenerated and full gc). > > Shenandoah started with two GC modes, concurrent and full gc, with minimal shared code, mainly in mark phase. After introducing degenerated GC, it shared quite large portion of code with concurrent GC, with the concept that degenerated GC can simply pick up remaining work of concurrent GC in STW mode. > > It was not a big problem at that time, since concurrent GC also processed roots STW. Since Shenandoah gradually moved root processing into concurrent phase, code started to diverge, that made code hard to reason and maintain. > > First step, I would like to split STW and concurrent mark, so that: > 1) Code has to special case for STW and concurrent mark. > 2) STW mark does not need to rendezvous workers between root mark and the rest of mark > 3) STW mark does not need to activate SATB barrier and drain SATB buffers. > 4) STW mark does not need to remark some of roots. > > The patch mainly just shuffles code. Creates a base class ShenandoahMark, and moved shared code (from current shenandoahConcurrentMark) into this base class. 
I did `git mv shenandoahConcurrentMark.inline.hpp shenandoahMark.inline.hpp`, but git does not seem to reflect that. > A few changes: > 1) Moved task queue set from ShenandoahConcurrentMark to ShenandoahHeap. ShenandoahMark and its subclasses are stateless. Instead, mark states are maintained in task queue, mark bitmap and SATB buffers, so that they can be created on demand. > 2) Split ShenandoahConcurrentRootScanner template to ShenandoahConcurrentRootScanner and ShenandoahSTWRootScanner > 3) Split code inside op_final_mark into finish_mark and prepare_evacuation helper functions. > 4) Made ShenandoahMarkCompact stack allocated (as well as ShenandoahConcurrentGC and ShenandoahDegeneratedGC in upcoming refactoring) Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: - Merge branch 'master' into JDK-8255019-sh-mark - Merge branch 'master' into JDK-8255019-sh-mark - Silent valgrind on potential memory leak - Merge branch 'master' into JDK-8255019-sh-mark - Removed ShenandoahConcurrentMark parameter from concurrent GC entry/op, etc. - Merge branch 'master' into JDK-8255019-sh-mark - Merge - Moved task queues to marking context - Merge - Merge branch 'master' into JDK-8255019-sh-mark - ...
and 16 more: https://git.openjdk.java.net/jdk/compare/fd5f6e2e...05faa443 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1009/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=18 Stats: 1957 lines in 22 files changed: 1072 ins; 747 del; 138 mod Patch: https://git.openjdk.java.net/jdk/pull/1009.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1009/head:pull/1009 PR: https://git.openjdk.java.net/jdk/pull/1009 From mgronlun at openjdk.java.net Wed Dec 9 21:02:23 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 9 Dec 2020 21:02:23 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v4] In-Reply-To: References: Message-ID: <3xq4tVLKVLdmZGawJ-rH6UPnGa756u0f3QK_YqdYwsc=.5928cc03-6c72-4b4d-8ec4-cc86a2f4f404@github.com> > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with 12 additional commits since the last revision: - initialization check - thread locals and detach and reattach - Tighter ThrottleUnit - JFC control elements - TLAB include - ThrottleUnit enum - remote tests - jfc control attributes - Sampling frequency adjustment for large objects - Treat large objects as tlabs for sampling purposes - ...
and 2 more: https://git.openjdk.java.net/jdk/compare/6918f0c8...4e986552 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/6918f0c8..4e986552 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=02-03 Stats: 530 lines in 13 files changed: 174 ins; 285 del; 71 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From github.com+26284057+mybloodtop at openjdk.java.net Wed Dec 9 21:02:24 2020 From: github.com+26284057+mybloodtop at openjdk.java.net (Mukilesh Sethu) Date: Wed, 9 Dec 2020 21:02:24 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) In-Reply-To: References: Message-ID: On Fri, 4 Dec 2020 15:25:23 GMT, Markus Gr?nlund wrote: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. > > Thanks > Markus This is great! It would be awesome to have this capability. One query on the way you are throttling the events (please correct me if I'm wrong here as I am new to this codebase), If I understood correctly, you are throttling the events at the time of committing, specifically part of `should_write` or `should_commit` in `jfrEvent.hpp`. If so, how would we be able to add throttling to events which might require it early on like `ObjectCountAfterGC` or `ObjectCount` events ? I think it makes perfect sense to have it part of commit for allocation events because most of the time consuming tasks like stack walking or storing stack trace in global table is done part of event commit and we will be able to throttle it. 
However, for events like `ObjectCountAfterGC` the time-consuming task is iterating the heap, which is unavoidable if we add throttling as part of commit. So, I am just curious how we can extend this solution to such events? ------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From egahlin at openjdk.java.net Wed Dec 9 21:02:33 2020 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Wed, 9 Dec 2020 21:02:33 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v4] In-Reply-To: <3xq4tVLKVLdmZGawJ-rH6UPnGa756u0f3QK_YqdYwsc=.5928cc03-6c72-4b4d-8ec4-cc86a2f4f404@github.com> References: <3xq4tVLKVLdmZGawJ-rH6UPnGa756u0f3QK_YqdYwsc=.5928cc03-6c72-4b4d-8ec4-cc86a2f4f404@github.com> Message-ID: On Wed, 9 Dec 2020 20:58:48 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to let JFR sample object allocations by default. >> >> A description is provided in the JIRA issue. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with 12 additional commits since the last revision: > > - initialization check > - thread locals and detach and reattach > - Tighter ThrottleUnit > - JFC control elements > - TLAB include > - ThrottleUnit enum > - remote tests > - jfc control attributes > - Sampling frequency adjustment for large objects > - Treat large objects as tlabs for sampling purposes > - ... and 2 more: https://git.openjdk.java.net/jdk/compare/6918f0c8...4e986552 src/jdk.jfr/share/classes/jdk/jfr/internal/EventControl.java line 79: > 77: private final PlatformEventType type; > 78: private final String idName; > 79: Why move Enabled to later? src/jdk.jfr/share/classes/jdk/jfr/internal/Utils.java line 229: > 227: // Expected input format is "x/y" or "x\y" where x is a non-negative long > 228: // and y is a time unit. Split the string at the delimiter.
> 229: private static String parseThrottleString(String s, boolean value) { I think we should only support one type of slash "/". src/jdk.jfr/share/classes/jdk/jfr/internal/Utils.java line 249: > 247: } > 248: > 249: private static TimeUnit timeUnit(String unit) { This could be done with an enum with a constructor. src/jdk.jfr/share/classes/jdk/jfr/internal/settings/ThrottleSetting.java line 65: > 63: @Override > 64: public String combine(Set values) { > 65: double max = OFF; Probably better to use a long (nanos) than a floating point number src/jdk.jfr/share/classes/jdk/jfr/internal/settings/ThrottleSetting.java line 88: > 86: @Override > 87: public void setValue(String s) { > 88: this.value = s; If parsing fails, I think things should be kept as is. At least that is what the SettingControl interface says. I looked at other setting controls and the implementation seems wrong there as well. src/jdk.jfr/share/conf/jfr/default.jfc line 618: > 616: > 617: > 618: true I think enabled should have the "memory-profiling" control. src/jdk.jfr/share/conf/jfr/profile.jfc line 608: > 606: > 607: > 608: false Need to sync this with . Perhaps a new choice is needed: "Object Allocation" ------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From mgronlun at openjdk.java.net Wed Dec 9 21:02:25 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 9 Dec 2020 21:02:25 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 22:37:20 GMT, Mukilesh Sethu wrote: >> Greetings, >> >> please help review this enhancement to let JFR sample object allocations by default. >> >> A description is provided in the JIRA issue. >> >> Thanks >> Markus > > This is great! It would be awesome to have this capability.
> > One query on the way you are throttling the events (please correct me if I'm wrong here as I am new to this codebase), > > If I understood correctly, you are throttling the events at the time of committing, specifically as part of `should_write` or `should_commit` in `jfrEvent.hpp`. If so, how would we be able to add throttling to events which might require it early on, like `ObjectCountAfterGC` or `ObjectCount` events? > > I think it makes perfect sense to have it part of commit for allocation events because most of the time-consuming tasks, like stack walking or storing the stack trace in the global table, are done as part of event commit and we will be able to throttle them. However, for events like `ObjectCountAfterGC` the time-consuming task is iterating the heap, which is unavoidable if we add throttling as part of commit. So, I am just curious how we can extend this solution to such events? Hi, @myBloodTop, thanks for your comment. For more special events (for example periodic events), it will be possible, although not yet supported, to use the throttling mechanism directly. For example:

    TRACE_REQUEST_FUNC(ObjectCount) {
      if (JfrEventThrottler::accept(JfrObjectCountEvent)) {
        VM_GC_SendObjectCountEvent op;
        VMThread::execute(&op);
      }
    }

Evaluating the throttle predicate as part of commit or should_commit() is an optimization to avoid having to take the clock twice, but for cases such as the above, if you don't pass a timestamp, a timestamp will be taken for you as part of the evaluation. Now, ObjectCount and ObjectCountAfterGC are also special in another respect, in that they are UNTIMED, meaning the events are timestamped outside of JFR. For other, non-UNTIMED events, it would be sufficient to only use the should_commit() tester, since the throttler evaluation is incorporated (post enable and threshold checks evaluations). For example:

    MyEvent event;
    ...
    if (event.should_commit()) { <<-- throttle evaluation
      event.set_field(...);
      event.commit();
    }

Thanks
Markus ------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From mgronlun at openjdk.java.net Wed Dec 9 21:02:41 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 9 Dec 2020 21:02:41 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v4] In-Reply-To: References: <3xq4tVLKVLdmZGawJ-rH6UPnGa756u0f3QK_YqdYwsc=.5928cc03-6c72-4b4d-8ec4-cc86a2f4f404@github.com> Message-ID: On Tue, 8 Dec 2020 16:31:37 GMT, Erik Gahlin wrote: >> Markus Grönlund has updated the pull request incrementally with 12 additional commits since the last revision: >> >> - initialization check >> - thread locals and detach and reattach >> - Tighter ThrottleUnit >> - JFC control elements >> - TLAB include >> - ThrottleUnit enum >> - remote tests >> - jfc control attributes >> - Sampling frequency adjustment for large objects >> - Treat large objects as tlabs for sampling purposes >> - ... and 2 more: https://git.openjdk.java.net/jdk/compare/6918f0c8...4e986552 > > src/jdk.jfr/share/classes/jdk/jfr/internal/EventControl.java line 79: > >> 77: private final PlatformEventType type; >> 78: private final String idName; >> 79: > > Why move Enabled to later? Configuring the throttler implementation in native is a bit involved. With the existing order, with enabled first, events are enabled before subsequent conditions are set up. For the throttler, it means as soon as the enabled setting is flipped, the throttler gets traffic, but it's not yet configured to accept it. It has a default, which is off, meaning it accepts all events until the subsequent call to configure the throttler, which can take some time, because the setup is non-trivial. It was found that rates are not respected because of the throttler not having been set up yet when it starts to get traffic.
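The admission decision the throttler makes — accept events until a rate budget for the current time window is spent, then drop until the window rolls over — can be sketched in isolation. The following is a simplified illustration only, not the HotSpot JfrEventThrottler (which also adapts its sampling across windows); all names here are invented for the example:

```java
// A minimal sketch of a windowed event throttler. Events are admitted
// until the current window's quota is spent, then dropped until the
// window rolls over. Invented names; not the HotSpot implementation.
final class WindowedThrottler {
    private final long quotaPerWindow; // max events admitted per window
    private final long windowNanos;    // length of one sampling window
    private long windowStart;          // start timestamp of current window
    private long admittedInWindow;     // events admitted so far in window

    WindowedThrottler(long quotaPerWindow, long windowNanos, long startNanos) {
        this.quotaPerWindow = quotaPerWindow;
        this.windowNanos = windowNanos;
        this.windowStart = startNanos;
    }

    // The analogue of evaluating the throttle predicate in should_commit():
    // the caller passes a timestamp if it already has one...
    synchronized boolean accept(long timestampNanos) {
        if (timestampNanos - windowStart >= windowNanos) {
            windowStart = timestampNanos; // roll over to a new window
            admittedInWindow = 0;
        }
        if (admittedInWindow < quotaPerWindow) {
            admittedInWindow++;
            return true;
        }
        return false;
    }

    // ...or lets the throttler take the clock itself, mirroring the
    // "a timestamp will be taken for you" behavior described above.
    synchronized boolean accept() {
        return accept(System.nanoTime());
    }
}
```

With a quota of two per one-second window, the first two events in a window are admitted, the third is dropped, and a timestamp past the window boundary starts a fresh budget.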
> src/jdk.jfr/share/classes/jdk/jfr/internal/Utils.java line 229: > >> 227: // Expected input format is "x/y" or "x\y" where x is a non-negative long >> 228: // and y is a time unit. Split the string at the delimiter. >> 229: private static String parseThrottleString(String s, boolean value) { > > I think we should only support one type of slash "/". Fixed. > src/jdk.jfr/share/classes/jdk/jfr/internal/Utils.java line 249: > >> 247: } >> 248: >> 249: private static TimeUnit timeUnit(String unit) { > > This could be done with an enum with a constructor. Fixed. > src/jdk.jfr/share/classes/jdk/jfr/internal/settings/ThrottleSetting.java line 65: > >> 63: @Override >> 64: public String combine(Set values) { >> 65: double max = OFF; > > Probably better to use a long (nanos) than a floating point number Fixed. > src/jdk.jfr/share/classes/jdk/jfr/internal/settings/ThrottleSetting.java line 88: > >> 86: @Override >> 87: public void setValue(String s) { >> 88: this.value = s; > > If parsing fails, I think things should be kept as is. At least that is what the SettingControl interface says. > > I looked at other setting controls and the implementation seems wrong there as well. Fixed. > src/jdk.jfr/share/conf/jfr/default.jfc line 618: > >> 616: >> 617: >> 618: true > > I think enabled should have the "memory-profiling" control. Fixed. > src/jdk.jfr/share/conf/jfr/profile.jfc line 608: > >> 606: >> 607: >> 608: false > > Need to sync this with . > > Perhaps a new choice is needed: "Object Allocation" New control elements and attributes introduced.
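The "enum with a constructor" suggestion for parsing "x/y" throttle strings (e.g. "100/s" or "10/ms") can be sketched as follows. This is a hypothetical illustration only; the class, method, and unit names are invented and do not mirror the actual jdk.jfr.internal sources:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: parse throttle settings of the form "x/y" using an
// enum whose constructor binds a textual suffix to a TimeUnit.
final class ThrottleParser {

    enum ThrottleUnit {
        NANOS("ns", TimeUnit.NANOSECONDS),
        MICROS("us", TimeUnit.MICROSECONDS),
        MILLIS("ms", TimeUnit.MILLISECONDS),
        SECONDS("s", TimeUnit.SECONDS),
        MINUTES("m", TimeUnit.MINUTES),
        HOURS("h", TimeUnit.HOURS),
        DAYS("d", TimeUnit.DAYS);

        final String text;
        final TimeUnit unit;

        // The enum constructor carries the mapping, replacing a
        // hand-written string-to-unit switch.
        ThrottleUnit(String text, TimeUnit unit) {
            this.text = text;
            this.unit = unit;
        }

        static ThrottleUnit fromText(String s) {
            for (ThrottleUnit u : values()) {
                if (u.text.equals(s)) {
                    return u;
                }
            }
            throw new IllegalArgumentException("unknown time unit: " + s);
        }
    }

    // Parses "x/y" into an events-per-second rate. Only "/" is accepted
    // as the delimiter, matching the review feedback above.
    static double eventsPerSecond(String setting) {
        int slash = setting.indexOf('/');
        if (slash < 1 || slash == setting.length() - 1) {
            throw new IllegalArgumentException("expected \"x/y\": " + setting);
        }
        long amount = Long.parseLong(setting.substring(0, slash).trim());
        if (amount < 0) {
            throw new IllegalArgumentException("negative rate: " + setting);
        }
        ThrottleUnit u = ThrottleUnit.fromText(setting.substring(slash + 1).trim());
        return amount * 1_000_000_000.0 / u.unit.toNanos(1);
    }
}
```

For instance, "100/s" normalizes to 100 events per second and "10/ms" to 10000; an unknown unit or missing delimiter is rejected rather than silently accepted.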
------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From github.com+26284057+mybloodtop at openjdk.java.net Wed Dec 9 22:23:37 2020 From: github.com+26284057+mybloodtop at openjdk.java.net (Mukilesh Sethu) Date: Wed, 9 Dec 2020 22:23:37 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) In-Reply-To: References: Message-ID: On Mon, 7 Dec 2020 22:37:20 GMT, Mukilesh Sethu wrote: >> Greetings, >> >> please help review this enhancement to let JFR sample object allocations by default. >> >> A description is provided in the JIRA issue. >> >> Thanks >> Markus > > This is great! It would be awesome to have this capability. > > One query on the way you are throttling the events (please correct me if I'm wrong here as I am new to this codebase), > > If I understood correctly, you are throttling the events at the time of committing, specifically part of `should_write` or `should_commit` in `jfrEvent.hpp`. If so, how would we be able to add throttling to events which might require it early on like `ObjectCountAfterGC` or `ObjectCount` events ? > > I think it makes perfect sense to have it part of commit for allocation events because most of the time consuming tasks like stack walking or storing stack trace in global table is done part of event commit and we will be able to throttle it. However, for events like `ObjectCountAfterGC` the time consuming task is iterating the heap which is unavoidable if we add throttling part of commit. So, I am just curious how can we extend this solution to such events ? > Hi, @myBloodTop, thanks for your comment. > > For more special events (for example periodic events), it will be possible, although not yet supported, to use the throttling mechanism directly. 
> For example:
>
> TRACE_REQUEST_FUNC(ObjectCount) {
>   if (JfrEventThrottler::accept(JfrObjectCountEvent)) {
>     VM_GC_SendObjectCountEvent op;
>     VMThread::execute(&op);
>   }
> }
>
> Evaluating the throttle predicate as part of commit or should_commit() is an optimization to avoid having to take the clock twice, but for cases such as the above, if you don't pass a timestamp, a timestamp will be taken for you as part of the evaluation.
>
> Now, ObjectCount and ObjectCountAfterGC are also special in another respect, in that they are UNTIMED, meaning the events are timestamped outside of JFR. For other, non-UNTIMED events, it would be sufficient to only use the should_commit() tester, since the throttler evaluation is incorporated (post enable and threshold checks evaluations). For example:
>
> MyEvent event;
> ...
> if (event.should_commit()) { <<-- throttle evaluation
>   event.set_field(...);
>   event.commit();
> }
>
> Thanks
> Markus

Thank you for the clarification :) ------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From iklam at openjdk.java.net Wed Dec 9 23:05:36 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Dec 2020 23:05:36 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v2] In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:19:46 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification.
>> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. >> >> The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. >> >> There are some points I would like to bring up in advance in this change that may be contentious: >> - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. >> - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not do so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again.
>> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. >> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From kbarrett at openjdk.java.net Thu Dec 10 08:37:35 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 08:37:35 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:28:44 GMT, Thomas Schatzl wrote: >> Please review this change that eliminates the use of Reference.isEnqueued by >> tests. There were three tests using it: >> >> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java >> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java >> jdk/java/lang/ref/ReferenceEnqueue.java >> >> In each of them, some combination of using Reference.refersTo and >> ReferenceQueue.remove with a timeout were used to eliminate the use of >> Reference.isEnqueued. >> >> I also cleaned up ReferencesGC.java in various respects. It contained >> several bits of dead code, and the failure checks were made stronger. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) ran all three tests with each GC (including Shenandoah). > > Changes requested by tschatzl (Reviewer). > [pre-existing] The WeakReferenceGC.java description at the top describes that the test calls System.gc() explicitly to trigger garbage collections at the end. It does not.
Maybe this could be weasel-worded around like in the other cases in that text. There are a lot of things much more wrong with that comment. Doing more GCs doesn't cause more enqueues to happen. The "non-deterministic" enqueuing is just a race. The GC adds references to the pending list. The reference processing thread transfers references from the pending list to their associated queue (if any). The test code is racing with that. The change to use ReferenceQueue.remove with a timeout eliminates all that, and one GC should be enough. Addressing all that would be a substantial rewrite of this test though. Mind if I defer that to a new RFE? ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From kbarrett at openjdk.java.net Thu Dec 10 08:46:36 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 08:46:36 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: On Wed, 9 Dec 2020 13:26:04 GMT, Thomas Schatzl wrote: >> Please review this change that eliminates the use of Reference.isEnqueued by >> tests. There were three tests using it: >> >> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java >> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java >> jdk/java/lang/ref/ReferenceEnqueue.java >> >> In each of them, some combination of using Reference.refersTo and >> ReferenceQueue.remove with a timeout were used to eliminate the use of >> Reference.isEnqueued. >> >> I also cleaned up ReferencesGC.java in various respects. It contained >> several bits of dead code, and the failure checks were made stronger. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) ran all three tests with each GC (including Shenandoah). > > test/hotspot/jtreg/vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java line 129: > >> 127: } >> 128: >> 129: int REMOVE = (int) (RANGE * RATIO); > > These two constants could be factored out as static finals to match the casing.
I'm making REMOVE and RETAIN statics, near RANGE and RATIO. (Meant to do that before, but forgot.) They can't be final though, because RANGE and RATIO aren't final, and can be set from command line arguments. So they'll get initialized in parseArgs. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From kbarrett at openjdk.java.net Thu Dec 10 09:01:54 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 09:01:54 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2] In-Reply-To: References: Message-ID: > Please review this change that eliminates the use of Reference.isEnqueued by > tests. There were three tests using it: > > vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java > vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java > jdk/java/lang/ref/ReferenceEnqueue.java > > In each of them, some combination of using Reference.refersTo and > ReferenceQueue.remove with a timeout were used to eliminate the use of > Reference.isEnqueued. > > I also cleaned up ReferencesGC.java in various respects. It contained > several bits of dead code, and the failure checks were made stronger. > > Testing: > mach5 tier1 > Locally (linux-x64) ran all three tests with each GC (including Shenandoah). 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: move REMOVE and RETAIN decls and init ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1691/files - new: https://git.openjdk.java.net/jdk/pull/1691/files/e87206a8..01710567 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1691&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1691&range=00-01 Stats: 6 lines in 1 file changed: 4 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1691.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1691/head:pull/1691 PR: https://git.openjdk.java.net/jdk/pull/1691 From kbarrett at openjdk.java.net Thu Dec 10 09:01:54 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 09:01:54 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2] In-Reply-To: References: Message-ID: <4JvCs53pm1rUddMiklt8Q5QgJXrloJDNblRbBDYDW6U=.5041142a-b99e-4e91-9871-5f996ca761a1@github.com> On Wed, 9 Dec 2020 13:59:09 GMT, Thomas Schatzl wrote: >> test/jdk/java/lang/ref/ReferenceEnqueue.java line 58: >> >>> 56: for (int i = 0; i < iterations; i++) { >>> 57: System.gc(); >>> 58: enqueued = (queue.remove(100) == ref); >> >> The code does not catch `InterruptedException` like it does in the other files. > > I understand that the test code previously just forwarded the `InterruptedException` if it happened in the `Thread.sleep()` call too. So this may only be an existing issue and please ignore this comment. > Not catching `InterruptedException` here only seems to be a cause for unnecessary failure. Then again, it probably does not happen a lot. Nothing in the test calls Thread.interrupt(), so there isn't a risk of failure due to not handling that exception in some "interesting" way. But InterruptedException must be "handled" somehow, because it's a checked exception.
That's already dealt with by the run() method declaring that it throws that type, and main declaring that it throws Exception. The other tests modified in this change don't take that approach (just let it propagate out through main), instead wrapping the interruptable calls in try/catch, though again just to satisfy the requirement that a checked exception must be statically verified to be handled, even though there aren't going to be any thrown. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From kbarrett at openjdk.java.net Thu Dec 10 09:25:37 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 09:25:37 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: <1hsUvvQtCw5vL-8n-HOJrBV-pByIqX0QwBohvou9f_U=.44c5f772-688e-4a5d-b2ed-3739ee826c3b@github.com> On Wed, 9 Dec 2020 18:22:27 GMT, Albert Mingkun Yang wrote: >> Please review this reimplementation of WeakProcessorPhase. It is changed to >> a scoped enum at namespace scope, and uses the recently added EnumIterator >> facility to provide iteration, rather than a bespoke iterator class. >> >> This is a step toward eliminating it entirely. I've split it out into a >> separate PR to make the review of the follow-up work a bit easier. >> >> As part of this the file weakProcessorPhases.hpp is renamed to >> weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a >> rename and (majorly) edit, instead treating it as a remove and add a new >> file. 
>> >> Testing: mach5 tier1 > > src/hotspot/share/gc/shared/weakProcessor.inline.hpp line 88: > >> 86: CountingClosure cl(is_alive, keep_alive); >> 87: WeakProcessorPhaseTimeTracker pt(_phase_times, phase, worker_id); >> 88: int state_index = checked_cast(phase_range.index(phase)); > > I feel `EnumRange().index(phase)` is better than `phase_range.index(phase)`, since we want to know the "global" index for this phase, not the "local" index within this particular range instance. The two are identical currently. However, if the for-loop is iterating over a subset of all `WeakProcessorPhase`, the two cases will give different results, I believe. I agree that this isn't entirely ideal as-is. If someone were to change phase_range to be a subset of the full range, then this line would need to be examined and probably modified. However, that's not going to happen. This PR is an intermediate step toward eliminating WeakProcessorPhase entirely, which will address this issue (among others). I have that followup ready for review once this one is integrated and I've rebased and retested. ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From tschatzl at openjdk.java.net Thu Dec 10 09:27:35 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 09:27:35 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2] In-Reply-To: <4JvCs53pm1rUddMiklt8Q5QgJXrloJDNblRbBDYDW6U=.5041142a-b99e-4e91-9871-5f996ca761a1@github.com> References: <4JvCs53pm1rUddMiklt8Q5QgJXrloJDNblRbBDYDW6U=.5041142a-b99e-4e91-9871-5f996ca761a1@github.com> Message-ID: On Thu, 10 Dec 2020 08:56:25 GMT, Kim Barrett wrote: >> I understand that the test code previously just forwarded the `InterruptedException` if it happened in the `Thread.sleep()` call too. So this may only be an exiting issue and please ignore this comment. >> Not catching `InterruptedException` here only seems to be a cause for unnecessary failure. Then again, it probably does not happen a lot. 
> > Nothing in the test calls Thread.interrupt(), so there isn't a risk of > failure due to not handling that exception in some "interesting" way. But > InterruptedException must be "handled" somehow, because it's a checked > exception. That's already dealt with by the run() method declaring that it > throws that type, and main declaring that it throws Exception. The other > tests modified in this change don't take that approach (just let it > propagate out through main), instead wrapping the interruptable calls in > try/catch, though again just to satisfy the requirement that a checked > exception must be statically verified to be handled, even though there > aren't going to be any thrown. Okay. ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From tschatzl at openjdk.java.net Thu Dec 10 09:27:33 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 09:27:33 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2] In-Reply-To: References: Message-ID: On Thu, 10 Dec 2020 09:01:54 GMT, Kim Barrett wrote: >> Please review this change that eliminates the use of Reference.isEnqueued by >> tests. There were three tests using it: >> >> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java >> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java >> jdk/java/lang/ref/ReferenceEnqueue.java >> >> In each of them, some combination of using Reference.refersTo and >> ReferenceQueue.remove with a timeout were used to eliminate the use of >> Reference.isEnqueued. >> >> I also cleaned up ReferencesGC.java in various respects. It contained >> several bits of dead code, and the failure checks were made stronger. >> >> Testing: >> mach5 tier1 >> Locally (linux-x64) ran all three tests with each GC (including Shenandoah). 
> > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > move REMOVE and RETAIN decls and init Also good with deferring the changes to the comments and the move of the statics. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1691 From ayang at openjdk.java.net Thu Dec 10 09:46:36 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 10 Dec 2020 09:46:36 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: <1hsUvvQtCw5vL-8n-HOJrBV-pByIqX0QwBohvou9f_U=.44c5f772-688e-4a5d-b2ed-3739ee826c3b@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> <1hsUvvQtCw5vL-8n-HOJrBV-pByIqX0QwBohvou9f_U=.44c5f772-688e-4a5d-b2ed-3739ee826c3b@github.com> Message-ID: On Thu, 10 Dec 2020 09:23:07 GMT, Kim Barrett wrote: >> src/hotspot/share/gc/shared/weakProcessor.inline.hpp line 88: >> >>> 86: CountingClosure cl(is_alive, keep_alive); >>> 87: WeakProcessorPhaseTimeTracker pt(_phase_times, phase, worker_id); >>> 88: int state_index = checked_cast(phase_range.index(phase)); >> >> I feel `EnumRange().index(phase)` is better than `phase_range.index(phase)`, since we want to know the "global" index for this phase, not the "local" index within this particular range instance. The two are identical currently. However, if the for-loop is iterating over a subset of all `WeakProcessorPhase`, the two cases will give different results, I believe. > > I agree that this isn't entirely ideal as-is. If someone were to change > phase_range to be a subset of the full range, then this line would need to > be examined and probably modified. However, that's not going to happen. This > PR is an intermediate step toward eliminating WeakProcessorPhase entirely, > which will address this issue (among others). I have that followup ready for > review once this one is integrated and I've rebased and retested. 
I see; thanks for the explanation. ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From mgronlun at openjdk.java.net Thu Dec 10 10:04:24 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 10 Dec 2020 10:04:24 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v5] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. > > Thanks > Markus Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: remove override declaration ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/4e986552..fad24016 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From mgronlun at openjdk.java.net Thu Dec 10 10:31:28 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 10 Dec 2020 10:31:28 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v6] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. 
> > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: rename variable name ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/fad24016..1d81605f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From tschatzl at openjdk.java.net Thu Dec 10 10:34:49 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 10:34:49 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v3] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. > > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing.
> > The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. > > There are some points I would like to bring up in advance in this change that may be contentious: > - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. > - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not do so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again.
> > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: kbarrett review2, comment updates ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1661/files - new: https://git.openjdk.java.net/jdk/pull/1661/files/213fbeed..5805fabe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1661&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1661&range=01-02 Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/1661.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1661/head:pull/1661 PR: https://git.openjdk.java.net/jdk/pull/1661 From kbarrett at openjdk.java.net Thu Dec 10 10:37:54 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 10:37:54 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v3] In-Reply-To: References: Message-ID: <-S0WCahnGx8E1KkBszTUFJP0xiAo6n9ffFGiSGeHPnw=.a46180c0-ff19-4cbf-aee9-117f7a9ea717@github.com> > Please review this change that eliminates the use of Reference.isEnqueued by > tests. There were three tests using it: > > vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java > vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java > jdk/java/lang/ref/ReferenceEnqueue.java > > In each of them, some combination of using Reference.refersTo and > ReferenceQueue.remove with a timeout were used to eliminate the use of > Reference.isEnqueued. > > I also cleaned up ReferencesGC.java in various respects. It contained > several bits of dead code, and the failure checks were made stronger. > > Testing: > mach5 tier1 > Locally (linux-x64) ran all three tests with each GC (including Shenandoah). Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into no_isenqueued - move REMOVE and RETAIN decls and init - update WeakReferenceGC test - update ReferenceQueue test - update ReferencesGC test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1691/files - new: https://git.openjdk.java.net/jdk/pull/1691/files/01710567..d5355342 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1691&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1691&range=01-02 Stats: 7952 lines in 328 files changed: 5091 ins; 1646 del; 1215 mod Patch: https://git.openjdk.java.net/jdk/pull/1691.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1691/head:pull/1691 PR: https://git.openjdk.java.net/jdk/pull/1691 From kbarrett at openjdk.java.net Thu Dec 10 10:37:55 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 10:37:55 GMT Subject: Integrated: 8257876: Avoid Reference.isEnqueued in tests In-Reply-To: References: Message-ID: On Tue, 8 Dec 2020 09:52:51 GMT, Kim Barrett wrote: > Please review this change that eliminates the use of Reference.isEnqueued by > tests. There were three tests using it: > > vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java > vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java > jdk/java/lang/ref/ReferenceEnqueue.java > > In each of them, some combination of using Reference.refersTo and > ReferenceQueue.remove with a timeout were used to eliminate the use of > Reference.isEnqueued. > > I also cleaned up ReferencesGC.java in various respects. It contained > several bits of dead code, and the failure checks were made stronger. > > Testing: > mach5 tier1 > Locally (linux-x64) ran all three tests with each GC (including Shenandoah). This pull request has now been integrated. 
Changeset: db5da961 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/db5da961 Stats: 104 lines in 3 files changed: 23 ins; 39 del; 42 mod 8257876: Avoid Reference.isEnqueued in tests Reviewed-by: mchung, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From tschatzl at openjdk.java.net Thu Dec 10 10:50:37 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 10:50:37 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v2] In-Reply-To: References: Message-ID: <6KzerbJs7UifQj02OfcYgJVcvwl6ttyOo_WerZWykso=.b4b3c86f-6548-47d2-9a06-4fd8a3e44cbb@github.com> On Wed, 9 Dec 2020 23:02:54 GMT, Ioi Lam wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> kbarrett review > > Marked as reviewed by iklam (Reviewer). I fixed the mentioned comments but would like to defer further cleanup of the classes, particularly those `VM_GC_Operation`s that do not actually participate in the skipping protocol to [JDK-8258029](https://bugs.openjdk.java.net/browse/JDK-8258029) I filed just now. ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From mgronlun at openjdk.java.net Thu Dec 10 11:20:24 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 10 Dec 2020 11:20:24 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v7] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. 
> > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: no regexp split ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1624/files - new: https://git.openjdk.java.net/jdk/pull/1624/files/1d81605f..a1076ac4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1624&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1624.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1624/head:pull/1624 PR: https://git.openjdk.java.net/jdk/pull/1624 From egahlin at openjdk.java.net Thu Dec 10 11:25:37 2020 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Thu, 10 Dec 2020 11:25:37 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v7] In-Reply-To: References: Message-ID: <8iMfUGTSz9jtKh8WCXELcMcUiapHM60Hlxx0N93scmw=.69ce1dbc-f9ea-4d39-a734-0fd1baa20ee3@github.com> On Thu, 10 Dec 2020 11:20:24 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to let JFR sample object allocations by default. >> >> A description is provided in the JIRA issue. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > no regexp split Marked as reviewed by egahlin (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From tschatzl at openjdk.java.net Thu Dec 10 11:35:34 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 11:35:34 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: On Fri, 4 Dec 2020 09:34:36 GMT, Kim Barrett wrote: > Please review this reimplementation of WeakProcessorPhase. It is changed to > a scoped enum at namespace scope, and uses the recently added EnumIterator > facility to provide iteration, rather than a bespoke iterator class. > > This is a step toward eliminating it entirely. I've split it out into a > separate PR to make the review of the follow-up work a bit easier. > > As part of this the file weakProcessorPhases.hpp is renamed to > weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a > rename and (majorly) edit, instead treating it as a remove and add a new > file. > > Testing: mach5 tier1 Marked as reviewed by tschatzl (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From kbarrett at openjdk.java.net Thu Dec 10 11:44:35 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 11:44:35 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v3] In-Reply-To: References: Message-ID: On Thu, 10 Dec 2020 10:34:49 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
>> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. >> >> The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. >> >> There are some points I would like to bring up in advance in this change that may be contentious: >> - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. >> - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not do so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again.
>> - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. >> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review2, comment updates Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From jbachorik at openjdk.java.net Thu Dec 10 11:57:37 2020 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Thu, 10 Dec 2020 11:57:37 GMT Subject: RFR: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) [v7] In-Reply-To: References: Message-ID: On Thu, 10 Dec 2020 11:20:24 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to let JFR sample object allocations by default. >> >> A description is provided in the JIRA issue. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > no regexp split Marked as reviewed by jbachorik (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From kbarrett at openjdk.java.net Thu Dec 10 12:09:40 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Dec 2020 12:09:40 GMT Subject: RFR: 8257876: Avoid Reference.isEnqueued in tests [v3] In-Reply-To: <2TbNnDlF1nuEFWLddNG3wdj5EL0gg-1hzGwe2-emoQE=.e950f0f7-6be0-426d-8634-bc3c3175030a@github.com> References: <2TbNnDlF1nuEFWLddNG3wdj5EL0gg-1hzGwe2-emoQE=.e950f0f7-6be0-426d-8634-bc3c3175030a@github.com> Message-ID: On Tue, 8 Dec 2020 17:30:11 GMT, Mandy Chung wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into no_isenqueued >> - move REMOVE and RETAIN decls and init >> - update WeakReferenceGC test >> - update ReferenceQueue test >> - update ReferencesGC test > > Marked as reviewed by mchung (Reviewer). Thanks for reviews @mlchung and @tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/1691 From mgronlun at openjdk.java.net Thu Dec 10 12:37:40 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Thu, 10 Dec 2020 12:37:40 GMT Subject: Integrated: 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) In-Reply-To: References: Message-ID: <7M4O5mf_KYwZJSftUk-3-HerZJtLIvh-MiZg5PK21Ug=.2daa50be-1a07-400c-b8ca-c8b972337c81@github.com> On Fri, 4 Dec 2020 15:25:23 GMT, Markus Grönlund wrote: > Greetings, > > please help review this enhancement to let JFR sample object allocations by default. > > A description is provided in the JIRA issue. > > Thanks > Markus This pull request has now been integrated.
Changeset: 502a5241 Author: Markus Grönlund URL: https://git.openjdk.java.net/jdk/commit/502a5241 Stats: 2488 lines in 41 files changed: 2210 ins; 239 del; 39 mod 8257602: Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default) Co-authored-by: Jaroslav Bachorik Reviewed-by: egahlin, jbachorik ------------- PR: https://git.openjdk.java.net/jdk/pull/1624 From tschatzl at openjdk.java.net Thu Dec 10 13:37:45 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Dec 2020 13:37:45 GMT Subject: RFR: 8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures In-Reply-To: References: Message-ID: On Sun, 6 Dec 2020 17:39:54 GMT, Charlie Gracie wrote: > Bursts of short lived Humongous object allocations can cause GCs to be initiated with 0 free regions. When these GCs happen they take significantly longer to complete. No objects are evacuated so there is a large amount of time spent in reversing self forwarded pointers and the only memory recovered is from the short lived humongous objects. My proposal is to add a check to the slow allocation path which will force a GC to happen if the number of free regions drops below the amount that would be required to complete the GC if it happened at that moment. The threshold will be based on the survival rates from Eden and survivor spaces along with the space required for Tenure space evacuations. > > The goal is to resolve the issue with bursts of short lived humongous objects without impacting other workloads negatively. I would appreciate reviews and any feedback that you might have. Thanks.
> > Here are the links to the threads on the mailing list where I initially discussed the issue and my idea to resolve it: > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-November/032189.html > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-December/032677.html This is only a first pass over the changes; I need to think a bit more about the estimation, and whether there are any issues with this new heuristic, i.e. that we could end up doing lots of gcs that effectively achieve nothing at some point and whether this is preferable to running into an evacuation failure. test/hotspot/jtreg/gc/g1/TestGCLogMessages.java line 316: > 314: private static byte[] garbage; > 315: private static byte[] largeObject; > 316: private static Object[] holder = new Object[800]; // Must be larger than G1EvacuationFailureALotCount Just curious about these changes: it is not immediately obvious to me why they are necessary as the mechanism to force evacuation failure (G1EvacuationFailureALotCount et al) should be independent of these changes. And the 17MB (for the humongous object) + 16MB of garbage should be enough for at least one gc; but maybe these changes trigger an early gc? src/hotspot/share/gc/g1/g1VMOperations.hpp line 71: > 69: class VM_G1CollectForAllocation : public VM_CollectForAllocation { > 70: bool _gc_succeeded; > 71: bool _force_gc; Not completely happy about using an extra flag for these forced GCs here; also this makes them indistinguishable from other GCs in the logs as far as I can see. What do you think about adding GCCause(s) instead to automatically make them stand out in the logs? Or at least make sure that they stand out in the logs for debugging issues.
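The suggestion above, using distinct GCCause values rather than a bool flag, can be illustrated with a small standalone sketch. The enum and log strings below are hypothetical stand-ins, not actual HotSpot identifiers: the point is only that a dedicated cause value flows through to the log line, while a bool flag leaves the two kinds of pause indistinguishable.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a GCCause-style enum: each reason for a pause
// gets its own enumerator, so the cause string printed in the log names it.
enum class Cause { IncCollectionPause, HumongousAllocation, PreemptiveCollect };

// Maps a cause to the string that would appear in a log line.
std::string cause_string(Cause c) {
  switch (c) {
    case Cause::IncCollectionPause:  return "Evacuation Pause";
    case Cause::HumongousAllocation: return "Humongous Allocation";
    case Cause::PreemptiveCollect:   return "Preemptive Collection";
  }
  return "Unknown";
}

// With only a bool flag, both kinds of pause log the same cause string:
std::string cause_string_with_flag(bool force_gc) {
  (void)force_gc; // the flag never reaches the log line
  return "Evacuation Pause";
}
```

With the enum, a log reader can tell a threshold-triggered pause apart from an ordinary evacuation pause; with the flag, both print identically.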
src/hotspot/share/gc/g1/g1VMOperations.cpp line 125: > 123: G1CollectedHeap* g1h = G1CollectedHeap::heap(); > 124: > 125: if (_word_size > 0 && !_force_gc) { Why not start the GC with a requested word_size == 0 here and remove the need for the flag (though there is still a need to somehow record the purpose of this gc)? In some way it makes sense as space is not really exhausted. src/hotspot/share/gc/g1/g1Policy.hpp line 102: > 100: > 101: size_t _predicted_survival_bytes_from_survivor; > 102: size_t _predicted_survival_bytes_from_old; As a non-English native speaker I think "survival_bytes" is strange as "survival" isn't an adjective. Maybe "surviving_bytes" sounds better? The code in `calculate_required_regions_for_next_collect` uses "survivor" btw. I would still prefer "surviving" in some way as it differs from the "survivor" in "survivor regions", but let's keep nomenclature at least consistent. Also please add a comment about what these are used for. src/hotspot/share/gc/g1/g1Policy.hpp line 368: > 366: uint& num_optional_regions); > 367: > 368: bool can_mutator_consume_free_regions(uint region_count); Comments missing. src/hotspot/share/gc/g1/g1Policy.hpp line 369: > 367: > 368: bool can_mutator_consume_free_regions(uint region_count); > 369: void calculate_required_regions_for_next_collect(); "... for_next_collection" sounds better? This method name seems self-explanatory, but it does not calculate the required regions but updates the two new members which contain values stored *in bytes*. src/hotspot/share/gc/g1/g1Policy.cpp line 1452: > 1450: bool G1Policy::can_mutator_consume_free_regions(uint alloc_region_count) { > 1451: uint eden_count = _g1h->eden_regions_count(); > 1452: if (eden_count < 1) { I'd prefer "eden_count == 0" here since it is an uint anyway. src/hotspot/share/gc/g1/g1Policy.cpp line 1459: > 1457: // adjust the total survival bytes by the target amount of wasted space in PLABs.
> 1458: // should old bytes be adjusted and turned into a region count on its own? > 1459: size_t const adjusted_survival_bytes_bytes = (size_t)(total_predicted_survival_bytes * (100 + TargetPLABWastePct) / 100.0); Answering the question: yes, both young gen and old gen survivors must be treated separately and rounded up to regions separately. src/hotspot/share/gc/g1/g1Policy.cpp line 1457: > 1455: size_t const predicted_survival_bytes_from_eden = _eden_surv_rate_group->accum_surv_rate_pred(eden_count) * HeapRegion::GrainBytes; > 1456: size_t const total_predicted_survival_bytes = predicted_survival_bytes_from_eden + _predicted_survival_bytes_from_survivor + _predicted_survival_bytes_from_old; > 1457: // adjust the total survival bytes by the target amount of wasted space in PLABs. This adjustment for wasted space (the `(100 + TargetPLABWastePct) / 100` code) in PLABs is now done twice. Please extract a (maybe static) helper function. src/hotspot/share/gc/g1/g1Policy.cpp line 1479: > 1477: > 1478: void G1Policy::calculate_required_regions_for_next_collect() { > 1479: // calculate the survival bytes from survivor in the next GC Comments that are sentences should start upper case and end with a full stop. src/hotspot/share/gc/g1/g1Policy.cpp line 1490: > 1488: _predicted_survival_bytes_from_survivor = survivor_bytes; > 1489: > 1490: // calculate the survival bytes from old in the next GC Same problem as above with the comment style. :) Please add a sentence that we use the minimum old gen collection set as a conservative estimate for the number of regions to take for this calculation. src/hotspot/share/gc/g1/g1Policy.cpp line 1496: > 1494: uint predicted_old_region_count = calc_min_old_cset_length(); > 1495: uint num_remaining = candidates->num_remaining(); > 1496: uint iterate_count = num_remaining < predicted_old_region_count ?
num_remaining : predicted_old_region_count; I kind of prefer using the `MIN2()` expression here instead of the "if" as it is what you want anyway (it's the same of course, but the use of MIN2 directly indicates that we are taking the minimum). Not sure if the two locals `predicted_old_region_count` and `num_remaining` should be kept then. They do not add too much (and should be `const`). I'm also not sure if the second part of the condition of the outer if (i.e. `&& !candidates->is_empty()`) is useful. `candidates->num_remaining()` should be zero in this case. src/hotspot/share/gc/g1/g1Policy.cpp line 1494: > 1492: G1CollectionSetCandidates *candidates = _collection_set->candidates(); > 1493: if ((candidates != NULL) && !candidates->is_empty()) { > 1494: uint predicted_old_region_count = calc_min_old_cset_length(); Or add the comment that we intentionally use `calc_min_old_cset_length` as an estimate for the number of old regions for this calculation here. src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 765: > 763: bool* succeeded, > 764: GCCause::Cause gc_cause, > 765: bool force_gc); If that parameter is kept, please add documentation. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 423: > 421: for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) { > 422: bool should_try_gc; > 423: bool force_gc = false; `force_gc` and `should_try_gc` seem to overlap a bit here. At least the naming isn't perfect because we may not do a gc even if `force_gc` is true, which I'd kind of expect. I do not have a good new name right now to fix this. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 428: > 426: { > 427: MutexLocker x(Heap_lock); > 428: if (policy()->can_mutator_consume_free_regions(1)) { I would prefer if `force_gc` (or whatever name it will have) would be set here unconditionally as the `else` is pretty far away here. I.e.
force_gc = policy()->can_mutator_consume_free_regions(1);
if (force_gc) { // needing to use the name "force_gc" here shows that the name is wrong...
  ... try allocation ...
  ... check if we should expand young gen beyond regular size due to GCLocker
}
The other issue I have with using `can_mutator_consume_free_regions()` here is that there is already a very similar `G1Policy::should_allocate_mutator_region`; and anyway, the `attempt_allocation_locked` call may actually succeed without requiring a new region (actually, it is not uncommon that another thread got a free region while trying to take the `Heap_lock`). I think a better place for `can_mutator_consume_free_regions()` is in `G1Policy::should_allocate_mutator_region()` for this case. `attempt_allocation_locked` however does not return a reason for why allocation failed (at the moment). Maybe it is better to let it return a tuple with result and reason (or a second "out" parameter)? (I haven't tried what this would look like, it seems worth trying and better than the current way of handling this). This one could be used in the following code. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 459: > 457: bool succeeded; > 458: result = do_collection_pause(word_size, gc_count_before, &succeeded, > 459: GCCause::_g1_inc_collection_pause, force_gc); I really think it is better to add a new GCCause for this instead of the additional parameter. We are not doing a GC because we are out of space because of the allocation of `word_size`, but a "pre-emptive" GC due to the allocation of `word_size`. This warrants a new GC cause imho.
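The "return a tuple with result and reason" idea above can be sketched in isolation. All names below are illustrative assumptions, not the real `attempt_allocation_locked` interface: an allocation attempt reports both the result and, on failure, why it failed, so the caller can tell "no space at all" apart from "not enough headroom for the next GC".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical failure reasons an allocation attempt could report back.
enum class AllocFail { None, NoFreeRegion, BelowRequiredThreshold };

// Result and reason travel together instead of the caller inferring the
// cause from a separate bool flag.
struct AllocResult {
  void*     addr;
  AllocFail reason;
};

// Toy allocation attempt over a fixed pool of "regions" (not G1 code).
AllocResult attempt_allocation(std::size_t free_regions, std::size_t required_regions) {
  static char region_pool[16][64]; // stand-in backing storage
  if (free_regions == 0) {
    return { nullptr, AllocFail::NoFreeRegion };
  }
  if (free_regions < required_regions) {
    // There is memory left, but not enough headroom to complete the next
    // GC; the caller can react with a pre-emptive collection.
    return { nullptr, AllocFail::BelowRequiredThreshold };
  }
  return { region_pool[(free_regions - 1) % 16], AllocFail::None };
}
```

A caller can then switch on `reason` to decide between expanding the heap, retrying, or triggering a collection with a cause that matches the failure.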
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 885: > 883: bool succeeded; > 884: result = do_collection_pause(word_size, gc_count_before, &succeeded, > 885: GCCause::_g1_humongous_allocation, force_gc); I believe the reason for this GC if `force_gc` is true is *not* that we do not have enough space for the humongous allocation, but we are doing a "pre-emptive" GC because of the allocation of "word_size" similar to before. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 860: > 858: > 859: size_t size_in_regions = humongous_obj_size_in_regions(word_size); > 860: if (policy()->can_mutator_consume_free_regions((uint)size_in_regions)) { Again, I would prefer that the result of this is stored in a local not called `force_gc` :) and then used instead of appending such a small `else` after the "long'ish" true case. ------------- Changes requested by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1650 From github.com+168222+mgkwill at openjdk.java.net Thu Dec 10 18:10:10 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 10 Dec 2020 18:10:10 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v14] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). 
> > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - Merge branch 'master' into update_hlp - Remove extraneous ' from warning Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Fix os::large_page_size() in last update Signed-off-by: Marcus G K Williams - Ivan W. Requested Changes Removed os::Linux::select_large_page_size and use os::page_size_for_region instead Removed Linux::find_large_page_size and use register_large_page_sizes. Streamlined Linux::setup_large_page_size Signed-off-by: Marcus G K Williams - Fix space format, use Linux:: for local func. Signed-off-by: Marcus G K Williams - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - Fix merge mistakes Signed-off-by: Marcus G K Williams - ... 
and 12 more: https://git.openjdk.java.net/jdk/compare/f5740561...81ff7c53 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=13 Stats: 63 lines in 2 files changed: 24 ins; 11 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From akozlov at openjdk.java.net Thu Dec 10 20:08:00 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 10 Dec 2020 20:08:00 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v6] In-Reply-To: <3tNI7G1GOXjH1xIJQoGswrg3DC63zq6FE3_wSnhAd4Y=.952df04c-71d9-48a0-aff2-7c2d64dbfeda@github.com> References: <3A8yXtEkRQymlaf0L15jBPViSYPFlMIKxp7aefZyv2E=.25a4e984-17a7-4a38-9ce4-c37c2f0dc428@github.com> <0Rl1rgfPK8tQJ9KPwMTTTqlN_GjyxjIwBSUXtHIvUyo=.11610ef7-95f8-4d8d-872a-f38960d320ff@github.com> <3tNI7G1GOXjH1xIJQoGswrg3DC63zq6FE3_wSnhAd4Y=.952df04c-71d9-48a0-aff2-7c2d64dbfeda@github.com> Message-ID: On Sat, 5 Dec 2020 14:52:32 GMT, Anton Kozlov wrote: >>> > I found this: https://stackoverflow.com/questions/7718964/how-can-i-force-macos-to-release-madv-freed-pages. One remark recommends MADV_FREE_REUSABLE to deal with the display problem; could that be a solution >>> >>> I'd found MADV_FREE_REUSABLE as well. One problem is that it's barely documented. The only description from the vendor I could find was >>> >>> ``` >>> #define MADV_FREE 5 /* pages unneeded, discard contents */ >>> #define MADV_ZERO_WIRED_PAGES 6 /* zero the wired pages that have not been unwired before the entry is deleted */ >>> #define MADV_FREE_REUSABLE 7 /* pages can be reused (by anyone) */ >>> #define MADV_FREE_REUSE 8 /* caller wants to reuse those pages */ >>> ``` >>> >>> The other problem, it cannot substitute mmap completely, see below. 
>>> >>> > My only remaining question is: is there really an observable difference between replacing the mapping with mmap and calling madvise(MADV_FREE)? And if there is, does it matter in practice? >>> >>> Yes, there is. For a sample program, after uncommit implemented in different ways, mmap is the only way to reduce the occupied memory size in Activity Monitor (the system GUI application a user will likely look at). >>> >> >> Okay, I see. Thanks for these tests, they are valuable. My one remaining doubt would be if the numbers were different in the face of memory pressure. >> >> But I don't like to block this PR anymore, I caused enough work and discussions. So I am fine with the general thrust of the change: >> - add exec to reserve and uncommit >> - with the contract being that the exec parameter handed in with commit and uncommit has to match the one used with reserve. >> Maybe we can have future improvements with these interfaces and reduce the complexity again (e.g. having an opaque handle structure holding mapping creation information). >> >> Is the current version review-worthy? >> >> Thanks a lot for your patience, >> >> ..Thomas > >> So I am fine with the general thrust of the change: >> * add exec to reserve and uncommit >> * with the contract being that the exec parameter handed in with commit and uncommit has to match the one used with reserve. > > The latest version implements this approach. It's ready for review. > > Thanks, > Anton Hi. Could someone else review the patch? AFAIK Thomas is unable to do this this week. And building broader consensus is not worthless. Unfortunately, we've reverted to an older version of the patch, so @iklam's review applies to almost completely different code. I'm sorry for that. 
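As a rough illustration of the two uncommit strategies compared in the discussion above — not the actual HotSpot code; the function names and the absence of error handling are simplifications — the data path replaces the range with a fresh `PROT_NONE` mapping, while the executable path has to fall back to `madvise` plus `mprotect` because `MAP_FIXED` cannot be combined with `MAP_JIT` on macOS:

```cpp
#include <sys/mman.h>
#include <cstddef>

// Data pages: overwrite the committed range with a fresh PROT_NONE
// anonymous mapping. This returns the pages to the OS right away, which
// is what makes the footprint shrink immediately in tools like
// Activity Monitor or ps.
bool uncommit_data(char* addr, size_t bytes) {
  void* res = mmap(addr, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  return res != MAP_FAILED;
}

// Executable (MAP_JIT) pages: MAP_FIXED is unavailable for such ranges
// on macOS, so instead tell the kernel the contents are disposable and
// drop all access. As noted above, reclaim may not happen immediately.
bool uncommit_exec(char* addr, size_t bytes) {
  return madvise(addr, bytes, MADV_FREE) == 0 &&
         mprotect(addr, bytes, PROT_NONE) == 0;
}
```

Both paths leave the range reserved (inaccessible but still owned by the process), matching the reserve/commit/uncommit contract discussed in the thread.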
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From burban at openjdk.java.net Thu Dec 10 21:46:58 2020 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 10 Dec 2020 21:46:58 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v7] In-Reply-To: <4mw_qwllDU7qLgqcm7Z_kxyGICpv18HZ_LrbidneSw4=.891574d8-45a6-4ecd-9dc9-be2070bdc3e6@github.com> References: <4mw_qwllDU7qLgqcm7Z_kxyGICpv18HZ_LrbidneSw4=.891574d8-45a6-4ecd-9dc9-be2070bdc3e6@github.com> Message-ID: On Fri, 4 Dec 2020 22:29:25 GMT, Anton Kozlov wrote: >> Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html >> >> On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. >> >> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. >> >> For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). >> >> Tested: >> * local tier1 >> * jdk-submit >> * codesign[2] with hardened runtime and allow-jit but without >> allow-unsigned-executable-memory entitlements[3] produce a working bundle. 
>> >> (adding GC group as suggested by @dholmes-ora) >> >> >> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 >> [2] >> >> codesign \ >> --sign - \ >> --options runtime \ >> --entitlements ents.plist \ >> --timestamp \ >> $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib >> [3] >> <plist version="1.0"> >> <dict> >> <key>com.apple.security.cs.allow-jit</key> >> <true/> >> <key>com.apple.security.cs.disable-library-validation</key> >> <true/> >> <key>com.apple.security.cs.allow-dyld-environment-variables</key> >> <true/> >> </dict> >> </plist> > > Anton Kozlov has updated the pull request incrementally with three additional commits since the last revision: > > - Fix style > - JDK-8234930 v4: Use MAP_JIT when allocating pages for code cache on macOS > - Revert "Separate executable_memory interface" > > This reverts commit 49253d8fe8963ce069f10783dcea5327079ba848. I read through the discussion and it makes sense to me. Thanks for the `mprotect`/`madvise` tests, they are pretty interesting. Patch looks good too (but I'm not a Reviewer). src/hotspot/os/bsd/os_bsd.cpp line 1937: > 1935: // Bsd mmap allows caller to pass an address as hint; give it a try first, > 1936: // if kernel honors the hint then we can return immediately. > 1937: char * addr = anon_mmap(requested_addr, bytes, false/*executable*/); use `!ExecMem`? src/hotspot/os/linux/os_linux.cpp line 3275: > 3273: struct bitmask* os::Linux::_numa_membind_bitmask; > 3274: > 3275: bool os::pd_uncommit_memory(char* addr, size_t size, bool exec) { nit: I'm irritated by `bool exec` in `pd_uncommit_memory`, but `bool executable` in `pd_reserve_memory`. Choose one :-) ------------- Marked as reviewed by burban (Author). 
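The reserve/commit contract under review can be condensed into a short sketch. This is illustrative only — the names are not the actual `os::` functions, and `MAP_JIT` is stubbed out on platforms that lack it — but it shows why the executable flag must already be known at reserve time:

```cpp
#include <sys/mman.h>
#include <cstddef>

#ifndef MAP_JIT
#define MAP_JIT 0  // only meaningful on macOS; harmless no-op elsewhere
#endif

// Reserve address space. On macOS a future-executable region must be
// created with MAP_JIT here, because the flag cannot be added later via
// a MAP_FIXED remap.
char* reserve_memory(size_t bytes, bool executable) {
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
  if (executable) {
    flags |= MAP_JIT;
  }
  void* addr = mmap(nullptr, bytes, PROT_NONE, flags, -1, 0);
  return addr == MAP_FAILED ? nullptr : static_cast<char*>(addr);
}

// Commit part of a reserved range: just unlock it with mprotect. Per the
// contract discussed above, the executable flag must match the one used
// at reserve time.
bool commit_memory(char* addr, size_t bytes, bool executable) {
  int prot = PROT_READ | PROT_WRITE | (executable ? PROT_EXEC : 0);
  return mprotect(addr, bytes, prot) == 0;
}
```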
PR: https://git.openjdk.java.net/jdk/pull/294 From kbarrett at openjdk.java.net Fri Dec 11 07:22:04 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Dec 2020 07:22:04 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase In-Reply-To: References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: On Wed, 9 Dec 2020 17:20:25 GMT, Ivan Walulya wrote: >> Please review this reimplementation of WeakProcessorPhase. It is changed to >> a scoped enum at namespace scope, and uses the recently added EnumIterator >> facility to provide iteration, rather than a bespoke iterator class. >> >> This is a step toward eliminating it entirely. I've split it out into a >> separate PR to make the review of the follow-up work a bit easier. >> >> As part of this the file weakProcessorPhases.hpp is renamed to >> weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a >> rename and (majorly) edit, instead treating it as a remove and add a new >> file. >> >> Testing: mach5 tier1 > > lgtm Thanks @walulyai , @albertnetymk , @tschatzl for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From kbarrett at openjdk.java.net Fri Dec 11 07:48:18 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Dec 2020 07:48:18 GMT Subject: RFR: 8257676: Simplify WeakProcessorPhase [v2] In-Reply-To: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: > Please review this reimplementation of WeakProcessorPhase. It is changed to > a scoped enum at namespace scope, and uses the recently added EnumIterator > facility to provide iteration, rather than a bespoke iterator class. > > This is a step toward eliminating it entirely. I've split it out into a > separate PR to make the review of the follow-up work a bit easier. 
> > As part of this the file weakProcessorPhases.hpp is renamed to > weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a > rename and (majorly) edit, instead treating it as a remove and add a new > file. > > Testing: mach5 tier1 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into simplify_weak_phase - simplify phases and use enum class ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1620/files - new: https://git.openjdk.java.net/jdk/pull/1620/files/23fdd553..de6a66fb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1620&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1620&range=00-01 Stats: 32946 lines in 810 files changed: 24540 ins; 5557 del; 2849 mod Patch: https://git.openjdk.java.net/jdk/pull/1620.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1620/head:pull/1620 PR: https://git.openjdk.java.net/jdk/pull/1620 From kbarrett at openjdk.java.net Fri Dec 11 07:48:20 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Dec 2020 07:48:20 GMT Subject: Integrated: 8257676: Simplify WeakProcessorPhase In-Reply-To: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> References: <9c5BlhHyYpaX79CwBIsowoJXopczIZc3oiqv_mL1LKA=.c06c01be-4947-406b-b0ef-7859c6587474@github.com> Message-ID: On Fri, 4 Dec 2020 09:34:36 GMT, Kim Barrett wrote: > Please review this reimplementation of WeakProcessorPhase. It is changed to > a scoped enum at namespace scope, and uses the recently added EnumIterator > facility to provide iteration, rather than a bespoke iterator class. > > This is a step toward eliminating it entirely. I've split it out into a > separate PR to make the review of the follow-up work a bit easier. 
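For readers unfamiliar with the pattern, a toy version of what "scoped enum plus enum iteration instead of a bespoke iterator class" looks like — a sketch, not HotSpot's actual `EnumIterator` facility, and the phase names are made up:

```cpp
// Scoped enum for the phases; the underlying values are contiguous so a
// simple half-open range [0, Count) can iterate them.
enum class WeakPhase : int { Jvmti, Jni, StringTable, Count };

// Minimal range/iterator pair enabling range-based for over the enum,
// replacing a hand-written iterator class.
struct WeakPhaseRange {
  struct Iterator {
    int v;
    WeakPhase operator*() const { return static_cast<WeakPhase>(v); }
    Iterator& operator++() { ++v; return *this; }
    bool operator!=(const Iterator& other) const { return v != other.v; }
  };
  Iterator begin() const { return Iterator{0}; }
  Iterator end() const { return Iterator{static_cast<int>(WeakPhase::Count)}; }
};
```

With this, `for (WeakPhase p : WeakPhaseRange()) { ... }` visits every phase, and the scoped enum keeps the enumerators out of the enclosing namespace.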
> > As part of this the file weakProcessorPhases.hpp is renamed to > weakProcessorPhase.hpp, but git doesn't seem to be recognizing that as a > rename and (majorly) edit, instead treating it as a remove and add a new > file. > > Testing: mach5 tier1 This pull request has now been integrated. Changeset: fa20186c Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/fa20186c Stats: 219 lines in 8 files changed: 38 ins; 171 del; 10 mod 8257676: Simplify WeakProcessorPhase Reviewed-by: iwalulya, ayang, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/1620 From tschatzl at openjdk.java.net Fri Dec 11 08:40:57 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 08:40:57 GMT Subject: RFR: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 [v6] In-Reply-To: References: Message-ID: <5PL24Oi6DdSh_Enax3RExssJsdgTbxIT19TdKB2iX5A=.041a7a50-6003-4c87-91e0-f57f771cbefe@github.com> On Tue, 8 Dec 2020 15:31:26 GMT, Dongbo He wrote: >> Hi, >> >> this is the continuation of the review of the implementation for: >> >> https://bugs.openjdk.java.net/browse/JDK-8257145 > > Dongbo He has updated the pull request incrementally with one additional commit since the last revision: > > fix failure in test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1474 From dongbohe at openjdk.java.net Fri Dec 11 09:08:56 2020 From: dongbohe at openjdk.java.net (Dongbo He) Date: Fri, 11 Dec 2020 09:08:56 GMT Subject: Integrated: 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 In-Reply-To: References: Message-ID: On Fri, 27 Nov 2020 03:37:42 GMT, Dongbo He wrote: > Hi, > > this is the continuation of the review of the implementation for: > > https://bugs.openjdk.java.net/browse/JDK-8257145 This pull request has now been integrated. 
Changeset: b28b0947 Author: Dongbo He Committer: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/b28b0947 Stats: 12 lines in 5 files changed: 6 ins; 0 del; 6 mod 8257145: Performance regression with -XX:-ResizePLAB after JDK-8079555 Co-authored-by: Junjun Lin Reviewed-by: tschatzl, sjohanss ------------- PR: https://git.openjdk.java.net/jdk/pull/1474 From tschatzl at openjdk.java.net Fri Dec 11 10:01:59 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 10:01:59 GMT Subject: Withdrawn: 8256641: CDS VM operations do not lock the heap In-Reply-To: References: Message-ID: <1XWPwVQosrcoUFibuMd0N2tm6zm61sDenqPZ-o0Iymc=.a3cad6ef-09a1-47dd-8a3e-3e2786c2c46a@github.com> On Mon, 7 Dec 2020 11:23:04 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > `VM_PopulateDumpSharedSpace`, `VM_PopulateDynamicDumpSharedSpace` and `VM_Verify` are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. > > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. > > The existing mechanism to prevent this kind of interference is taking the `Heap_lock`, so the suggested solution is based on having all these VM operations descend from a new `VM_GC_Sync_Operation` `VM_Operation` which does that (and only that), split out from `VM_GC_Operation`. 
> > There are some points I would like to bring up in advance in this change that may be contentious: > - each VM Operation could handle `Heap_lock` by itself, which I considered to be too error-prone. > - the need for `VM_Verify` to coordinate with garbage collections is new and has been introduced with [JDK-8253081](https://bugs.openjdk.java.net/browse/JDK-8253081) as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses `VM_Verify`, that operation now gets the `Heap_lock` too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. > - so this change adds a new VM Operation class called `VM_GC_Sync_Operation` that splits off the handling of `Heap_lock` (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) This pull request has been closed without being integrated. 
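The synchronization scheme argued for above can be condensed into a few lines. This is a hedged sketch — `std::mutex` stands in for HotSpot's global `Heap_lock`, and the prologue/epilogue hooks stand in for `VM_Operation`'s — showing how one small base class serializes GC, the CDS `VM_Populate*` operations, and `VM_Verify` against each other:

```cpp
#include <mutex>

std::mutex Heap_lock;  // stand-in for HotSpot's global Heap_lock

// Base class that only synchronizes: take the heap lock before the
// operation runs and release it afterwards, so no derived operation can
// observe a heap left temporarily unparseable by another.
class VM_GC_Sync_Operation {
 public:
  virtual ~VM_GC_Sync_Operation() = default;
  virtual bool doit_prologue() { Heap_lock.lock(); return true; }
  virtual void doit_epilogue() { Heap_lock.unlock(); }
  virtual void doit() = 0;
};

// Example derived operation: verification runs with Heap_lock held.
class VM_Verify : public VM_GC_Sync_Operation {
 public:
  void doit() override {
    verified = true;  // heap verification would run here, under Heap_lock
  }
  bool verified = false;
};
```

The point of the split is that this base class carries no GC-specific policy (such as suppressing back-to-back GCs), so non-GC operations can inherit the locking without inheriting that logic.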
------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From tschatzl at openjdk.java.net Fri Dec 11 10:01:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 10:01:58 GMT Subject: RFR: 8256641: CDS VM operations do not lock the heap [v3] In-Reply-To: References: Message-ID: On Thu, 10 Dec 2020 11:41:35 GMT, Kim Barrett wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> kbarrett review2, comment updates > > Marked as reviewed by kbarrett (Reviewer). This is a change originally meant for JDK16, but the fork has occurred before integration. So re-requesting a pull [there](https://github.com/openjdk/jdk16/pull/8) ------------- PR: https://git.openjdk.java.net/jdk/pull/1661 From kbarrett at openjdk.java.net Fri Dec 11 10:06:00 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 11 Dec 2020 10:06:00 GMT Subject: [jdk16] RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> References: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> Message-ID: On Fri, 11 Dec 2020 09:57:14 GMT, Thomas Schatzl wrote: > (Originally started in openjdk/jdk [PR #1161](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) > > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
> > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. > > The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. > > There are some points I would like to bring up in advance in this change that may be contentious: > > each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. > the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. 
> so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from VM_GC_Operation. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the VM_Populate* or even VM_Verify operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) Still looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/8 From tschatzl at openjdk.java.net Fri Dec 11 10:06:00 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 10:06:00 GMT Subject: [jdk16] RFR: 8256641: CDS VM operations do not lock the heap Message-ID: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> (Originally started in openjdk/jdk [PR #1661](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) Hi all, can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. There are some points I would like to bring up in advance in this change that may be contentious: each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from VM_GC_Operation. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the VM_Populate* or even VM_Verify operations. 
Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) ------------- Commit messages: - kbarrett review2, comment fixup - kbarrett review - Initial commit Changes: https://git.openjdk.java.net/jdk16/pull/8/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=8&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8256641 Stats: 100 lines in 10 files changed: 65 ins; 15 del; 20 mod Patch: https://git.openjdk.java.net/jdk16/pull/8.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/8/head:pull/8 PR: https://git.openjdk.java.net/jdk16/pull/8 From tschatzl at openjdk.java.net Fri Dec 11 10:06:00 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 10:06:00 GMT Subject: [jdk16] RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> References: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> Message-ID: On Fri, 11 Dec 2020 09:57:14 GMT, Thomas Schatzl wrote: > (Originally started in openjdk/jdk [PR #1161](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) > > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. 
> > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. > > The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. > > There are some points I would like to bring up in advance in this change that may be contentious: > > each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. > the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. 
> so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from VM_GC_Operation. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the VM_Populate* or even VM_Verify operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) No changes from original, applies cleanly. ------------- PR: https://git.openjdk.java.net/jdk16/pull/8 From akozlov at openjdk.java.net Fri Dec 11 13:25:14 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Dec 2020 13:25:14 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v8] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory has to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. > > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow the OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). 
> > Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produces a working bundle. > > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > <plist version="1.0"> > <dict> > <key>com.apple.security.cs.allow-jit</key> > <true/> > <key>com.apple.security.cs.disable-library-validation</key> > <true/> > <key>com.apple.security.cs.allow-dyld-environment-variables</key> > <true/> > </dict> > </plist> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update style ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/b3eb5b01..31fe1fb0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=06-07 Stats: 19 lines in 8 files changed: 0 ins; 0 del; 19 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Fri Dec 11 13:32:59 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Dec 2020 13:32:59 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v7] In-Reply-To: References: <4mw_qwllDU7qLgqcm7Z_kxyGICpv18HZ_LrbidneSw4=.891574d8-45a6-4ecd-9dc9-be2070bdc3e6@github.com> Message-ID: On Thu, 10 Dec 2020 21:16:36 GMT, Bernhard Urban-Forster wrote: >> Anton Kozlov has updated the pull request incrementally with three additional commits since the last revision: >> >> - Fix style >> - JDK-8234930 v4: Use MAP_JIT when allocating pages for code cache on macOS >> - Revert "Separate executable_memory interface" >> >> This reverts commit 
49253d8fe8963ce069f10783dcea5327079ba848. > > src/hotspot/os/bsd/os_bsd.cpp line 1937: > >> 1935: // Bsd mmap allows caller to pass an address as hint; give it a try first, >> 1936: // if kernel honors the hint then we can return immediately. >> 1937: char * addr = anon_mmap(requested_addr, bytes, false/*executable*/); > > use `!ExecMem`? Agree, fixed. I've avoided that, as formally it's for `os::` interface layer. But unlikely it's worth even a bit of readability ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Fri Dec 11 13:42:58 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 11 Dec 2020 13:42:58 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v7] In-Reply-To: References: <4mw_qwllDU7qLgqcm7Z_kxyGICpv18HZ_LrbidneSw4=.891574d8-45a6-4ecd-9dc9-be2070bdc3e6@github.com> Message-ID: <9Y08fwmleXTIxeAUCxqYGB5zRdTciI9Dma3wlhEe4UU=.57decef8-8e68-4d1d-b5d3-92685b1def49@github.com> On Thu, 10 Dec 2020 21:17:59 GMT, Bernhard Urban-Forster wrote: >> Anton Kozlov has updated the pull request incrementally with three additional commits since the last revision: >> >> - Fix style >> - JDK-8234930 v4: Use MAP_JIT when allocating pages for code cache on macOS >> - Revert "Separate executable_memory interface" >> >> This reverts commit 49253d8fe8963ce069f10783dcea5327079ba848. > > src/hotspot/os/linux/os_linux.cpp line 3275: > >> 3273: struct bitmask* os::Linux::_numa_membind_bitmask; >> 3274: >> 3275: bool os::pd_uncommit_memory(char* addr, size_t size, bool exec) { > > nit: I'm irritated by `bool exec` in `pd_uncommit_memory`, but `bool executable` in `pd_reserve_memory`. Choose one :-) Thanks, fixed. Too much of code shuffling. Now these should be consistent with the surroundings. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From mgronlun at openjdk.java.net Fri Dec 11 13:56:09 2020 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Fri, 11 Dec 2020 13:56:09 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 Message-ID: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Greetings, AIX/xlc does not support the THREAD_LOCAL macro. This change moves more involved event posting from gc/shared into JFR. Thanks Markus ------------- Commit messages: - whitespace - move more involved event posting to jfr Changes: https://git.openjdk.java.net/jdk16/pull/11/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=11&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258094 Stats: 239 lines in 5 files changed: 156 ins; 79 del; 4 mod Patch: https://git.openjdk.java.net/jdk16/pull/11.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/11/head:pull/11 PR: https://git.openjdk.java.net/jdk16/pull/11 From zgu at openjdk.java.net Fri Dec 11 15:01:12 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 11 Dec 2020 15:01:12 GMT Subject: RFR: 8255019: Shenandoah: Split STW and concurrent mark into separate classes [v20] In-Reply-To: References: Message-ID: > This is the first part of refactoring, that aims to isolate three Shenandoah GC modes (concurrent, degenerated and full gc). > > Shenandoah started with two GC modes, concurrent and full gc, with minimal shared code, mainly in mark phase. After introducing degenerated GC, it shared quite large portion of code with concurrent GC, with the concept that degenerated GC can simply pick up remaining work of concurrent GC in STW mode. > > It was not a big problem at that time, since concurrent GC also processed roots STW. Since Shenandoah gradually moved root processing into concurrent phase, code started to diverge, that made code hard to reason and maintain. 
> > First step, I would like to split STW and concurrent mark, so that: > 1) Code does not have to special case for STW and concurrent mark. > 2) STW mark does not need to rendezvous workers between root mark and the rest of mark > 3) STW mark does not need to activate SATB barrier and drain SATB buffers. > 4) STW mark does not need to remark some of roots. > > The patch mainly just shuffles code. Creates a base class ShenandoahMark, and moved shared code (from current shenandoahConcurrentMark) into this base class. I did 'git mv shenandoahConcurrentMark.inline.hpp shenandoahMark.inline.hpp', but git does not seem to reflect that. > > A few changes: > 1) Moved task queue set from ShenandoahConcurrentMark to ShenandoahHeap. ShenandoahMark and its subclasses are stateless. Instead, mark states are maintained in task queue, mark bitmap and SATB buffers, so that they can be created on demand. > 2) Split ShenandoahConcurrentRootScanner template to ShenandoahConcurrentRootScanner and ShenandoahSTWRootScanner > 3) Split code inside op_final_mark code into finish_mark and prepare_evacuation helper functions. 
> 4) Made ShenandoahMarkCompact stack allocated (as well as ShenandoahConcurrentGC and ShenandoahDegeneratedGC in upcoming refactoring) Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Concurrent mark does not expect forwarded objects ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1009/files - new: https://git.openjdk.java.net/jdk/pull/1009/files/05faa443..0cb404db Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=19 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=18-19 Stats: 15 lines in 1 file changed: 0 ins; 8 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/1009.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1009/head:pull/1009 PR: https://git.openjdk.java.net/jdk/pull/1009 From zgu at openjdk.java.net Fri Dec 11 15:35:08 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 11 Dec 2020 15:35:08 GMT Subject: RFR: 8255019: Shenandoah: Split STW and concurrent mark into separate classes [v21] In-Reply-To: References: Message-ID: > This is the first part of refactoring, that aims to isolate three Shenandoah GC modes (concurrent, degenerated and full gc). > > Shenandoah started with two GC modes, concurrent and full gc, with minimal shared code, mainly in mark phase. After introducing degenerated GC, it shared quite large portion of code with concurrent GC, with the concept that degenerated GC can simply pick up remaining work of concurrent GC in STW mode. > > It was not a big problem at that time, since concurrent GC also processed roots STW. Since Shenandoah gradually moved root processing into concurrent phase, code started to diverge, that made code hard to reason and maintain. > > First step, I would like to split STW and concurrent mark, so that: > 1) Code does not have to special case for STW and concurrent mark. 
> 2) STW mark does not need to rendezvous workers between root mark and the rest of mark > 3) STW mark does not need to activate SATB barrier and drain SATB buffers. > 4) STW mark does not need to remark some of roots. > > The patch mainly just shuffles code. Creates a base class ShenandoahMark, and moved shared code (from current shenandoahConcurrentMark) into this base class. I did 'git mv shenandoahConcurrentMark.inline.hpp shenandoahMark.inline.hpp, but git does not seem to reflect that. > > A few changes: > 1) Moved task queue set from ShenandoahConcurrentMark to ShenandoahHeap. ShenandoahMark and its subclasses are stateless. Instead, mark states are maintained in task queue, mark bitmap and SATB buffers, so that they can be created on demand. > 2) Split ShenandoahConcurrentRootScanner template to ShenandoahConcurrentRootScanner and ShenandoahSTWRootScanner > 3) Split code inside op_final_mark code into finish_mark and prepare_evacuation helper functions. > 4) Made ShenandoahMarkCompact stack allocated (as well as ShenandoahConcurrentGC and ShenandoahDegeneratedGC in upcoming refactoring) Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 28 commits: - Merge branch 'master' into JDK-8255019-sh-mark - Concurrent mark does not expect forwarded objects - Merge branch 'master' into JDK-8255019-sh-mark - Merge branch 'master' into JDK-8255019-sh-mark - Silent valgrind on potential memory leak - Merge branch 'master' into JDK-8255019-sh-mark - Removed ShenandoahConcurrentMark parameter from concurrent GC entry/op, etc. - Merge branch 'master' into JDK-8255019-sh-mark - Merge - Moved task queues to marking context - ... 
and 18 more: https://git.openjdk.java.net/jdk/compare/82735140...85a4469e ------------- Changes: https://git.openjdk.java.net/jdk/pull/1009/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1009&range=20 Stats: 1972 lines in 22 files changed: 1072 ins; 755 del; 145 mod Patch: https://git.openjdk.java.net/jdk/pull/1009.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1009/head:pull/1009 PR: https://git.openjdk.java.net/jdk/pull/1009 From iklam at openjdk.java.net Fri Dec 11 16:16:59 2020 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 11 Dec 2020 16:16:59 GMT Subject: [jdk16] RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> References: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> Message-ID: On Fri, 11 Dec 2020 09:57:14 GMT, Thomas Schatzl wrote: > (Originally started in openjdk/jdk [PR #1161](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) > > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. > > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. 
> > The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. > > There are some points I would like to bring up in advance in this change that may be contentious: > > each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. > the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. > so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk16/pull/8 From tschatzl at openjdk.java.net Fri Dec 11 18:17:56 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 18:17:56 GMT Subject: [jdk16] RFR: 8256641: CDS VM operations do not lock the heap In-Reply-To: References: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> Message-ID: On Fri, 11 Dec 2020 10:02:27 GMT, Kim Barrett wrote: >> (Originally started in openjdk/jdk [PR #1161](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) >> >> Hi all, >> >> can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? >> >> VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. >> >> (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) >> >> They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. >> >> The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. >> >> There some points I would like to bring up in advance in this change that may be contentious: >> >> each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. 
>> the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. >> One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. >> so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. >> >> Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) > > Still looks good. Thanks @kimbarrett @iklam for your reviews. 
------------- PR: https://git.openjdk.java.net/jdk16/pull/8 From tschatzl at openjdk.java.net Fri Dec 11 18:17:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 11 Dec 2020 18:17:58 GMT Subject: [jdk16] Integrated: 8256641: CDS VM operations do not lock the heap In-Reply-To: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> References: <-0YZek3j3ujXtj6cBFiodRLUqpdVySISpUGHYfN0Wu8=.fff6c370-b895-4e0c-a1a1-83070b0c9577@github.com> Message-ID: On Fri, 11 Dec 2020 09:57:14 GMT, Thomas Schatzl wrote: > (Originally started in openjdk/jdk [PR #1161](https://github.com/openjdk/jdk/pull/1661), but the fork happened before pushing) > > Hi all, > > can I get reviews for this change that adds missing synchronization of CDS related VM operations with other heap operations? > > VM_PopulateDumpSharedSpace, VM_PopulateDynamicDumpSharedSpace and VM_Verify are used during CDS operation, one for creating the CDS archive (eventually doing a GC), one for mapping in the CDS archive into the heap, and the last one for verification. > > (Fwiw, imho the first two are awfully close and should be renamed to be better distinguishable, but that's another matter) > > They all in one way or the other need to synchronize with garbage collection as they may either do a GC or just do verification, as actual (STW-)gc returns an uninitialized block of memory that is not parseable; and before that block of memory can be initialized, another VM operation like one of the mentioned could be started otherwise seeing that uninitialized memory and crashing. > > The existing mechanism to prevent this kind of interference is taking the Heap_lock, so the suggested solution is based on having all these VM operations descend from a new VM_GC_Sync_Operation VM_Operation which does that (and only that), split out from VM_GC_Operation. 
> > There are some points I would like to bring up in advance in this change that may be contentious: > > each VM Operation could handle Heap_lock by itself, which I considered to be too error-prone. > the need for VM_Verify to coordinate with garbage collections is new and has been introduced with JDK-8253081 as since then a Java thread might execute it - that's why this hasn't been a problem before. That could be undone (removed), but I kind of believe that with more expected changes to the CDS mechanism in the future the additional full-heap verification after loading the archive is worth the additional effort. > One (implementation) drawback is that since ZGC also uses VM_Verify, that operation now gets the Heap_lock too, and is kind of also using some part of the "set of operations related to GC" in general but did not so before, keeping almost completely separate. Testing did not show an issue, and I tried to look at the code carefully to see whether there could be issues with no result. (I.e. I couldn't find an issue). Obviously I'd like to ask you to look over this again. > so this change adds a new VM Operation class called VM_GC_Sync_Operation that splits off the handling of Heap_lock (i.e. the actual synchronization) from `VM_GC_Operation`. The reason is that I do not think the logic for the gc VM operation that prevents multiple back-to-back GC operations is a good fit for any of the `VM_Populate*` or even `VM_Verify` operations. > > Testing: tier1-5; test case attached to the CR; other known reproducers (runtime/valhalla/inlinetypes/InlineOops.java in the Valhalla repo) This pull request has now been integrated. 
Changeset: bacf22b9 Author: Thomas Schatzl URL: https://git.openjdk.java.net/jdk16/commit/bacf22b9 Stats: 100 lines in 10 files changed: 65 ins; 15 del; 20 mod 8256641: CDS VM operations do not lock the heap Reviewed-by: kbarrett, iklam ------------- PR: https://git.openjdk.java.net/jdk16/pull/8 From stuefe at openjdk.java.net Sat Dec 12 09:31:59 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 12 Dec 2020 09:31:59 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v8] In-Reply-To: References: Message-ID: On Fri, 11 Dec 2020 13:25:14 GMT, Anton Kozlov wrote: >> Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html >> >> On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. >> >> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. >> >> For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). >> >> Tested: >> * local tier1 >> * jdk-submit >> * codesign[2] with hardened runtime and allow-jit but without >> allow-unsigned-executable-memory entitlements[3] produce a working bundle. 
>> >> (adding GC group as suggested by @dholmes-ora) >> >> >> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 >> [2] >> >> codesign \ >> --sign - \ >> --options runtime \ >> --entitlements ents.plist \ >> --timestamp \ >> $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib >> [3] >> >> >> >> >> com.apple.security.cs.allow-jit >> >> com.apple.security.cs.disable-library-validation >> >> com.apple.security.cs.allow-dyld-environment-variables >> >> >> > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update style Hi Anton, 1) can you please make the executable parameter on os::uncommit() default false? Lets minimize the impact for the vast majority of callers which do not need protected memory. The only caller needing this is ReservedSpace. Would also be more in sync with the default false for executable on os::reserve(). 2) Personal nit, I really find this `ExecMem` jarring. We don't do this (pass named aliases for boolean flags) for any other arguments AFAICS. The usual way to emphasize arg names is with comments: bool result = os::commit_memory(base, size, /*exec*/ false); If you still prefer to use it, could you leave at least those places unchanged which are unaffected by your patch? 3) There is some code explicitly dealing with the supposed inability of attempt_reserve_memory_at() using MAP_JIT. I don't understand. I thought the problem was thjat MAP_JIT and MAP_FIXED don't mix. But attempt_reserve_memory_at() does explicitely not use MAP_FIXED, it just attempts to map at a non-null wish address. Does that also not work with MAP_JIT? 4) For my taste there are too many unrelated changes, especially in ReservedSpace. Making all those file scope static helpers members of ReservedSpace causes a lot of diffs. Makes sense cleanup-wise but will make backporting your patch more difficult later (I expect this will be a strong candidate for backporting). 
Please tone down the patch a bit. I pointed some parts out directly below. Beyond those, I leave it up to you how far you minimize the patch. Thanks, Thomas src/hotspot/os/bsd/os_bsd.cpp line 1690: > 1688: if (::mprotect(addr, size, prot) == 0) { > 1689: return true; > 1690: } You need to handle mprotect failure here too. Probably just by returning false. There is no point in doing the mmap below as fallback. The same applies for the OpenBSD path too. mprotect may, at least on Linux, fail if the new mapping introduced by changing the protection would bring the process above the system limit for number of mappings. I strongly believe there must be a similar error scenario on Mac. At least on BSD there is (https://man.openbsd.org/mprotect.2), see ENOMEM. src/hotspot/share/runtime/os.hpp line 326: > 324: // Does not overwrite existing mappings. > 325: // It's intentionally cannot reserve executable mapping, as some platforms does not allow that > 326: // (e.g. macOS with proper MAP_JIT use). This is a note to a future implementor, not to the user. I would move this out of the header to the posix implementation. Also, see my question above, why would this not work? src/hotspot/share/memory/virtualspace.cpp line 88: > 86: } > 87: assert(!_special, "should not call this"); > 88: assert(!_executable, "unsupported"); Why is this unsupported? Could I not use MAP_JIT without MAP_FIXED but with a non-null attach address? src/hotspot/share/memory/virtualspace.cpp line 311: > 309: _base -= _noaccess_prefix; > 310: _size += _noaccess_prefix; > 311: Since you revert the steps taken at establish_noaccess_prefix, I'd move the _noaccess_prefix=0 up to here. For aesthetic reasons mainly :) alternatively, I'd use temp variables like the code did before. 
src/hotspot/share/memory/virtualspace.hpp line 61: > 59: char* reserve_memory(size_t size); > 60: char* reserve_memory_aligned(size_t size, size_t alignment); > 61: void release_memory(char* base, size_t size); I liked the old names (..._map_or_reserve_....) better. Can you please rename them back? Their whole point is multiplexing between anonymous and mmaped reservation calls. Also, in their current form they read identical to the os::... functions which is really confusing. For release I propose `release_mapped_or_reserved_memory'. Its a mouthful but it clearly states what it does. src/hotspot/share/memory/virtualspace.cpp line 399: > 397: p2i(base), alignment); > 398: } else { > 399: _special = false; special->_special: This change has no connection to your patch. Can you leave this out please? I have nothing against cleanups but please in a separate RFE. Makes the patch clearer and easier to backport later. src/hotspot/share/memory/virtualspace.cpp line 194: > 192: p2i(base), alignment); > 193: } else { > 194: _special = false; Changes (special->_special) are Cleanup, please do them in a separate RFE if needed. ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Sat Dec 12 09:32:00 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 12 Dec 2020 09:32:00 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v8] In-Reply-To: References: Message-ID: On Sat, 12 Dec 2020 07:43:12 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update style > > src/hotspot/os/bsd/os_bsd.cpp line 1690: > >> 1688: if (::mprotect(addr, size, prot) == 0) { >> 1689: return true; >> 1690: } > > You need to handle mprotect failure here too. Probably just by returning false. There is no point in doing the mmap below as fallback. The same applies for the OpenBSD path too. 
> > mprotect may, at least on Linux, fail if the new mapping introduced by changing the protection would bring the process above the system limit for number of mappings. I strongly believe there must be a similar error scenario on Mac. At least on BSD there is (https://man.openbsd.org/mprotect.2), see ENOMEM. Also, this is asymmetric to uncommit now for the !exec case. There, we mmap(MAP_NORESERVE, PROT_NONE). We have established that MAP_NORESERVE is a noop, so this would be probably fine. Still, I'd do the mmap(PROT_RW) for commit instead for !exec: if (exec) // Do not replace MAP_JIT mappings, see JDK-8234930 return mprotect() == 0; } else { mmap ... } If not, I would remove MAP_NORESERVE from this code. ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From kbarrett at openjdk.java.net Sun Dec 13 01:51:04 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 13 Dec 2020 01:51:04 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue Message-ID: Please review this change to G1RedirtyCardsQueue to separate the local qset from the queue. This simplifies the implementation, though requires clients to deal with the local qset explicitly. This is an enabling step toward desired simplifications of the PtrQueue hierarchy. This change also simplifies the interaction between the local qset and the global qset. 
Testing: mach5 tier1-5 ------------- Commit messages: - separate local redirty qset from redirty queue Changes: https://git.openjdk.java.net/jdk/pull/1755/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1755&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258142 Stats: 79 lines in 5 files changed: 16 ins; 32 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/1755.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1755/head:pull/1755 PR: https://git.openjdk.java.net/jdk/pull/1755 From rrich at openjdk.java.net Mon Dec 14 09:35:58 2020 From: rrich at openjdk.java.net (Richard Reingruber) Date: Mon, 14 Dec 2020 09:35:58 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 In-Reply-To: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: On Fri, 11 Dec 2020 13:41:02 GMT, Markus Gr?nlund wrote: > Greetings, > > AIX/xlc does not support the THREAD_LOCAL macro. > > This change moves more involved event posting from gc/shared into JFR. > > Thanks > Markus Hi Markus, this fix looks good to me. Aix build succeeds with it. Thanks for fixing, Richard. ------------- Marked as reviewed by rrich (Committer). PR: https://git.openjdk.java.net/jdk16/pull/11 From stuefe at openjdk.java.net Mon Dec 14 09:55:03 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 14 Dec 2020 09:55:03 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 In-Reply-To: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: On Fri, 11 Dec 2020 13:41:02 GMT, Markus Gr?nlund wrote: > Greetings, > > AIX/xlc does not support the THREAD_LOCAL macro. > > This change moves more involved event posting from gc/shared into JFR. 
> > Thanks > Markus Looks fine. Thanks for fixing. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/11 From mgronlun at openjdk.java.net Mon Dec 14 11:16:15 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 14 Dec 2020 11:16:15 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 [v2] In-Reply-To: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: > Greetings, > > AIX/xlc does not support the THREAD_LOCAL macro. > > This change moves more involved event posting from gc/shared into JFR. > > Thanks > Markus Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: StackObj -> AllStatic ------------- Changes: - all: https://git.openjdk.java.net/jdk16/pull/11/files - new: https://git.openjdk.java.net/jdk16/pull/11/files/aed88806..10edfe4a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=11&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=11&range=00-01 Stats: 8 lines in 3 files changed: 1 ins; 2 del; 5 mod Patch: https://git.openjdk.java.net/jdk16/pull/11.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/11/head:pull/11 PR: https://git.openjdk.java.net/jdk16/pull/11 From egahlin at openjdk.java.net Mon Dec 14 11:20:02 2020 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Mon, 14 Dec 2020 11:20:02 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 [v2] In-Reply-To: References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: <-OKqrZ463NZXpBPE2L1P-TPFvkykuwQOcdrv94pCiCo=.231b1449-4521-41fd-a623-adc58bea1418@github.com> On Mon, 14 Dec 2020 11:16:15 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> AIX/xlc does not support the THREAD_LOCAL macro. 
>> >> This change moves more involved event posting from gc/shared into JFR. >> >> Thanks >> Markus > > Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: > > StackObj -> AllStatic Marked as reviewed by egahlin (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk16/pull/11 From mgronlun at openjdk.java.net Mon Dec 14 11:25:57 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 14 Dec 2020 11:25:57 GMT Subject: [jdk16] RFR: 8258094: AIX build fails after 8257602 [v2] In-Reply-To: References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: On Mon, 14 Dec 2020 09:33:12 GMT, Richard Reingruber wrote: >> Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: >> >> StackObj -> AllStatic > > Hi Markus, > > this fix looks good to me. Aix build succeeds with it. > > Thanks for fixing, > Richard. Thanks @reinrich , @tstuefe and @egahlin for your reviews! ------------- PR: https://git.openjdk.java.net/jdk16/pull/11 From mgronlun at openjdk.java.net Mon Dec 14 11:39:01 2020 From: mgronlun at openjdk.java.net (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 14 Dec 2020 11:39:01 GMT Subject: [jdk16] Integrated: 8258094: AIX build fails after 8257602 In-Reply-To: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> References: <-ajLwELRtZYMYAMYYIaogFmsNbH7KEW8hbrLI9MXiaY=.cf36144a-fe2f-4eea-b4ee-c5e83a7977e3@github.com> Message-ID: <7yp2ZEa_mG1SSfSLQY4wW96aX_6VZPHwdit4E3lcdfg=.9abdd7e3-aa8d-4650-9757-e29bf0da0d36@github.com> On Fri, 11 Dec 2020 13:41:02 GMT, Markus Gr?nlund wrote: > Greetings, > > AIX/xlc does not support the THREAD_LOCAL macro. > > This change moves more involved event posting from gc/shared into JFR. > > Thanks > Markus This pull request has now been integrated. 
Changeset: afc44414 Author: Markus Gr?nlund URL: https://git.openjdk.java.net/jdk16/commit/afc44414 Stats: 238 lines in 5 files changed: 155 ins; 79 del; 4 mod 8258094: AIX build fails after 8257602 Reviewed-by: rrich, stuefe, egahlin ------------- PR: https://git.openjdk.java.net/jdk16/pull/11 From tschatzl at openjdk.java.net Mon Dec 14 12:11:55 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 14 Dec 2020 12:11:55 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue In-Reply-To: References: Message-ID: On Sun, 13 Dec 2020 01:45:20 GMT, Kim Barrett wrote: > Please review this change to G1RedirtyCardsQueue to separate the local qset > from the queue. This simplifies the implementation, though requires clients > to deal with the local qset explicitly. This is an enabling step toward > desired simplifications of the PtrQueue hierarchy. This change also > simplifies the interaction between the local qset and the global qset. > > Testing: > mach5 tier1-5 Lgtm. Thanks. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1755 From iwalulya at openjdk.java.net Mon Dec 14 12:48:55 2020 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Mon, 14 Dec 2020 12:48:55 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue In-Reply-To: References: Message-ID: On Sun, 13 Dec 2020 01:45:20 GMT, Kim Barrett wrote: > Please review this change to G1RedirtyCardsQueue to separate the local qset > from the queue. This simplifies the implementation, though requires clients > to deal with the local qset explicitly. This is an enabling step toward > desired simplifications of the PtrQueue hierarchy. This change also > simplifies the interaction between the local qset and the global qset. > > Testing: > mach5 tier1-5 looks good minor: maybe maintain the postfix `_qset ` in the naming of `_rdclqs` as done for `_shared_qset` and `_local_qset` ------------- Marked as reviewed by iwalulya (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/1755 From zgu at openjdk.java.net Mon Dec 14 15:25:01 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 14 Dec 2020 15:25:01 GMT Subject: RFR: 8258239: Shenandoah: Used wrong closure to mark concurrent roots Message-ID: During the concurrent mark phase, there should be no forwarded objects. Therefore, it should use ShenandoahMarkRefsClosure to mark concurrent roots, instead of ShenandoahMarkResolveRefsClosure. Note: this is *not* a correctness bug, but a performance one, as the ShenandoahMarkResolveRefsClosure closure unnecessarily resolves forwarding pointers, which always resolve to themselves. - [x] hotspot_gc_shenandoah ------------- Commit messages: - JDK-8258239 Changes: https://git.openjdk.java.net/jdk/pull/1768/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1768&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258239 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1768.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1768/head:pull/1768 PR: https://git.openjdk.java.net/jdk/pull/1768 From rkennke at openjdk.java.net Mon Dec 14 15:31:57 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 14 Dec 2020 15:31:57 GMT Subject: RFR: 8258239: Shenandoah: Used wrong closure to mark concurrent roots In-Reply-To: References: Message-ID: On Mon, 14 Dec 2020 15:20:42 GMT, Zhengyu Gu wrote: > During the concurrent mark phase, there should be no forwarded objects. Therefore, it should use ShenandoahMarkRefsClosure to mark concurrent roots, instead of ShenandoahMarkResolveRefsClosure. > > Note: this is *not* a correctness bug, but a performance one, as the ShenandoahMarkResolveRefsClosure closure unnecessarily resolves forwarding pointers, which always resolve to themselves. > > - [x] hotspot_gc_shenandoah Looks good to me! ------------- Marked as reviewed by rkennke (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1768 From kbarrett at openjdk.java.net Mon Dec 14 16:02:55 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 14 Dec 2020 16:02:55 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue In-Reply-To: References: Message-ID: On Mon, 14 Dec 2020 12:45:52 GMT, Ivan Walulya wrote: > minor: > maybe maintain the postfix `_qset ` in the naming of `_rdclqs` as done for `_shared_qset` and `_local_qset` Yeah, "rdclqs" is a little opaque. Changing to rdc_local_qset. ------------- PR: https://git.openjdk.java.net/jdk/pull/1755 From kbarrett at openjdk.java.net Mon Dec 14 16:02:56 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 14 Dec 2020 16:02:56 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue In-Reply-To: References: Message-ID: On Mon, 14 Dec 2020 12:09:36 GMT, Thomas Schatzl wrote: >> Please review this change to G1RedirtyCardsQueue to separate the local qset >> from the queue. This simplifies the implementation, though requires clients >> to deal with the local qset explicitly. This is an enabling step toward >> desired simplifications of the PtrQueue hierarchy. This change also >> simplifies the interaction between the local qset and the global qset. >> >> Testing: >> mach5 tier1-5 > > Lgtm. Thanks. Thanks @tschatzl and @walulyai for reviewing. ------------- PR: https://git.openjdk.java.net/jdk/pull/1755 From kbarrett at openjdk.java.net Mon Dec 14 16:13:11 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 14 Dec 2020 16:13:11 GMT Subject: RFR: 8258142: Simplify G1RedirtyCardsQueue [v2] In-Reply-To: References: Message-ID: > Please review this change to G1RedirtyCardsQueue to separate the local qset > from the queue. This simplifies the implementation, though requires clients > to deal with the local qset explicitly. This is an enabling step toward > desired simplifications of the PtrQueue hierarchy. 
This change also > simplifies the interaction between the local qset and the global qset. > > Testing: > mach5 tier1-5 Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: iwalulya review - expand some abbrevs ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1755/files - new: https://git.openjdk.java.net/jdk/pull/1755/files/6a48914f..78c7c36b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1755&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1755&range=00-01 Stats: 8 lines in 3 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/1755.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1755/head:pull/1755 PR: https://git.openjdk.java.net/jdk/pull/1755 From kbarrett at openjdk.java.net Mon Dec 14 16:16:55 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 14 Dec 2020 16:16:55 GMT Subject: Integrated: 8258142: Simplify G1RedirtyCardsQueue In-Reply-To: References: Message-ID: On Sun, 13 Dec 2020 01:45:20 GMT, Kim Barrett wrote: > Please review this change to G1RedirtyCardsQueue to separate the local qset > from the queue. This simplifies the implementation, though requires clients > to deal with the local qset explicitly. This is an enabling step toward > desired simplifications of the PtrQueue hierarchy. This change also > simplifies the interaction between the local qset and the global qset. > > Testing: > mach5 tier1-5 This pull request has now been integrated. Changeset: 1ff0f167 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/1ff0f167 Stats: 79 lines in 5 files changed: 16 ins; 32 del; 31 mod 8258142: Simplify G1RedirtyCardsQueue Separate local redirty qset from redirty queue. 
Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.java.net/jdk/pull/1755 From zgu at openjdk.java.net Mon Dec 14 17:58:56 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 14 Dec 2020 17:58:56 GMT Subject: Integrated: 8258239: Shenandoah: Used wrong closure to mark concurrent roots In-Reply-To: References: Message-ID: <50JgcSzpQ8oOuEnH6ZeRaE1zlFY0a6qtcqZtCOgebcA=.4d84ddb7-7ced-4a17-ac33-b7fe053555d9@github.com> On Mon, 14 Dec 2020 15:20:42 GMT, Zhengyu Gu wrote: > During the concurrent mark phase, there should be no forwarded objects in roots. Therefore, it should use ShenandoahMarkRefsClosure to mark concurrent roots, instead of ShenandoahMarkResolveRefsClosure. > > Note: this is *not* a correctness bug, but a performance one, as the ShenandoahMarkResolveRefsClosure closure unnecessarily resolves forwarding pointers, which always resolve to themselves. > > - [x] hotspot_gc_shenandoah This pull request has now been integrated. Changeset: 2c3ae19a Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/2c3ae19a Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8258239: Shenandoah: Used wrong closure to mark concurrent roots Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/jdk/pull/1768 From zgu at openjdk.java.net Mon Dec 14 19:58:05 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 14 Dec 2020 19:58:05 GMT Subject: RFR: 8258244: Shenandoah: Not expecting forwarded object in roots during mark after JDK-8240868 Message-ID: This is a cleanup, no forwarded objects are expected in roots during mark phase after JDK-8240868. There may be forwarded objects during full gc marking, if it is upgraded from degenerated GC, but roots are fixed before full gc marking happens.
- [x] hotspot_gc_shenandoah ------------- Commit messages: - Merge branch 'master' into JDK-8258244-no-forwarded-mark - JDK-8258244 - JDK-8258239 Changes: https://git.openjdk.java.net/jdk/pull/1772/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1772&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258244 Stats: 20 lines in 1 file changed: 0 ins; 14 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/1772.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1772/head:pull/1772 PR: https://git.openjdk.java.net/jdk/pull/1772 From rkennke at openjdk.java.net Tue Dec 15 11:25:55 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 15 Dec 2020 11:25:55 GMT Subject: RFR: 8258244: Shenandoah: Not expecting forwarded object in roots during mark after JDK-8240868 In-Reply-To: References: Message-ID: On Mon, 14 Dec 2020 19:53:13 GMT, Zhengyu Gu wrote: > This is a cleanup, no forwarded objects are expected in roots during mark phase after JDK-8240868. > > There may be forwarded objects during full gc marking, if it is upgraded from degenerated GC, but roots are fixed before full gc marking happens. > > - [x] hotspot_gc_shenandoah Yes, that makes sense and looks good. Thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1772 From zgu at openjdk.java.net Tue Dec 15 13:24:58 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 15 Dec 2020 13:24:58 GMT Subject: Integrated: 8258244: Shenandoah: Not expecting forwarded object in roots during mark after JDK-8240868 In-Reply-To: References: Message-ID: <6-4dBXmv0rDAW9Vx0P7RmUx_7bhZkM8R1dDmeDuaqGI=.45878cf1-98eb-4ad7-8c02-24f4bfe2f2f9@github.com> On Mon, 14 Dec 2020 19:53:13 GMT, Zhengyu Gu wrote: > This is a cleanup, no forwarded objects are expected in roots during mark phase after JDK-8240868. 
> > There may be forwarded objects during full gc marking, if it is upgraded from degenerated GC, but roots are fixed before full gc marking happens. > > - [x] hotspot_gc_shenandoah This pull request has now been integrated. Changeset: a372be4b Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/a372be4b Stats: 20 lines in 1 file changed: 0 ins; 14 del; 6 mod 8258244: Shenandoah: Not expecting forwarded object in roots during mark after JDK-8240868 Reviewed-by: rkennke ------------- PR: https://git.openjdk.java.net/jdk/pull/1772 From akozlov at openjdk.java.net Tue Dec 15 14:35:17 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 14:35:17 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v9] In-Reply-To: References: Message-ID: <95S7DbygBemN4yE6wxka4-ETsFJEm4XES-9o6P8Kl78=.3a804e4f-8cbb-4c0e-b58d-6e95c71e511b@github.com> > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. > > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). 
> > Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produces a working bundle. > > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib
> [3]
> <?xml version="1.0" encoding="UTF-8"?>
> <plist version="1.0">
> <dict>
>   <key>com.apple.security.cs.allow-jit</key>
>   <true/>
>   <key>com.apple.security.cs.disable-library-validation</key>
>   <true/>
>   <key>com.apple.security.cs.allow-dyld-environment-variables</key>
>   <true/>
> </dict>
> </plist>
Anton Kozlov has updated the pull request incrementally with four additional commits since the last revision: - Use exec in os for consistency - Simplify virtualspace - Add exec to attempt_reserve_at - Add default value for uncommit ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/31fe1fb0..ec32e144 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=07-08 Stats: 109 lines in 16 files changed: 17 ins; 23 del; 69 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294 PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 14:53:11 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 14:53:11 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: References: Message-ID: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions.
> > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect; this should make no difference compared with old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). > > Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produces a working bundle. > > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib
> [3]
> <?xml version="1.0" encoding="UTF-8"?>
> <plist version="1.0">
> <dict>
>   <key>com.apple.security.cs.allow-jit</key>
>   <true/>
>   <key>com.apple.security.cs.disable-library-validation</key>
>   <true/>
>   <key>com.apple.security.cs.allow-dyld-environment-variables</key>
>   <true/>
> </dict>
> </plist>
Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update pd_commit_memory on bsd ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/294/files - new: https://git.openjdk.java.net/jdk/pull/294/files/ec32e144..e40337e6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=294&range=08-09 Stats: 11 lines in 1 file changed: 9 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/294.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/294/head:pull/294
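For readers skimming the archive, the reserve/commit/uncommit scheme the RFR describes can be sketched with plain POSIX calls. This is a simplified, hypothetical illustration, not the patch under review: the function names are invented, MAP_JIT exists only on macOS (stubbed to 0 elsewhere so the sketch compiles on other systems), and MADV_FREE falls back to MADV_DONTNEED where headers lack it.

```cpp
#include <sys/mman.h>
#include <cstddef>

#ifndef MAP_JIT
#define MAP_JIT 0                // macOS-only flag; no-op on other POSIX systems
#endif
#ifndef MADV_FREE
#define MADV_FREE MADV_DONTNEED  // fallback for older headers
#endif

// Reserve address space. MAP_JIT has to be passed at reservation time,
// because macOS rejects a later remap with MAP_FIXED|MAP_JIT.
static char* reserve(size_t size, bool exec) {
  int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
  if (exec) flags |= MAP_JIT;
  void* p = ::mmap(nullptr, size, PROT_NONE, flags, -1, 0);
  return p == MAP_FAILED ? nullptr : (char*)p;
}

// Commit: only unlock the reserved pages with mprotect; never mmap over
// a MAP_JIT mapping.
static bool commit(char* addr, size_t size, bool exec) {
  int prot = PROT_READ | PROT_WRITE | (exec ? PROT_EXEC : 0);
  return ::mprotect(addr, size, prot) == 0;
}

// Uncommit executable memory: an overlapping remap would need
// MAP_FIXED|MAP_JIT, so instead let the kernel reclaim the pages lazily
// and lock them again.
static bool uncommit_exec(char* addr, size_t size) {
  ::madvise(addr, size, MADV_FREE);  // advisory; failure is tolerable here
  return ::mprotect(addr, size, PROT_NONE) == 0;
}
```

As the thread notes, MADV_FREE only permits reclamation; the pages may not disappear from diagnostic tools like ps right away.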
From akozlov at openjdk.java.net Tue Dec 15 15:31:58 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 15:31:58 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v8] In-Reply-To: References: Message-ID: On Sat, 12 Dec 2020 09:29:08 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update style > > Hi Anton, > > 1) can you please make the executable parameter on os::uncommit() default false? Let's minimize the impact for the vast majority of callers which do not need protected memory. The only caller needing this is ReservedSpace. Would also be more in sync with the default false for executable on os::reserve(). > > 2) Personal nit, I really find this `ExecMem` jarring. We don't do this (pass named aliases for boolean flags) for any other arguments AFAICS. The usual way to emphasize arg names is with comments: > bool result = os::commit_memory(base, size, /*exec*/ false); > If you still prefer to use it, could you leave at least those places unchanged which are unaffected by your patch? > > 3) There is some code explicitly dealing with the supposed inability of attempt_reserve_memory_at() using MAP_JIT. I don't understand. I thought the problem was that MAP_JIT and MAP_FIXED don't mix. But attempt_reserve_memory_at() does explicitly not use MAP_FIXED, it just attempts to map at a non-null wish address. Does that also not work with MAP_JIT? > > 4) For my taste there are too many unrelated changes, especially in ReservedSpace. Making all those file scope static helpers members of ReservedSpace causes a lot of diffs. Makes sense cleanup-wise but will make backporting your patch more difficult later (I expect this will be a strong candidate for backporting). Please tone down the patch a bit. I pointed some parts out directly below. Beyond those, I leave it up to you how far you minimize the patch.
> > Thanks, Thomas Hi Thomas, Thank you for review! > 1. can you please make the executable parameter on os::uncommit() default false? Ok, fixed. > 2. Personal nit, I really find this `ExecMem` jarring. > If you still prefer to use it, could you leave at least those places unchanged which are unaffected by your patch? The only two places where I had to replace false with !ExecMem are https://github.com/openjdk/jdk/pull/294/files#diff-80d6a105c4da7337cbc8c4602c8a1582c6a1beb771797e1a84928bd864afe563R105 It would be really inconsistent to maintain them as char* base = os::reserve_memory(size, !ExecMem, mtThreadStack); bool result = os::commit_memory(base, size, false); (Please note that the false without comment didn't obey any style, so I had to fix these anyway) > 3. There is some code explicitly dealing with the supposed inability of attempt_reserve_memory_at() using MAP_JIT. I don't understand. I thought the problem was that MAP_JIT and MAP_FIXED don't mix. But attempt_reserve_memory_at() does explicitly not use MAP_FIXED, it just attempts to map at a non-null wish address. Does that also not work with MAP_JIT? Interesting and funny, it does work. I would have expected it not to (due to security reasons I could imagine behind forbidding MAP_JIT|MAP_FIXED). But since it works for now, I've added the executable parameter to attempt_reserve_memory_at as well. > 4. For my taste there are too many unrelated changes, especially in ReservedSpace. Making all those file scope static helpers members of ReservedSpace causes a lot of diffs. Makes sense cleanup-wise but will make backporting your patch more difficult later (I expect this will be a strong candidate for backporting). Please tone down the patch a bit. I pointed some parts out directly below. Beyond those, I leave it up to you how far you minimize the patch. Ok, it is possible to do the clean up after. Thanks for your comment, I'll account for them in the future clean-up RFR.
Thanks, Anton ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 15:36:57 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 15:36:57 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v8] In-Reply-To: References: Message-ID: On Sat, 12 Dec 2020 07:52:45 GMT, Thomas Stuefe wrote: >> src/hotspot/os/bsd/os_bsd.cpp line 1690: >> >>> 1688: if (::mprotect(addr, size, prot) == 0) { >>> 1689: return true; >>> 1690: } >> >> You need to handle mprotect failure here too. Probably just by returning false. There is no point in doing the mmap below as fallback. The same applies for the OpenBSD path too. >> >> mprotect may, at least on Linux, fail if the new mapping introduced by changing the protection would bring the process above the system limit for number of mappings. I strongly believe there must be a similar error scenario on Mac. At least on BSD there is (https://man.openbsd.org/mprotect.2), see ENOMEM. > > Also, this is asymmetric to uncommit now for the !exec case. There, we mmap(MAP_NORESERVE, PROT_NONE). We have established that MAP_NORESERVE is a noop, so this would be probably fine. Still, I'd do the mmap(PROT_RW) for commit instead for !exec: > > if (exec) { > // Do not replace MAP_JIT mappings, see JDK-8234930 > return mprotect() == 0; > } else { > mmap ... > } > If not, I would remove MAP_NORESERVE from this code. > You need to handle mprotect failure here too. They are handled later https://github.com/openjdk/jdk/pull/294/files#diff-1f93205c2e57bee432f8fb7a0725ba1dfdbe5b901ac63010ea0b43922e34ac12R1708 > Also, this is asymmetric to uncommit now for the !exec case. Still, I'd do the mmap(PROT_RW) for commit instead for !exec: Thanks, this looks good, I've applied the suggestion.
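To make the suggested shape concrete: a hypothetical commit function along the lines Thomas sketches might look as follows. This is illustrative only — the name and exact flags are placeholders, not the code under review — and an mprotect failure is propagated rather than papered over with an mmap fallback.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Commit sketch: executable memory is only ever unlocked with mprotect
// (never remapped, to keep a macOS MAP_JIT reservation intact), while
// non-executable memory is re-mmapped read-write over the reservation.
// An mprotect failure (e.g. ENOMEM when splitting mappings would exceed
// the per-process mapping limit) is reported to the caller.
static bool commit_sketch(char* addr, size_t size, bool exec) {
  if (exec) {
    int prot = PROT_READ | PROT_WRITE | PROT_EXEC;
    return ::mprotect(addr, size, prot) == 0;
  } else {
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED;
    return ::mmap(addr, size, PROT_READ | PROT_WRITE, flags, -1, 0) != MAP_FAILED;
  }
}
```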
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Tue Dec 15 16:09:00 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 15 Dec 2020 16:09:00 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: References: Message-ID: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> On Tue, 15 Dec 2020 14:53:11 GMT, Anton Kozlov wrote: >> Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html >> >> On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory have to provide MAP_JIT for mmap(NULL, PROT_NONE), the function was made aware of exec permissions. >> >> For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, this should make no difference compared with old code. >> >> For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as codecache does not shrink (if I haven't missed anything, by the implementation and in principle). >> >> Tested: >> * local tier1 >> * jdk-submit >> * codesign[2] with hardened runtime and allow-jit but without >> allow-unsigned-executable-memory entitlements[3] produce a working bundle. 
>> >> (adding GC group as suggested by @dholmes-ora) >> >> >> [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 >> [2] >> >> codesign \ >> --sign - \ >> --options runtime \ >> --entitlements ents.plist \ >> --timestamp \ >> $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib >> [3] >> >> >> >> >> com.apple.security.cs.allow-jit >> >> com.apple.security.cs.disable-library-validation >> >> com.apple.security.cs.allow-dyld-environment-variables >> >> >> > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update pd_commit_memory on bsd > Hi Thomas, > > Thank you for review! > > > 1. can you please make the executable parameter on os::uncommit() default false? > > Ok, fixed. Thanks! > > > 1. Personal nit, I really find this `ExecMem` jarring. > > If you still prefer to use it, could you leave at least those places unchanged which are unaffected by your patch? > > The only two places where I had to replace false with !ExecMem are https://github.com/openjdk/jdk/pull/294/files#diff-80d6a105c4da7337cbc8c4602c8a1582c6a1beb771797e1a84928bd864afe563R105 > > It would be really inconsistent to maintain them as > > ``` > char* base = os::reserve_memory(size, !ExecMem, mtThreadStack); > bool result = os::commit_memory(base, size, false); > ``` > > (Please note that the false without comment didn't obey any style, so I had to fix these anyway) Okay. > > > 1. There is some code explicitly dealing with the supposed inability of attempt_reserve_memory_at() using MAP_JIT. I don't understand. I thought the problem was thjat MAP_JIT and MAP_FIXED don't mix. But attempt_reserve_memory_at() does explicitely not use MAP_FIXED, it just attempts to map at a non-null wish address. Does that also not work with MAP_JIT? > > Interesting and funny, it does work. I would expected it does not (due to security reasons I could image behind forbidding MAP_JIT|MAP_FIXED). 
But since it works for now, I've added the executable parameter to attemp_reserve_memory_at as well. I am not that surprised. Wish address != NULL with MAP_FIXED=0 just establishes a brand new mapping, does not change an existing mapping. I think they just forbid to modify existing mappings with MAP_JIT once established. > > > 1. For my taste there are too many unrelated changes, especially in ReservedSpace. Making all those file scope static helpers members of ReservedSpace causes a lot of diffs. Makes sense cleanup-wise but will make backporting your patch more difficult later (I expect this will be a strong candidate for backporting). Please tone down the patch a bit. I pointed some parts out directly below. Beyond those, I leave it up to you how far you minimize the patch. > > Ok, it is possible to do the clean up after. Thanks for your comment, I'll account them in the future clean-up RFR. > Great, thanks, the change looks much cleaner now. > Thanks, > Anton There are two remaining very minor nits, I leave it up to you if you fix them. From my eyes this is fine. ----- I think we should shake up that coding at some point and improve the API. I imagine something along the line of struct mappinginfo_t { pagesize, // may be dynamically chosen by the OS layer but caller wants to know exec, base, size, ... // maybe + maybe opaque OS specific information }; address os::reserve_xxxx(size, ..., reservation_info_t* p_info = NULL); bool os::commit_memory(addr, size, const reservation_info_t* info); bool os::commit_memory(addr, size, const reservation_info_t* info); The concrete form can be shaped however, but the base idea is to return more information than just the reservation pointer: - some attributes may be chosen or adapted by os::reserve_... (eg PageSize), but caller could just be told to spare him having to second-guess os::reserve_memory_special(). 
- some attributes may be only interesting to the os layer, so they can be opaque for the caller - but caller could hold onto that information until commit/uncommit. Examples for that are AIX: mmap-or-shmat, or Windows: NUMA-striped-allocation or not. > - some information may be caller specified (eg exec) but returning this in a handle-like structure relieves the caller from holding on to that particular information, and having to pass each argument separately. I know this is crossing territory into what ReservedSpace does today, but that class is quite polluted with VM specific stuff, eg that noaccess zone. Well, let's see how this goes. Thanks a lot for your perseverance! Cheers, Thomas test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp line 172: > 170: const size_t num_pages = 4; > 171: const size_t size = num_pages * page_sz; > 172: char* base = os::reserve_memory(size, !ExecMem, mtTest); This test function seems to leak this reservation. I leave it up to you if you want to fix it, has nothing to do with your test. If you don't, could you please open an issue for this? (Took me some minutes to figure out that all these tests are NMT related) src/hotspot/os/windows/os_windows.cpp line 3271: > 3269: > 3270: char* os::reserve_memory_aligned(size_t size, size_t alignment, bool exec) { > 3271: // exec can be ignored Change to "Support for exec not implemented" ? (Maybe even with an assert) (Or - leave that up to you - leave this parameter out of os::reserve_memory_aligned completely.) ------------- Marked as reviewed by stuefe (Reviewer).
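Thomas's handle-based proposal could be fleshed out roughly like this. Everything here is invented for illustration — the names ReservationInfo, reserve_memory_sketch and commit_memory_sketch do not exist in HotSpot — and the reservation itself is faked with a heap allocation purely to show the API shape:

```cpp
#include <cstddef>

// Handle filled in by the reserve call; commit/uncommit take it back, so
// the caller does not have to re-supply each attribute separately.
struct ReservationInfo {
  char*  base = nullptr;
  size_t size = 0;
  size_t page_size = 0;     // may be chosen by the OS layer
  bool   executable = false;
  void*  os_data = nullptr; // opaque OS-specific detail (e.g. mmap vs. shmat)
};

static ReservationInfo reserve_memory_sketch(size_t size, bool exec) {
  ReservationInfo info;
  info.size = size;
  info.page_size = 4096;       // placeholder; a real impl queries the OS
  info.executable = exec;
  info.base = new char[size];  // stand-in for an actual mmap reservation
  return info;
}

// A real implementation would branch on info.executable / info.os_data;
// this sketch only validates the requested range against the handle.
static bool commit_memory_sketch(char* addr, size_t size,
                                 const ReservationInfo& info) {
  return addr >= info.base && addr + size <= info.base + info.size;
}
```

The design point is simply that the handle carries whatever the OS layer decided at reservation time, instead of each caller re-deriving or re-passing it on every commit/uncommit.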
PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 17:32:01 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 17:32:01 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> References: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> Message-ID: On Tue, 15 Dec 2020 15:41:43 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update pd_commit_memory on bsd > > test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp line 172: > >> 170: const size_t num_pages = 4; >> 171: const size_t size = num_pages * page_sz; >> 172: char* base = os::reserve_memory(size, !ExecMem, mtTest); > > This test function seems to leak this reservation. I leave it up to you if you want to fix it, has nothing to do with your test. If you don't could you please open a issue for this? > > (Took me some minutes to figure out that all these tests are NMT related) Right. 
Tracked in https://bugs.openjdk.java.net/browse/JDK-8258415, probably it's worth review tests for similar issues ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 17:43:02 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 17:43:02 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> References: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> Message-ID: On Tue, 15 Dec 2020 15:44:26 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update pd_commit_memory on bsd > > src/hotspot/os/windows/os_windows.cpp line 3271: > >> 3269: >> 3270: char* os::reserve_memory_aligned(size_t size, size_t alignment, bool exec) { >> 3271: // exec can be ignored > > Change to "Support for exec not implemented" ? (Maybe even with an assert) (Or - leave that up to you - leave this parameter out of os::reserve_memory_aligned completely.) Actually, since exec and non-exec reservations are equal on windows, ignoring exec is a correct implementation. Assert would not fit here. Please let me know if the comment fails to deliver this message. Also, I have a prototype implementation of CDS support for macOS/AArch64 and it needs executable aligned mapping, I think we'll need this parameter anyway. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From stuefe at openjdk.java.net Tue Dec 15 17:49:59 2020 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 15 Dec 2020 17:49:59 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: References: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> Message-ID: On Tue, 15 Dec 2020 17:40:39 GMT, Anton Kozlov wrote: >> src/hotspot/os/windows/os_windows.cpp line 3271: >> >>> 3269: >>> 3270: char* os::reserve_memory_aligned(size_t size, size_t alignment, bool exec) { >>> 3271: // exec can be ignored >> >> Change to "Support for exec not implemented" ? (Maybe even with an assert) (Or - leave that up to you - leave this parameter out of os::reserve_memory_aligned completely.) > > Actually, since exec and non-exec reservations are equal on windows, ignoring exec is a correct implementation. Assert would not fit here. Please let me know if the comment fails to deliver this message. Also, I have a prototype implementation of CDS support for macOS/AArch64 and it needs executable aligned mapping, I think we'll need this parameter anyway. Okay, leave it as it is. 
------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 18:03:00 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 18:03:00 GMT Subject: RFR: 8234930: Use MAP_JIT when allocating pages for code cache on macOS [v10] In-Reply-To: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> References: <2i6S1b7ul-EJ9WyiCHGIAAiwPcnybhFQ1cAYhd52EQU=.fd1905d3-af3d-451d-885d-2f5758603a5f@github.com> Message-ID: <2aPTSZ_1FPGC9oQPQ-CSfxm7C6QfE6PcgF_zOxdwSjU=.b3767b8c-8e27-4317-9b7c-35ed85df58fc@github.com> On Tue, 15 Dec 2020 16:06:23 GMT, Thomas Stuefe wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update pd_commit_memory on bsd > >> Hi Thomas, >> >> Thank you for review! >> >> > 1. can you please make the executable parameter on os::uncommit() default false? >> >> Ok, fixed. > > Thanks! > >> >> > 1. Personal nit, I really find this `ExecMem` jarring. >> > If you still prefer to use it, could you leave at least those places unchanged which are unaffected by your patch? >> >> The only two places where I had to replace false with !ExecMem are https://github.com/openjdk/jdk/pull/294/files#diff-80d6a105c4da7337cbc8c4602c8a1582c6a1beb771797e1a84928bd864afe563R105 >> >> It would be really inconsistent to maintain them as >> >> ``` >> char* base = os::reserve_memory(size, !ExecMem, mtThreadStack); >> bool result = os::commit_memory(base, size, false); >> ``` >> >> (Please note that the false without comment didn't obey any style, so I had to fix these anyway) > > Okay. > >> >> > 1. There is some code explicitly dealing with the supposed inability of attempt_reserve_memory_at() using MAP_JIT. I don't understand. I thought the problem was thjat MAP_JIT and MAP_FIXED don't mix. But attempt_reserve_memory_at() does explicitely not use MAP_FIXED, it just attempts to map at a non-null wish address. 
Does that also not work with MAP_JIT? >> >> Interesting and funny, it does work. I would have expected it not to (due to security reasons I could imagine behind forbidding MAP_JIT|MAP_FIXED). But since it works for now, I've added the executable parameter to attempt_reserve_memory_at as well. > > I am not that surprised. Wish address != NULL with MAP_FIXED=0 just establishes a brand new mapping, does not change an existing mapping. I think they just forbid modifying existing mappings with MAP_JIT once established. > >> >> > 1. For my taste there are too many unrelated changes, especially in ReservedSpace. Making all those file scope static helpers members of ReservedSpace causes a lot of diffs. Makes sense cleanup-wise but will make backporting your patch more difficult later (I expect this will be a strong candidate for backporting). Please tone down the patch a bit. I pointed some parts out directly below. Beyond those, I leave it up to you how far you minimize the patch. >> >> Ok, it is possible to do the clean up after. Thanks for your comment, I'll account for them in the future clean-up RFR. >> > > Great, thanks, the change looks much cleaner now. > >> Thanks, >> Anton > > There are two remaining very minor nits, I leave it up to you if you fix them. To my eyes this is fine. > > ----- > > I think we should shake up that coding at some point and improve the API. I imagine something along the lines of > > struct mappinginfo_t { > pagesize, // may be dynamically chosen by the OS layer but caller wants to know > exec, > base, size, ... 
// maybe > + maybe opaque OS specific information > }; > address os::reserve_xxxx(size, ..., reservation_info_t* p_info = NULL); > bool os::commit_memory(addr, size, const reservation_info_t* info); > bool os::commit_memory(addr, size, const reservation_info_t* info); > > The concrete form can be shaped however, but the base idea is to return more information than just the reservation pointer: > - some attributes may be chosen or adapted by os::reserve_... (eg PageSize), but caller could just be told to spare him having to second-guess os::reserve_memory_special(). > - some attributes may be only interesting to the os layer, so they can be opaque for the caller - but caller could hold onto that information until commit/uncommit. Examples for that are AIX: mmap-or-shmat , or Windows: NUMA-striped-allocation or not. > - some information may be caller specified (eg exec) but returning this in a handle-like structure relieves the caller from holding on to that particular information, and having to pass each argument separately- > > I know this is crossing territory into what ReservedSpace does today, but that class is quite polluted with VM specific stuff, eg that noaccess zone. Well, lets see how this goes. > > Thanks alot for your perseverance! > > Cheers, Thomas Thomas, thank you very much for all the comments, insights, and all of your time. 
I think I have a good queue of future enhancements from various approaches we've discussed and tried, small and big ones :) ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From akozlov at openjdk.java.net Tue Dec 15 18:46:57 2020 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 15 Dec 2020 18:46:57 GMT Subject: Integrated: 8234930: Use MAP_JIT when allocating pages for code cache on macOS In-Reply-To: References: Message-ID: On Tue, 22 Sep 2020 07:08:35 GMT, Anton Kozlov wrote: > Please review an updated RFR from https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/041463.html > > On macOS, MAP_JIT cannot be used with MAP_FIXED[1]. So pd_reserve_memory has to provide MAP_JIT for mmap(NULL, PROT_NONE); the function was made aware of exec permissions. > > For executable and data regions, pd_commit_memory only unlocks the memory with mprotect, so this should make no difference compared with the old code. > > For data regions, pd_uncommit_memory still uses a new overlapping anonymous mmap which returns pages to the OS and immediately reflects this in diagnostic tools like ps. For executable regions it would require MAP_FIXED|MAP_JIT, so instead madvise(MADV_FREE)+mprotect(PROT_NONE) are used. They should also allow the OS to reclaim pages, but apparently this does not happen immediately. In practice, it should not be a problem for executable regions, as the codecache does not shrink (if I haven't missed anything, by the implementation and in principle). > > Tested: > * local tier1 > * jdk-submit > * codesign[2] with hardened runtime and allow-jit but without > allow-unsigned-executable-memory entitlements[3] produces a working bundle. 
> > (adding GC group as suggested by @dholmes-ora) > > > [1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_mman.c#L227 > [2] > > codesign \ > --sign - \ > --options runtime \ > --entitlements ents.plist \ > --timestamp \ > $J/bin/* $J/lib/server/*.dylib $J/lib/*.dylib > [3] > > <?xml version="1.0" encoding="UTF-8"?> > <plist version="1.0"> > <dict> > <key>com.apple.security.cs.allow-jit</key> > <true/> > <key>com.apple.security.cs.disable-library-validation</key> > <true/> > <key>com.apple.security.cs.allow-dyld-environment-variables</key> > <true/> > </dict> > </plist> This pull request has now been integrated. Changeset: 2273f955 Author: Anton Kozlov Committer: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/2273f955 Stats: 85 lines in 11 files changed: 26 ins; 0 del; 59 mod 8234930: Use MAP_JIT when allocating pages for code cache on macOS Reviewed-by: stuefe, iklam, burban ------------- PR: https://git.openjdk.java.net/jdk/pull/294 From github.com+168222+mgkwill at openjdk.java.net Tue Dec 15 18:48:05 2020 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 15 Dec 2020 18:48:05 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v15] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes the code heap from using > large pages in this circumstance; when os::Linux::reserve_memory_special_huge_tlbfs* is called, page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than the number of bytes being reserved. 
Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Remove extraneous ' from warning Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Fix os::large_page_size() in last update Signed-off-by: Marcus G K Williams - Ivan W. Requested Changes Removed os::Linux::select_large_page_size and use os::page_size_for_region instead Removed Linux::find_large_page_size and use register_large_page_sizes. Streamlined Linux::setup_large_page_size Signed-off-by: Marcus G K Williams - Fix space format, use Linux:: for local func. Signed-off-by: Marcus G K Williams - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp - ... and 13 more: https://git.openjdk.java.net/jdk/compare/da2415fe...d73e7a4c ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=14 Stats: 63 lines in 2 files changed: 24 ins; 11 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From kbarrett at openjdk.java.net Wed Dec 16 14:11:10 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 16 Dec 2020 14:11:10 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region Message-ID: Please review this change to ParallelGC oldgen allocation, adding a missing memory barrier. The problem arises in the interaction between concurrent oldgen allocations, where each would, if done serially (in either order), require expansion of the generation. 
An allocation of size N compares the mutable space's (end - top) with N to determine if space is available. If available, use top as the start of the object of size N (adjusting top atomically) and assert the resulting memory region is in the covered area. If not, then expand. Expansion updates the covered region, then updates the space (i.e. end). There is currently no memory barrier between those operations. As a result, we can have thread1 having done an expansion, updating the covered region and the space end. Because there's no memory barrier there, the space end may be updated before the covered region as far as some other thread is concerned. Meanwhile thread2's allocation reads the new end and goes ahead with the allocation (which would not have fit with the old end value), then fails the covered region check because it used the old covered range. Although the reads of end and the covered range are ordered here by the intervening CAS of top, that doesn't help if the writes by thread1 are not also properly ordered. There is even a comment about this in PSOldGen::post_resize(), saying the space update must be last (including after the covered region update). But without a memory barrier, there's nothing other than source order to ensure that ordering. So add a memory barrier. I'm not sure whether this out-of-order update of the space end could lead to problems in a product build (where the assert doesn't apply). Without looking carefully, there appear to be opportunities for problems, such as accessing uncovered parts of the card table. There's another issue that I'm not addressing with this change. Various values are being read while subject to concurrent writes, without being in any way tagged as atomic. (The writes are under the ExpandHeap_lock, the reads are not.) This includes at least the covering region bounds and space end. Testing: mach5 tier1 I was unable to reproduce the failure, so can't show any before / after improvement. 
------------- Commit messages: - add memory barrier Changes: https://git.openjdk.java.net/jdk16/pull/35/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=35&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257999 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk16/pull/35.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/35/head:pull/35 PR: https://git.openjdk.java.net/jdk16/pull/35 From zgu at openjdk.java.net Wed Dec 16 17:40:04 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 16 Dec 2020 17:40:04 GMT Subject: RFR: 8258490: Shenandoah: Full GC does not need to remark threads and drain SATB buffers Message-ID: Full GC marks heap at a pause with SATB deactivated, therefore, we don't need to remark threads and drain SATB buffers during final mark phase. - [x] hotspot_gc_shenandoah ------------- Commit messages: - Silent MacOSX build - JDK-8258490 Changes: https://git.openjdk.java.net/jdk/pull/1805/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1805&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258490 Stats: 45 lines in 1 file changed: 22 ins; 15 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/1805.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1805/head:pull/1805 PR: https://git.openjdk.java.net/jdk/pull/1805 From zgu at openjdk.java.net Wed Dec 16 19:04:09 2020 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 16 Dec 2020 19:04:09 GMT Subject: RFR: 8258490: Shenandoah: Full GC does not need to remark threads and drain SATB buffers [v2] In-Reply-To: References: Message-ID: > Full GC marks heap at a pause with SATB deactivated, therefore, we don't need to remark threads and drain SATB buffers during final mark phase. 
> > - [x] hotspot_gc_shenandoah Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Minor update ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1805/files - new: https://git.openjdk.java.net/jdk/pull/1805/files/7178b3b0..602347da Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1805&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1805&range=00-01 Stats: 3 lines in 1 file changed: 1 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1805.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1805/head:pull/1805 PR: https://git.openjdk.java.net/jdk/pull/1805 From cgracie at openjdk.java.net Wed Dec 16 19:19:04 2020 From: cgracie at openjdk.java.net (Charlie Gracie) Date: Wed, 16 Dec 2020 19:19:04 GMT Subject: RFR: 8257774: G1: Trigger collect when free region count drops below threshold to prevent evacuation failures In-Reply-To: References: Message-ID: On Thu, 10 Dec 2020 11:58:26 GMT, Thomas Schatzl wrote: >> Bursts of short lived Humongous object allocations can cause GCs to be initiated with 0 free regions. When these GCs happen they take significantly longer to complete. No objects are evacuated so there is a large amount of time spent in reversing self forwarded pointers and the only memory recovered is from the short lived humongous objects. My proposal is to add a check to the slow allocation path which will force a GC to happen if the number of free regions drops below the amount that would be required to complete the GC if it happened at that moment. The threshold will be based on the survival rates from Eden and survivor spaces along with the space required for Tenure space evacuations. >> >> The goal is to resolve the issue with bursts of short lived humongous objects without impacting other workloads negatively. I would appreciate reviews and any feedback that you might have. Thanks. 
>> >> Here are the links to the threads on the mailing list where I initially discussed the issue and my idea to resolve it: >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-November/032189.html >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-December/032677.html > > test/hotspot/jtreg/gc/g1/TestGCLogMessages.java line 316: > >> 314: private static byte[] garbage; >> 315: private static byte[] largeObject; >> 316: private static Object[] holder = new Object[800]; // Must be larger than G1EvacuationFailureALotCount > > Just curious about these changes: it is not immediately obvious to me why they are necessary, as the mechanism to force evacuation failure (G1EvacuationFailureALotCount et al) should be independent of these changes. > > And the 17MB (for the humongous object) + 16MB of garbage should be enough for at least one gc; but maybe these changes trigger an early gc? Yes, the GC was being triggered earlier and not getting the evacuation failure. With these adjustments it consistently gets an evacuation failure. I was seeing this with my initial prototype, so I will verify that it is still required. > src/hotspot/share/gc/g1/g1VMOperations.hpp line 71: > >> 69: class VM_G1CollectForAllocation : public VM_CollectForAllocation { >> 70: bool _gc_succeeded; >> 71: bool _force_gc; > > Not completely happy about using an extra flag for these forced GCs here; also, this makes them indistinguishable from other GCs in the logs as far as I can see. > What do you think about adding GCCause(s) instead to automatically make them stand out in the logs? > Or at least make sure that they stand out in the logs for debugging issues. I will remove the _force_gc flag and add a Preemptive GCCause. 
> src/hotspot/share/gc/g1/g1Policy.hpp line 102: > >> 100: >> 101: size_t _predicted_survival_bytes_from_survivor; >> 102: size_t _predicted_survival_bytes_from_old; > > As a non-English native speaker I think "survival_bytes" is strange as "survival" isn't an adjective. Maybe "surviving_bytes" sounds better? > The code in `calculate_required_regions_for_next_collect` uses "survivor" btw. I would still prefer "surviving" in some way as it differs from the "survivor" in "survivor regions", but let's keep nomenclature at least consistent. > > Also please add a comment what these are used for. I will change the names and properly comment them. > src/hotspot/share/gc/g1/g1Policy.hpp line 368: > >> 366: uint& num_optional_regions); >> 367: >> 368: bool can_mutator_consume_free_regions(uint region_count); > > Comments missing. I will add a comment in my next revision > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 423: > >> 421: for (uint try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) { >> 422: bool should_try_gc; >> 423: bool force_gc = false; > > `force_gc` and `should_try_gc` seems to overlap a bit here. At least the naming isn't perfect because we may not do a gc even if `force_gc` is true which I'd kind of expect. > > I do not have a good new name right now how to fix this. It will be removed as part of my next round of changes > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 428: > >> 426: { >> 427: MutexLocker x(Heap_lock); >> 428: if (policy()->can_mutator_consume_free_regions(1)) { > > I would prefer if `force_gc` (or whatever name it will have) would be set here unconditionally as the `else` is pretty far away here. > > I.e. > force_gc = policy()->can_mutator_consume_free_regions(1); > > if (force_gc) { // needing to use the name "force_gc" here shows that the name is wrong... > ... try allocation > ... 
check if we should expand young gen beyond regular size due to GCLocker > } > The other issue I have with using `can_mutator_consume_free_regions()` here is that there is already a very similar `G1Policy::should_allocate_mutator_region`; and anyway, the `attempt_allocation_locked` call may actually succeed without requiring a new region (actually, it is not uncommon that another thread got a free region while trying to take the `Heap_lock`). > > I think a better place for `can_mutator_consume_free_regions()` is in `G1Policy::should_allocate_mutator_region()` for this case. > > `attempt_allocation_locked` however does not return a reason for why allocation failed (at the moment). Maybe it is better to let it return a tuple with result and reason (or a second "out" parameter)? (I haven't tried how this would look, but it seems worth trying and better than the current way of handling this). > > This one could be used in the following code. For this code I am investigating moving the check into either 'G1Policy::should_allocate_mutator_region()' or to the caller of 'G1Policy::should_allocate_mutator_region()' so I can distinguish the reason why it failed to allocate. Hopefully I have something ready to push this week. I am on vacation so my responses are a little slow. ------------- PR: https://git.openjdk.java.net/jdk/pull/1650 From github.com+13173904+lhtin at openjdk.java.net Thu Dec 17 02:09:01 2020 From: github.com+13173904+lhtin at openjdk.java.net (Tintin) Date: Thu, 17 Dec 2020 02:09:01 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes Message-ID: Hi all, CLion IDE shows two warnings of unused includes (`#include "utilities/macros.hpp"`) in EpsilonGC's code. These can probably be removed. 
Testing: macosx-x86_64-server-{release,fastdebug,slowdebug} ------------- Commit messages: - Epsilon: clean up unused includes Changes: https://git.openjdk.java.net/jdk/pull/1745/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1745&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258534 Stats: 2 lines in 2 files changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1745.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1745/head:pull/1745 PR: https://git.openjdk.java.net/jdk/pull/1745 From shade at openjdk.java.net Thu Dec 17 02:09:01 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 17 Dec 2020 02:09:01 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes In-Reply-To: References: Message-ID: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> On Fri, 11 Dec 2020 05:03:56 GMT, Tintin wrote: > Hi all, > > CLion IDE shows two warnings of unused includes(`#include "utilities/macros.hpp"`) in EpsilonGC's code. these maybe can be removed. > > Testing: macosx-x86_64-server-{release,fastdebug,slowdebug} Unfortunately, CLion makes the incorrect call here. `macros.hpp` is included to get access to `COMPILER1` and `COMPILER2` macros. So we cannot really remove that `#include`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1745 From github.com+13173904+lhtin at openjdk.java.net Thu Dec 17 02:09:02 2020 From: github.com+13173904+lhtin at openjdk.java.net (Tintin) Date: Thu, 17 Dec 2020 02:09:02 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes In-Reply-To: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> References: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> Message-ID: <0sHy0zJ1AXh9KMt1rsSzbhubJ4TwCMufOHEr89mLL6Q=.23bd199a-83d8-4a43-ad4f-49b51d8c28d2@github.com> On Fri, 11 Dec 2020 17:28:24 GMT, Aleksey Shipilev wrote: >> Hi all, >> >> CLion IDE shows two warnings of unused includes(`#include "utilities/macros.hpp"`) in EpsilonGC's code. these maybe can be removed. >> >> Testing: macosx-x86_64-server-{release,fastdebug,slowdebug} > > Unfortunately, CLion makes the incorrect call here. `macros.hpp` is included to get access to `COMPILER1` and `COMPILER2` macros. So we cannot really remove that `#include`. Thank you for your review. When CLion shows the two warnings, I tried to find the definitions of the `COMPILER1` and `COMPILER2` macros in the `utilities/macros.hpp` file but could not. Are the two macros perhaps added to `utilities/macros.hpp` when building? My understanding is that the two macros come from `C_FLAGS` (`-DCOMPILER1`, `-DCOMPILER2`) at build time. 
------------- PR: https://git.openjdk.java.net/jdk/pull/1745 From jiefu at openjdk.java.net Thu Dec 17 02:46:57 2020 From: jiefu at openjdk.java.net (Jie Fu) Date: Thu, 17 Dec 2020 02:46:57 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes In-Reply-To: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> References: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> Message-ID: On Fri, 11 Dec 2020 17:28:24 GMT, Aleksey Shipilev wrote: > Unfortunately, CLion makes the incorrect call here. `macros.hpp` is included to get access to `COMPILER1` and `COMPILER2` macros. So we cannot really remove that `#include`. Hi @shipilev , Build tests passed on our Linux/x64 machines with this patch. So I think the change is fine. Am I missing something? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/1745 From sjohanss at openjdk.java.net Thu Dec 17 08:47:57 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 08:47:57 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: On Wed, 16 Dec 2020 14:05:13 GMT, Kim Barrett wrote: > Please review this change to ParallelGC oldgen allocation, adding a missing > memory barrier. > > The problem arises in the interaction between concurrent oldgen allocations, > where each would, if done serially (in either order), require expansion of > the generation. > > An allocation of size N compares the mutable space's (end - top) with N to > determine if space is available. If available, use top as the start of the > object of size N (adjusting top atomically) and assert the resulting memory > region is in the covered area. If not, then expand. > > Expansion updates the covered region, then updates the space (i.e. end). > There is currently no memory barrier between those operations. 
> > As a result, we can have thread1 having done an expansion, updating the > covered region and the space end. Because there's no memory barrier there, > the space end may be updated before the covered region as far as some other > thread is concerned. > > Meanwhile thread2's allocation reads the new end and goes ahead with the > allocation (which would not have fit with the old end value), then fails the > covered region check because it used the old covered range. Although the > reads of end and the covered range are ordered here by the intervening CAS > of top, that doesn't help if the writes by thread1 are not also properly > ordered. > > There is even a comment about this in PSOldGen::post_resize(), saying the > space update must be last (including after the covered region update). But > without a memory barrier, there's nothing other than source order to ensure > that ordering. So add a memory barrier. > > I'm not sure whether this out-of-order update of the space end could lead to > problems in a product build (where the assert doesn't apply). Without > looking carefully, there appear to be opportunities for problems, such as > accessing uncovered parts of the card table. > > There's another issue that I'm not addressing with this change. Various > values are being read while subject to concurrent writes, without being in > any way tagged as atomic. (The writes are under the ExpandHeap_lock, the > reads are not.) This includes at least the covering region bounds and space > end. > > Testing: > mach5 tier1 > I was unable to reproduce the failure, so can't show any before / after > improvement. Looks good, just a comment about a comment that you can address if you agree. src/hotspot/share/gc/parallel/psOldGen.cpp line 385: > 383: > 384: // ALWAYS do this last!! > 385: OrderAccess::storestore(); Maybe update the comment to use less caps and '!'. 
Instead tie back to the function comment explaining that the barrier is needed to guarantee the order in which the data structures get visible to other threads. ------------- Marked as reviewed by sjohanss (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/35 From tschatzl at openjdk.java.net Thu Dec 17 09:15:56 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 17 Dec 2020 09:15:56 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: On Wed, 16 Dec 2020 14:05:13 GMT, Kim Barrett wrote: > Please review this change to ParallelGC oldgen allocation, adding a missing > memory barrier. > > The problem arises in the interaction between concurrent oldgen allocations, > where each would, if done serially (in either order), require expansion of > the generation. > > An allocation of size N compares the mutable space's (end - top) with N to > determine if space is available. If available, use top as the start of the > object of size N (adjusting top atomically) and assert the resulting memory > region is in the covered area. If not, then expand. > > Expansion updates the covered region, then updates the space (i.e. end). > There is currently no memory barrier between those operations. > > As a result, we can have thread1 having done an expansion, updating the > covered region and the space end. Because there's no memory barrier there, > the space end may be updated before the covered region as far as some other > thread is concerned. > > Meanwhile thread2's allocation reads the new end and goes ahead with the > allocation (which would not have fit with the old end value), then fails the > covered region check because it used the old covered range. Although the > reads of end and the covered range are ordered here by the intervening CAS > of top, that doesn't help if the writes by thread1 are not also properly > ordered. 
> > There is even a comment about this in PSOldGen::post_resize(), saying the > space update must be last (including after the covered region update). But > without a memory barrier, there's nothing other than source order to ensure > that ordering. So add a memory barrier. > > I'm not sure whether this out-of-order update of the space end could lead to > problems in a product build (where the assert doesn't apply). Without > looking carefully, there appear to be opportunities for problems, such as > accessing uncovered parts of the card table. > > There's another issue that I'm not addressing with this change. Various > values are being read while subject to concurrent writes, without being in > any way tagged as atomic. (The writes are under the ExpandHeap_lock, the > reads are not.) This includes at least the covering region bounds and space > end. > > Testing: > mach5 tier1 > I was unable to reproduce the failure, so can't show any before / after > improvement. Lgtm. Please adjust the comment a little as Stefan suggested :) ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/35 From kbarrett at openjdk.java.net Thu Dec 17 11:05:02 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 11:05:02 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 08:45:40 GMT, Stefan Johansson wrote: >> Please review this change to ParallelGC oldgen allocation, adding a missing >> memory barrier. >> >> The problem arises in the interaction between concurrent oldgen allocations, >> where each would, if done serially (in either order), require expansion of >> the generation. >> >> An allocation of size N compares the mutable space's (end - top) with N to >> determine if space is available. 
If available, use top as the start of the >> object of size N (adjusting top atomically) and assert the resulting memory >> region is in the covered area. If not, then expand. >> >> Expansion updates the covered region, then updates the space (i.e. end). >> There is currently no memory barrier between those operations. >> >> As a result, we can have thread1 having done an expansion, updating the >> covered region and the space end. Because there's no memory barrier there, >> the space end may be updated before the covered region as far as some other >> thread is concerned. >> >> Meanwhile thread2's allocation reads the new end and goes ahead with the >> allocation (which would not have fit with the old end value), then fails the >> covered region check because it used the old covered range. Although the >> reads of end and the covered range are ordered here by the intervening CAS >> of top, that doesn't help if the writes by thread1 are not also properly >> ordered. >> >> There is even a comment about this in PSOldGen::post_resize(), saying the >> space update must be last (including after the covered region update). But >> without a memory barrier, there's nothing other than source order to ensure >> that ordering. So add a memory barrier. >> >> I'm not sure whether this out-of-order update of the space end could lead to >> problems in a product build (where the assert doesn't apply). Without >> looking carefully, there appear to be opportunities for problems, such as >> accessing uncovered parts of the card table. >> >> There's another issue that I'm not addressing with this change. Various >> values are being read while subject to concurrent writes, without being in >> any way tagged as atomic. (The writes are under the ExpandHeap_lock, the >> reads are not.) This includes at least the covering region bounds and space >> end. >> >> Testing: >> mach5 tier1 >> I was unable to reproduce the failure, so can't show any before / after >> improvement. 
> > Looks good, just a comment about a comment that you can address if you agree. Thanks @kstefanj and @tschatzl for reviewing. > src/hotspot/share/gc/parallel/psOldGen.cpp line 385: > >> 383: >> 384: // ALWAYS do this last!! >> 385: OrderAccess::storestore(); > > Maybe update the comment to use less caps and '!'. Instead tie back to the function comment explaining that the barrier is needed to guarantee the order in which the data structures get visible to other threads. Good idea. Here's the revised comment: - // ALWAYS do this last!! + // Ensure the space bounds are updated are made visible to other + // threads after the other data structures have been resized. OrderAccess::storestore(); ------------- PR: https://git.openjdk.java.net/jdk16/pull/35 From sjohanss at openjdk.java.net Thu Dec 17 11:19:56 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 11:19:56 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 11:01:33 GMT, Kim Barrett wrote: >> src/hotspot/share/gc/parallel/psOldGen.cpp line 385: >> >>> 383: >>> 384: // ALWAYS do this last!! >>> 385: OrderAccess::storestore(); >> >> Maybe update the comment to use less caps and '!'. Instead tie back to the function comment explaining that the barrier is needed to guarantee the order in which the data structures get visible to other threads. > > Good idea. Here's the revised comment: > > - // ALWAYS do this last!! > + // Ensure the space bounds are updated are made visible to other > + // threads after the other data structures have been resized. > OrderAccess::storestore(); The second "are" should be an "and", right? Otherwise looks great! 
------------- PR: https://git.openjdk.java.net/jdk16/pull/35 From shade at openjdk.java.net Thu Dec 17 12:38:57 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 17 Dec 2020 12:38:57 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes In-Reply-To: References: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> Message-ID: On Thu, 17 Dec 2020 02:44:19 GMT, Jie Fu wrote: >> Unfortunately, CLion makes the incorrect call here. `macros.hpp` is included to get access to `COMPILER1` and `COMPILER2` macros. So we cannot really remove that `#include`. > >> Unfortunately, CLion makes the incorrect call here. `macros.hpp` is included to get access to `COMPILER1` and `COMPILER2` macros. So we cannot really remove that `#include`. > > Hi @shipilev , > > Build tests passed on our Linux/x64 machines with this patch. > So I think the change is fine. > Am I missing something? > > Thanks. This looks like a cleanup and not very time-pressing, right? I'll take a look after NY holidays. ------------- PR: https://git.openjdk.java.net/jdk/pull/1745 From tschatzl at openjdk.java.net Thu Dec 17 12:58:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 17 Dec 2020 12:58:58 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 11:17:28 GMT, Stefan Johansson wrote: >> Good idea. Here's the revised comment: >> >> - // ALWAYS do this last!! >> + // Ensure the space bounds are updated are made visible to other >> + // threads after the other data structures have been resized. >> OrderAccess::storestore(); > > The second "are" should be an "and", right? Otherwise looks great! 
+1 ------------- PR: https://git.openjdk.java.net/jdk16/pull/35 From kbarrett at openjdk.java.net Thu Dec 17 13:21:03 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 13:21:03 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue Message-ID: Please review this change to the PtrQueue hierarchy, moving the support for "active" queues and qset to SATBMarkQueue[Set], which is the only user of this feature. Other classes derived from PtrQueue[Set] currently work around or ignore this feature. This change removes it from consideration entirely for those other classes. Testing: mach5 tier1 local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC In the process of doing this refactoring I noticed that some of the vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, and the G1 parts are incomplete. Filed JDK-8258581 to address that. ------------- Commit messages: - move_active Changes: https://git.openjdk.java.net/jdk/pull/1820/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1820&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258255 Stats: 78 lines in 8 files changed: 22 ins; 38 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/1820.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1820/head:pull/1820 PR: https://git.openjdk.java.net/jdk/pull/1820 From kbarrett at openjdk.java.net Thu Dec 17 14:21:14 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 14:21:14 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region [v2] In-Reply-To: References: Message-ID: > Please review this change to ParallelGC oldgen allocation, adding a missing > memory barrier. > > The problem arises in the interaction between concurrent oldgen allocations, > where each would, if done serially (in either order), require expansion of > the generation. 
> > An allocation of size N compares the mutable space's (end - top) with N to > determine if space is available. If available, use top as the start of the > object of size N (adjusting top atomically) and assert the resulting memory > region is in the covered area. If not, then expand. > > Expansion updates the covered region, then updates the space (i.e. end). > There is currently no memory barrier between those operations. > > As a result, we can have thread1 having done an expansion, updating the > covered region and the space end. Because there's no memory barrier there, > the space end may be updated before the covered region as far as some other > thread is concerned. > > Meanwhile thread2's allocation reads the new end and goes ahead with the > allocation (which would not have fit with the old end value), then fails the > covered region check because it used the old covered range. Although the > reads of end and the covered range are ordered here by the intervening CAS > of top, that doesn't help if the writes by thread1 are not also properly > ordered. > > There is even a comment about this in PSOldGen::post_resize(), saying the > space update must be last (including after the covered region update). But > without a memory barrier, there's nothing other than source order to ensure > that ordering. So add a memory barrier. > > I'm not sure whether this out-of-order update of the space end could lead to > problems in a product build (where the assert doesn't apply). Without > looking carefully, there appear to be opportunities for problems, such as > accessing uncovered parts of the card table. > > There's another issue that I'm not addressing with this change. Various > values are being read while subject to concurrent writes, without being in > any way tagged as atomic. (The writes are under the ExpandHeap_lock, the > reads are not.) This includes at least the covering region bounds and space > end. 
> > Testing: > mach5 tier1 > I was unable to reproduce the failure, so can't show any before / after > improvement. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into shrink_heap_crash - stefanj review - add memory barrier ------------- Changes: - all: https://git.openjdk.java.net/jdk16/pull/35/files - new: https://git.openjdk.java.net/jdk16/pull/35/files/8a5b6ce5..50ee993d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=35&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=35&range=00-01 Stats: 545 lines in 30 files changed: 462 ins; 2 del; 81 mod Patch: https://git.openjdk.java.net/jdk16/pull/35.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/35/head:pull/35 PR: https://git.openjdk.java.net/jdk16/pull/35 From kbarrett at openjdk.java.net Thu Dec 17 14:21:14 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 14:21:14 GMT Subject: [jdk16] RFR: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region [v2] In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 12:56:12 GMT, Thomas Schatzl wrote: >> The second "are" should be an "and", right? Otherwise looks great! > > +1 Drat. Will fix. 
------------- PR: https://git.openjdk.java.net/jdk16/pull/35 From kbarrett at openjdk.java.net Thu Dec 17 14:21:15 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 14:21:15 GMT Subject: [jdk16] Integrated: 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region In-Reply-To: References: Message-ID: <9KJbUF0K-YnEvqXyzdQznygPZa3kwhbbv7DG2rT8LIg=.96c71cc8-67f7-48f6-b6d0-82a03aa603bf@github.com> On Wed, 16 Dec 2020 14:05:13 GMT, Kim Barrett wrote: > Please review this change to ParallelGC oldgen allocation, adding a missing > memory barrier. > > The problem arises in the interaction between concurrent oldgen allocations, > where each would, if done serially (in either order), require expansion of > the generation. > > An allocation of size N compares the mutable space's (end - top) with N to > determine if space is available. If available, use top as the start of the > object of size N (adjusting top atomically) and assert the resulting memory > region is in the covered area. If not, then expand. > > Expansion updates the covered region, then updates the space (i.e. end). > There is currently no memory barrier between those operations. > > As a result, we can have thread1 having done an expansion, updating the > covered region and the space end. Because there's no memory barrier there, > the space end may be updated before the covered region as far as some other > thread is concerned. > > Meanwhile thread2's allocation reads the new end and goes ahead with the > allocation (which would not have fit with the old end value), then fails the > covered region check because it used the old covered range. Although the > reads of end and the covered range are ordered here by the intervening CAS > of top, that doesn't help if the writes by thread1 are not also properly > ordered. 
> > There is even a comment about this in PSOldGen::post_resize(), saying the > space update must be last (including after the covered region update). But > without a memory barrier, there's nothing other than source order to ensure > that ordering. So add a memory barrier. > > I'm not sure whether this out-of-order update of the space end could lead to > problems in a product build (where the assert doesn't apply). Without > looking carefully, there appear to be opportunities for problems, such as > accessing uncovered parts of the card table. > > There's another issue that I'm not addressing with this change. Various > values are being read while subject to concurrent writes, without being in > any way tagged as atomic. (The writes are under the ExpandHeap_lock, the > reads are not.) This includes at least the covering region bounds and space > end. > > Testing: > mach5 tier1 > I was unable to reproduce the failure, so can't show any before / after > improvement. This pull request has now been integrated. Changeset: 61390d8e Author: Kim Barrett URL: https://git.openjdk.java.net/jdk16/commit/61390d8e Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod 8257999: Parallel GC crash in gc/parallel/TestDynShrinkHeap.java: new region is not in covered_region Reviewed-by: sjohanss, tschatzl ------------- PR: https://git.openjdk.java.net/jdk16/pull/35 From sjohanss at openjdk.java.net Thu Dec 17 14:44:05 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 14:44:05 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 Message-ID: Please review this fix to avoid the regression in DaCapo-lusearch-large. **Summary** Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). 
We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quickly, and doing it a bit more lazily has more than one positive effect.

The proposed change does three things:
* first we delay the first uncommit after the GC shrinking the heap by 100ms,
* after that each new invocation of the uncommit task will be delayed by 10ms (instead of running back to back)
* the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call.

Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS.

We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report.

**Testing**
Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. The regression has been removed and we instead see a small improvement, due to being able to reuse memory instead of uncommitting and then committing it again. 
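As a rough sketch only, the scheduling policy described above amounts to a longer initial delay after the GC, a short backoff between runs, and a per-run size cap. The constant names below are made up for illustration; the real values live in the G1 sources changed by this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative constants mirroring the policy described above.
const std::int64_t kInitialDelayMs = 100;  // first run after a GC shrinks the heap
const std::int64_t kBackoffDelayMs = 10;   // between subsequent task invocations
const std::size_t  kUncommitChunk  = 128 * std::size_t(1024) * 1024;  // 128m cap

// Delay before the next run of the uncommit task.
std::int64_t next_delay_ms(bool first_invocation_after_gc) {
  return first_invocation_after_gc ? kInitialDelayMs : kBackoffDelayMs;
}

// Number of task invocations needed to hand `bytes` back to the OS when
// each invocation uncommits at most one chunk (ceiling division).
std::size_t invocations_needed(std::size_t bytes) {
  return (bytes + kUncommitChunk - 1) / kUncommitChunk;
}
```

Spreading the work out like this is what gives the application a window to reclaim shrunk regions before they are actually returned to the OS.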
------------- Commit messages: - 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 Changes: https://git.openjdk.java.net/jdk16/pull/42/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=42&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8257974 Stats: 10 lines in 2 files changed: 4 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk16/pull/42.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/42/head:pull/42 PR: https://git.openjdk.java.net/jdk16/pull/42 From ayang at openjdk.java.net Thu Dec 17 14:53:56 2020 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 17 Dec 2020 14:53:56 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:35:51 GMT, Stefan Johansson wrote: > Please review this fix to avoid the regression in DaCapo-lusearch-large. > > **Summary** > Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quick and doing it a bit more lazy have more than one positive effect. 
> > The proposed change does three things: > * first we delay the first uncommit after the GC shrinking the heap by 100ms, > * after that each new invocation of the uncommit task will be delayed by 10ms (instead of run back to back) > * the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call. > > Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS. > > We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report. > > **Testing** > Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. Regression has been removed and in we instead see a small improvement, due to being able to reuse memory instead of uncommiting and then committing it again. Marked as reviewed by ayang (Author). ------------- PR: https://git.openjdk.java.net/jdk16/pull/42 From redestad at openjdk.java.net Thu Dec 17 14:53:57 2020 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 17 Dec 2020 14:53:57 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:35:51 GMT, Stefan Johansson wrote: > Please review this fix to avoid the regression in DaCapo-lusearch-large. > > **Summary** > Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). 
We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quick and doing it a bit more lazy have more than one positive effect. > > The proposed change does three things: > * first we delay the first uncommit after the GC shrinking the heap by 100ms, > * after that each new invocation of the uncommit task will be delayed by 10ms (instead of run back to back) > * the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call. > > Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS. > > We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report. > > **Testing** > Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. Regression has been removed and in we instead see a small improvement, due to being able to reuse memory instead of uncommiting and then committing it again. LGTM - seems like a reasonable approach. I'm sure an argument can be made for being able to control the delays (I can picture situations where tuning in either direction might make sense), but I agree with fixing this regression by making these delays constant for now. 
src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp line 62: > 60: G1UncommitRegionTask* uncommit_task = instance(); > 61: if (!uncommit_task->is_active()) { > 62: // Change state to active and schedule with no delay. Comment needs an update ------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/42 From tschatzl at openjdk.java.net Thu Dec 17 15:01:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 17 Dec 2020 15:01:58 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:35:51 GMT, Stefan Johansson wrote: > Please review this fix to avoid the regression in DaCapo-lusearch-large. > > **Summary** > Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quick and doing it a bit more lazy have more than one positive effect. > > The proposed change does three things: > * first we delay the first uncommit after the GC shrinking the heap by 100ms, > * after that each new invocation of the uncommit task will be delayed by 10ms (instead of run back to back) > * the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call. 
> > Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS. > > We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report. > > **Testing** > Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. Regression has been removed and in we instead see a small improvement, due to being able to reuse memory instead of uncommiting and then committing it again. Lgtm sans the comment update. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk16/pull/42 From tschatzl at openjdk.java.net Thu Dec 17 15:11:58 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 17 Dec 2020 15:11:58 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue In-Reply-To: References: Message-ID: <_GqCHUyJnKu2xksb3tGl6TrVMgTStu-VSIvW1GmIxEE=.e41dcab2-385b-453f-b0c5-3822defeafe1@github.com> On Thu, 17 Dec 2020 13:17:11 GMT, Kim Barrett wrote: > Please review this change to the PtrQueue hierarchy, moving the support for > "active" queues and qset to SATBMarkQueue[Set], which is the only user of > this feature. Other classes derived from PtrQueue[Set] currently work > around or ignore this feature. This change removes it from consideration > entirely for those other classes. > > Testing: > mach5 tier1 > local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC > > In the process of doing this refactoring I noticed that some of the > vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, > and the G1 parts are incomplete. Filed JDK-8258581 to address that. Apart from that request with the comment, lgtm. 
src/hotspot/share/gc/shared/satbMarkQueue.hpp line 106: > 104: size_t _process_completed_buffers_threshold; > 105: size_t _buffer_enqueue_threshold; > 106: // SATB is only active during marking. Enqueuing is not done when inactive. I would prefer to avoid the double-negation in the second sentence. src/hotspot/share/gc/shared/satbMarkQueue.hpp line 155: > 153: // When active, add obj to queue by calling enqueue_known_active. > 154: void enqueue(SATBMarkQueue& queue, oop obj) { > 155: if (queue.is_active()) enqueue_known_active(queue, obj); Maybe something for a different CR and probably pre-existing: `enqueue()` in some contexts is already called with `queue.is_active() == true`. I.e. this check is superfluous, particularly when called for object arrays. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1820 From kbarrett at openjdk.java.net Thu Dec 17 16:50:13 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 16:50:13 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue [v2] In-Reply-To: References: Message-ID: > Please review this change to the PtrQueue hierarchy, moving the support for > "active" queues and qset to SATBMarkQueue[Set], which is the only user of > this feature. Other classes derived from PtrQueue[Set] currently work > around or ignore this feature. This change removes it from consideration > entirely for those other classes. > > Testing: > mach5 tier1 > local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC > > In the process of doing this refactoring I noticed that some of the > vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, > and the G1 parts are incomplete. Filed JDK-8258581 to address that. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: tschatzl review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1820/files - new: https://git.openjdk.java.net/jdk/pull/1820/files/137b3b39..dccbeb46 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1820&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1820&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1820.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1820/head:pull/1820 PR: https://git.openjdk.java.net/jdk/pull/1820 From kbarrett at openjdk.java.net Thu Dec 17 16:50:15 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 17 Dec 2020 16:50:15 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue [v2] In-Reply-To: <_GqCHUyJnKu2xksb3tGl6TrVMgTStu-VSIvW1GmIxEE=.e41dcab2-385b-453f-b0c5-3822defeafe1@github.com> References: <_GqCHUyJnKu2xksb3tGl6TrVMgTStu-VSIvW1GmIxEE=.e41dcab2-385b-453f-b0c5-3822defeafe1@github.com> Message-ID: <6hoa5In9m_uPR--fb7CxqmHO4fNToOXaiue1EZlhYpE=.826cafda-63d6-4ce6-a41d-5a5e27c078a9@github.com> On Thu, 17 Dec 2020 15:07:44 GMT, Thomas Schatzl wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> tschatzl review > > src/hotspot/share/gc/shared/satbMarkQueue.hpp line 155: > >> 153: // When active, add obj to queue by calling enqueue_known_active. >> 154: void enqueue(SATBMarkQueue& queue, oop obj) { >> 155: if (queue.is_active()) enqueue_known_active(queue, obj); > > Maybe something for a different CR and probably pre-existing: `enqueue()` in some contexts is already called with `queue.is_active() == true`. I.e. this check is superfluous sometimes, particularly when called for object arrays. I was planning to look at that later. 
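For readers unfamiliar with the pattern under discussion, here is a toy, self-contained rendition of the active-flag filtering that this change confines to SATBMarkQueue[Set]. It uses plain C++ with placeholder types, not the actual HotSpot classes.

```cpp
#include <cassert>

// Placeholder stand-in for SATBMarkQueue; in HotSpot the flag is only
// set while concurrent marking is in progress.
struct ToySATBQueue {
  bool active = false;
  int  enqueued = 0;
};

// The unconditional path, used when the caller already knows the queue
// is active (cf. JDK-8258607 on eliding the redundant check).
void enqueue_known_active(ToySATBQueue& q, const void* /*obj*/) {
  ++q.enqueued;  // real code buffers the oop for the marking threads
}

// Public entry point: a no-op unless marking has activated the queue.
void enqueue(ToySATBQueue& q, const void* obj) {
  if (q.active) enqueue_known_active(q, obj);
}
```

Since SATB queues are the only users of this filtering, moving it out of the shared PtrQueue base lets the other queue types drop their workarounds.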
I filed a reminder: https://bugs.openjdk.java.net/browse/JDK-8258607 > src/hotspot/share/gc/shared/satbMarkQueue.hpp line 106: > >> 104: size_t _process_completed_buffers_threshold; >> 105: size_t _buffer_enqueue_threshold; >> 106: // SATB is only active during marking. Enqueuing is not done when inactive. > > I would prefer to avoid the double-negation in the second sentence. Sure. Changing it to // SATB is only active during marking. Enqueuing is only done when active. ------------- PR: https://git.openjdk.java.net/jdk/pull/1820 From sjohanss at openjdk.java.net Thu Dec 17 17:10:11 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 17:10:11 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 [v2] In-Reply-To: References: Message-ID: <6odU3w94sBuG7ILOpcYBKjRe0Iwd97pPuqy5gpZrPzc=.f7c293a3-a1c7-40d9-94eb-4a2fe5415ed0@github.com> > Please review this fix to avoid the regression in DaCapo-lusearch-large. > > **Summary** > Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quick and doing it a bit more lazy have more than one positive effect. 
> > The proposed change does three things: > * first we delay the first uncommit after the GC shrinking the heap by 100ms, > * after that each new invocation of the uncommit task will be delayed by 10ms (instead of run back to back) > * the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call. > > Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS. > > We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report. > > **Testing** > Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. Regression has been removed and in we instead see a small improvement, due to being able to reuse memory instead of uncommiting and then committing it again. 
Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: claes review ------------- Changes: - all: https://git.openjdk.java.net/jdk16/pull/42/files - new: https://git.openjdk.java.net/jdk16/pull/42/files/2d5c88b4..bf6c654c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=42&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=42&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk16/pull/42.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/42/head:pull/42 PR: https://git.openjdk.java.net/jdk16/pull/42 From sjohanss at openjdk.java.net Thu Dec 17 17:10:12 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 17:10:12 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 [v2] In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:47:22 GMT, Claes Redestad wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> claes review > > src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp line 62: > >> 60: G1UncommitRegionTask* uncommit_task = instance(); >> 61: if (!uncommit_task->is_active()) { >> 62: // Change state to active and schedule with no delay. > > Comment needs an update Nice catch! ------------- PR: https://git.openjdk.java.net/jdk16/pull/42 From sjohanss at openjdk.java.net Thu Dec 17 17:18:02 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 17:18:02 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 [v2] In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:51:39 GMT, Claes Redestad wrote: > LGTM - seems like a reasonable approach. 
I'm sure an argument can be made for being able to control the delays (I can picture situations where tuning in either direction might make sense), but I agree with fixing this regression by making these delays constant for now. Thanks for the review. Yes, we might end up adding flags for these delays in a later release. Or we might tune the values for the constants if we find something that seems to work better. As long as we don't see this causing big problems, I would prefer if we can have well-balanced constants, instead of adding flags that don't add much value. ------------- PR: https://git.openjdk.java.net/jdk16/pull/42 From sjohanss at openjdk.java.net Thu Dec 17 20:05:58 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Dec 2020 20:05:58 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue [v2] In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 16:50:13 GMT, Kim Barrett wrote: >> Please review this change to the PtrQueue hierarchy, moving the support for >> "active" queues and qset to SATBMarkQueue[Set], which is the only user of >> this feature. Other classes derived from PtrQueue[Set] currently work >> around or ignore this feature. This change removes it from consideration >> entirely for those other classes. >> >> Testing: >> mach5 tier1 >> local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC >> >> In the process of doing this refactoring I noticed that some of the >> vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, >> and the G1 parts are incomplete. Filed JDK-8258581 to address that. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > tschatzl review Nice cleanup, looks good. ------------- Marked as reviewed by sjohanss (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/1820 From github.com+13173904+lhtin at openjdk.java.net Fri Dec 18 02:52:56 2020 From: github.com+13173904+lhtin at openjdk.java.net (Lehua Ding) Date: Fri, 18 Dec 2020 02:52:56 GMT Subject: RFR: 8258534: Epsilon: clean up unused includes In-Reply-To: References: <5Jk5jtIjF2jeeLlivaEK3dAZBDK2yn-EopO1vfL3Y5Q=.36b85fd5-4985-4be3-b61e-f16c4ebd810c@github.com> Message-ID: On Thu, 17 Dec 2020 12:36:14 GMT, Aleksey Shipilev wrote: > This looks like a cleanup and not very time-pressing, right? I'll take a look after NY holidays. Yes, just a small change. Take your time. ------------- PR: https://git.openjdk.java.net/jdk/pull/1745 From sjohanss at openjdk.java.net Fri Dec 18 08:18:58 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 18 Dec 2020 08:18:58 GMT Subject: [jdk16] RFR: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 [v2] In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:51:39 GMT, Claes Redestad wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> claes review > > LGTM - seems like a reasonable approach. I'm sure an argument can be made for being able to control the delays (I can picture situations where tuning in either direction might make sense), but I agree with fixing this regression by making these delays constant for now. Thanks @cl4es, @albertnetymk and @tschatzl for the reviews. ------------- PR: https://git.openjdk.java.net/jdk16/pull/42 From sjohanss at openjdk.java.net Fri Dec 18 08:18:59 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 18 Dec 2020 08:18:59 GMT Subject: [jdk16] Integrated: 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 14:35:51 GMT, Stefan Johansson wrote: > Please review this fix to avoid the regression in DaCapo-lusearch-large. 
> > **Summary** > Doing uncommit concurrently with this benchmark was the cause of this regression. Using JFR we can see a lot of the Java threads are blocking on a monitor when using the build with concurrent uncommit (but not otherwise). We know that the uncommit will affect the application threads and if a thread holding a lock gets stalled this will affect the overall performance more than expected. To avoid stalling the application threads more than necessary we can be a bit less aggressive when doing the uncommit. The initial version of concurrent uncommit tries to return the memory as quickly as possible after the GC, but there is no contract that we need to return the memory that quickly, and doing it a bit more lazily has more than one positive effect. > > The proposed change does three things: > * first we delay the first uncommit after the GC shrinking the heap by 100ms, > * after that each new invocation of the uncommit task will be delayed by 10ms (instead of running back to back) > * the size of each uncommit is changed from 256m to 128m, this will also lower the impact of each uncommit call. > > Doing these things will make the uncommit take a longer time and for applications where the uncommitted memory is needed quite quickly again, this will have the nice effect that instead of uncommitting/committing, we can just reuse the memory without involving the OS. > > We have also done some experiments using `madvise` to do the uncommitting, but this doesn't fix the whole regression and needs some more investigation. A bit more information around this can be found in the bug report. > > **Testing** > Tier1-3 in Mach5 and a lot of manual performance testing both locally and in our internal systems. The regression has been removed and we instead see a small improvement, due to being able to reuse memory instead of uncommitting and then committing it again. This pull request has now been integrated.
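The lazier scheduling described in the summary above can be sketched as follows. This is an illustrative sketch with hypothetical names, not the actual G1 uncommit task code; only the constants (100ms initial delay, 10ms between invocations, 128m per chunk) come from the change.

```java
// Sketch of the delayed uncommit scheduling policy (hypothetical names).
public class UncommitSchedulingSketch {
    static final long INITIAL_DELAY_MS = 100;   // delay after the GC shrinks the heap
    static final long TASK_DELAY_MS = 10;       // delay between task invocations
    static final long CHUNK_BYTES = 128L << 20; // uncommit at most 128m per invocation

    // Rough lower bound on how long uncommitting `bytes` takes, counting only
    // the scheduling delays (the actual uncommit work adds to this).
    static long minUncommitDelayMs(long bytes) {
        long chunks = (bytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
        if (chunks == 0) {
            return 0;
        }
        return INITIAL_DELAY_MS + (chunks - 1) * TASK_DELAY_MS;
    }

    public static void main(String[] args) {
        // Returning 1g of heap spreads over at least 100 + 7*10 = 170ms,
        // a window in which the application may simply reuse the memory.
        System.out.println(minUncommitDelayMs(1L << 30));
    }
}
```

The point of the sketch is that memory is no longer returned eagerly right after the GC: an application that re-expands the heap within this window avoids the uncommit/commit round trip entirely.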
Changeset: 38593a4f Author: Stefan Johansson URL: https://git.openjdk.java.net/jdk16/commit/38593a4f Stats: 11 lines in 2 files changed: 4 ins; 1 del; 6 mod 8257974: Regression 21% in DaCapo-lusearch-large after JDK-8236926 Reviewed-by: ayang, redestad, tschatzl ------------- PR: https://git.openjdk.java.net/jdk16/pull/42 From rriggs at openjdk.java.net Fri Dec 18 14:42:30 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 18 Dec 2020 14:42:30 GMT Subject: RFR: 8230523: Remove abortOnException diagnostic option from TestHumongousNonArrayAllocation.java Message-ID: Test cleanup to remove a command line option used to gather more information about a previous failure. The option is no longer needed. The original test issue was: 8249217: Unexpected StackOverflowError in "process reaper" thread still happens ------------- Commit messages: - 8230523: Remove abortOnException diagnostic option from TestHumongousNonArrayAllocation.java Changes: https://git.openjdk.java.net/jdk/pull/1841/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1841&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8230523 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/1841.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1841/head:pull/1841 PR: https://git.openjdk.java.net/jdk/pull/1841 From kbarrett at openjdk.java.net Fri Dec 18 14:56:23 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 18 Dec 2020 14:56:23 GMT Subject: RFR: 8230523: Remove abortOnException diagnostic option from TestHumongousNonArrayAllocation.java In-Reply-To: References: Message-ID: On Fri, 18 Dec 2020 14:38:02 GMT, Roger Riggs wrote: > Test cleanup to remove a command line option used to gather more information about a previous failure. > The option is no longer needed. > > The original test issue was: 8249217: Unexpected StackOverflowError in "process reaper" thread still happens Looks good, and trivial.
------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1841 From kbarrett at openjdk.java.net Fri Dec 18 14:58:27 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 18 Dec 2020 14:58:27 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue [v2] In-Reply-To: <_GqCHUyJnKu2xksb3tGl6TrVMgTStu-VSIvW1GmIxEE=.e41dcab2-385b-453f-b0c5-3822defeafe1@github.com> References: <_GqCHUyJnKu2xksb3tGl6TrVMgTStu-VSIvW1GmIxEE=.e41dcab2-385b-453f-b0c5-3822defeafe1@github.com> Message-ID: On Thu, 17 Dec 2020 15:09:04 GMT, Thomas Schatzl wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> tschatzl review > > Apart from that request with the comment, lgtm. Thanks @tschatzl and @kstefanj for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/1820 From kbarrett at openjdk.java.net Fri Dec 18 15:13:43 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 18 Dec 2020 15:13:43 GMT Subject: RFR: 8258255: Move PtrQueue active flag to SATBMarkQueue [v3] In-Reply-To: References: Message-ID: > Please review this change to the PtrQueue hierarchy, moving the support for > "active" queues and qset to SATBMarkQueue[Set], which is the only user of > this feature. Other classes derived from PtrQueue[Set] currently work > around or ignore this feature. This change removes it from consideration > entirely for those other classes. > > Testing: > mach5 tier1 > local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC > > In the process of doing this refactoring I noticed that some of the > vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, > and the G1 parts are incomplete. Filed JDK-8258581 to address that. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into move_active - tschatzl review - move_active ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1820/files - new: https://git.openjdk.java.net/jdk/pull/1820/files/dccbeb46..96fec70f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1820&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1820&range=01-02 Stats: 1835 lines in 122 files changed: 1211 ins; 253 del; 371 mod Patch: https://git.openjdk.java.net/jdk/pull/1820.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1820/head:pull/1820 PR: https://git.openjdk.java.net/jdk/pull/1820 From kbarrett at openjdk.java.net Fri Dec 18 15:13:44 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 18 Dec 2020 15:13:44 GMT Subject: Integrated: 8258255: Move PtrQueue active flag to SATBMarkQueue In-Reply-To: References: Message-ID: On Thu, 17 Dec 2020 13:17:11 GMT, Kim Barrett wrote: > Please review this change to the PtrQueue hierarchy, moving the support for > "active" queues and qset to SATBMarkQueue[Set], which is the only user of > this feature. Other classes derived from PtrQueue[Set] currently work > around or ignore this feature. This change removes it from consideration > entirely for those other classes. > > Testing: > mach5 tier1 > local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC > > In the process of doing this refactoring I noticed that some of the > vmStructs support around PtrQueue[Set] didn't get moved from G1 to shared, > and the G1 parts are incomplete. Filed JDK-8258581 to address that. This pull request has now been integrated. 
Changeset: 00d80fdd Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/00d80fdd Stats: 78 lines in 8 files changed: 22 ins; 38 del; 18 mod 8258255: Move PtrQueue active flag to SATBMarkQueue Reviewed-by: tschatzl, sjohanss ------------- PR: https://git.openjdk.java.net/jdk/pull/1820 From tschatzl at openjdk.java.net Fri Dec 18 15:40:25 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 18 Dec 2020 15:40:25 GMT Subject: RFR: 8258481: gc.g1.plab.TestPLABPromotion fails on Linux x86 Message-ID: Hi all, can I have reviews for this test bug fix on x86 (but it did not do the correct thing on 64 bit platforms either)? There are some test cases that allocate byte arrays of ~3500 bytes and expect that almost all allocations occur in the PLABs given a PLAB waste threshold of some percentage, in this case 20%. On x64 this is good, as the PLAB size is 4096 *words*, i.e. 32kb, and 20% of that is ~6.5kb. So all objects are allocated in PLABs as expected. On x86 the PLAB size of 4096 words is only 16kb, and 20% of that is ~3.2kb. This threshold is less than these 3500 bytes, so the test fails. It does not always fail (but very often) because of the broken calculation for meeting the threshold: unless really *all* objects copied are of that 3500 byte size (and hence directly allocated), the current checking using integer calculation results in 0% waste, which is below the expected 20%. The suggested fix is to lower the size of that array to 3250 bytes, which meets the criteria on both 32 and 64 bit platforms (and fixes the broken calculations). Note that we should not change this array size much lower, because there is another test that fails otherwise.
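The size arithmetic above, and the integer-division pitfall in the waste-ratio check, can be made concrete with a small sketch (illustrative names, not the actual test code):

```java
// Sketch of the word-size-dependent PLAB threshold and the integer-math bug.
public class PlabThresholdSketch {
    static final int PLAB_WORDS = 4096;    // PLAB size in words
    static final int THRESHOLD_PCT = 20;   // PLAB waste threshold

    // Largest object (in bytes) still allocated in the PLAB under the threshold.
    static long directAllocationThreshold(int wordSizeBytes) {
        return (long) PLAB_WORDS * wordSizeBytes * THRESHOLD_PCT / 100;
    }

    public static void main(String[] args) {
        // x64: 8-byte words -> 32768-byte PLAB, threshold ~6.5kb.
        // x86: 4-byte words -> 16384-byte PLAB, threshold ~3.2kb.
        System.out.println(directAllocationThreshold(8)); // 6553
        System.out.println(directAllocationThreshold(4)); // 3276
        // 3500-byte arrays exceed the x86 threshold; 3250-byte arrays fit both.

        // The broken ratio check: integer division truncates to 0%.
        long waste = 3500, total = 32768;
        System.out.println(waste / total * 100);     // 0 -- the broken result
        System.out.println((waste * 100.0) / total); // ~10.68 -- the intended ratio
    }
}
```

Multiplying by 100.0 before dividing keeps the fractional part, so a non-zero waste ratio is no longer silently rounded down to 0%.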
Testing: 100 successful test runs on x86 and x64 linux each Thanks, Thomas ------------- Commit messages: - Initial commit Changes: https://git.openjdk.java.net/jdk/pull/1842/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1842&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258481 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1842.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1842/head:pull/1842 PR: https://git.openjdk.java.net/jdk/pull/1842 From sjohanss at openjdk.java.net Fri Dec 18 16:28:53 2020 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 18 Dec 2020 16:28:53 GMT Subject: RFR: 8258481: gc.g1.plab.TestPLABPromotion fails on Linux x86 In-Reply-To: References: Message-ID: On Fri, 18 Dec 2020 15:17:26 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this test bug fix on x86 (but it did not do the correct thing on 64 bit platforms either)? > > There are sone test cases that allocates byte arrays of ~3500 bytes, and expect that almost all allocations occur in the PLABs given a PLAB waste threshold of some percentage, in this case 20%. > > On x64 this is good, as the PLAB size is 4096 *words*, i.e. 32kb, and 20% of that is ~6.5kb. So all objects are allocated in PLABs as expected > > On x86 the PLAB size of 4096 words is only 16kb, and 20% of that is ~3.2kb. This threshold is less than these 3500 bytes, so the test fails. > > It does not fail always (but very often) because of the broken calculation for meeting the threshold: unless really *all* objects copied are of that 3500 byte size (and hence directly allocated), the current checking using integer calculation results in 0% waste, which is below the expected 20%. > > The suggested fix is to lower the size of that array to 3250 bytes, which meets the criteria on both 32 and 64 bit platforms (and fix the broken calculations). 
> > Note that we should not change this array size to much lower, because there is another test that fails otherwise. > > Testing: > 100 successful test runs on x86 and x64 linux each > > Thanks, > Thomas Looks good. ------------- Marked as reviewed by sjohanss (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1842 From rriggs at openjdk.java.net Fri Dec 18 16:36:56 2020 From: rriggs at openjdk.java.net (Roger Riggs) Date: Fri, 18 Dec 2020 16:36:56 GMT Subject: Integrated: 8250523: Remove abortOnException diagnostic option from TestHumongousNonArrayAllocation.java In-Reply-To: References: Message-ID: On Fri, 18 Dec 2020 14:38:02 GMT, Roger Riggs wrote: > Test cleanup to remove a command line option used to gather more information about a previous failure. > The option is no longer needed. > > The original test issue was: 8249217: Unexpected StackOverflowError in "process reaper" thread still happens This pull request has now been integrated. Changeset: 1dae45d7 Author: Roger Riggs URL: https://git.openjdk.java.net/jdk/commit/1dae45d7 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod 8250523: Remove abortOnException diagnostic option from TestHumongousNonArrayAllocation.java Reviewed-by: kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/1841 From kbarrett at openjdk.java.net Fri Dec 18 17:08:55 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 18 Dec 2020 17:08:55 GMT Subject: RFR: 8258481: gc.g1.plab.TestPLABPromotion fails on Linux x86 In-Reply-To: References: Message-ID: On Fri, 18 Dec 2020 15:17:26 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this test bug fix on x86 (but it did not do the correct thing on 64 bit platforms either)? > > There are sone test cases that allocates byte arrays of ~3500 bytes, and expect that almost all allocations occur in the PLABs given a PLAB waste threshold of some percentage, in this case 20%. > > On x64 this is good, as the PLAB size is 4096 *words*, i.e. 
32kb, and 20% of that is ~6.5kb. So all objects are allocated in PLABs as expected > > On x86 the PLAB size of 4096 words is only 16kb, and 20% of that is ~3.2kb. This threshold is less than these 3500 bytes, so the test fails. > > It does not fail always (but very often) because of the broken calculation for meeting the threshold: unless really *all* objects copied are of that 3500 byte size (and hence directly allocated), the current checking using integer calculation results in 0% waste, which is below the expected 20%. > > The suggested fix is to lower the size of that array to 3250 bytes, which meets the criteria on both 32 and 64 bit platforms (and fix the broken calculations). > > Note that we should not change this array size to much lower, because there is another test that fails otherwise. > > Testing: > 100 successful test runs on x86 and x64 linux each > > Thanks, > Thomas test/hotspot/jtreg/gc/g1/plab/TestPLABPromotion.java line 225: > 223: */ > 224: private static boolean checkRatio(long checkedValue, long controlValue) { > 225: return ((double)Math.abs(checkedValue) / controlValue) * 100L < MEM_DIFFERENCE_PCT; Rather than casting to double (I'm not fond of casting in Java either), why not `(Math.abs(checkedValue) * 100.0) / controlValue < MEM_DIFFERENCE_PCT` Similarly for checkDifferenceRatio below. ------------- PR: https://git.openjdk.java.net/jdk/pull/1842 From kbarrett at openjdk.java.net Sun Dec 20 10:09:03 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 20 Dec 2020 10:09:03 GMT Subject: RFR: 8258254: Move PtrQueue flush to PtrQueueSet subclasses Message-ID: <-QQP60eAp1I0kSia-QngOIk1qZ09fGriqo2A2866xv4=.60d6cb4d-1a13-4443-8cb6-389568113c9c@github.com> Please review this change to the PtrQueue hierarchy, changing queue flushing from an intrinsic operation of the queue to an operation the qset performs on a queue. This is a piece of the refactoring being done under JDK-8258251, separated out for easier review. 
This change also removes a couple of no longer used internal helper functions from PtrQueue. Testing: mach5 tier1 local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC ------------- Commit messages: - move flush Changes: https://git.openjdk.java.net/jdk/pull/1851/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1851&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258254 Stats: 135 lines in 12 files changed: 55 ins; 69 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/1851.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1851/head:pull/1851 PR: https://git.openjdk.java.net/jdk/pull/1851 From ghueller at outlook.com Sun Dec 20 19:18:44 2020 From: ghueller at outlook.com (Gerhard Hueller) Date: Sun, 20 Dec 2020 19:18:44 +0000 Subject: State of "simplified barriers" for G1 Message-ID: Hi, I remember a slide deck talking about the improvements to G1 since JDK8/9, and one bullet point on the todo-list was simplified barriers for G1. I wonder what happened to this improvement, has it already been implemented? Is this the non-concurrent refinement option implemented by Google some time ago? Improvements in this area would be really great; CMS still provides better throughput for most workloads, with the only real advantage G1 does offer being the avoidance of those degenerate STW full GCs. Thanks, Gerhard From tschatzl at openjdk.java.net Sun Dec 20 21:22:16 2020 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Sun, 20 Dec 2020 21:22:16 GMT Subject: RFR: 8258481: gc.g1.plab.TestPLABPromotion fails on Linux x86 [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this test bug fix on x86 (but it did not do the correct thing on 64 bit platforms either)? > > There are some test cases that allocate byte arrays of ~3500 bytes and expect that almost all allocations occur in the PLABs given a PLAB waste threshold of some percentage, in this case 20%.
> > On x64 this is good, as the PLAB size is 4096 *words*, i.e. 32kb, and 20% of that is ~6.5kb. So all objects are allocated in PLABs as expected > > On x86 the PLAB size of 4096 words is only 16kb, and 20% of that is ~3.2kb. This threshold is less than these 3500 bytes, so the test fails. > > It does not fail always (but very often) because of the broken calculation for meeting the threshold: unless really *all* objects copied are of that 3500 byte size (and hence directly allocated), the current checking using integer calculation results in 0% waste, which is below the expected 20%. > > The suggested fix is to lower the size of that array to 3250 bytes, which meets the criteria on both 32 and 64 bit platforms (and fix the broken calculations). > > Note that we should not change this array size to much lower, because there is another test that fails otherwise. > > Testing: > 100 successful test runs on x86 and x64 linux each > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: kbarrett review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1842/files - new: https://git.openjdk.java.net/jdk/pull/1842/files/b328d337..3a6e03cf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1842&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1842&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1842.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1842/head:pull/1842 PR: https://git.openjdk.java.net/jdk/pull/1842 From kbarrett at openjdk.java.net Sun Dec 20 22:16:56 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 20 Dec 2020 22:16:56 GMT Subject: RFR: 8258481: gc.g1.plab.TestPLABPromotion fails on Linux x86 [v2] In-Reply-To: References: Message-ID: On Sun, 20 Dec 2020 21:22:16 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this test bug fix on x86 
(but it did not do the correct thing on 64 bit platforms either)? >> >> There are sone test cases that allocates byte arrays of ~3500 bytes, and expect that almost all allocations occur in the PLABs given a PLAB waste threshold of some percentage, in this case 20%. >> >> On x64 this is good, as the PLAB size is 4096 *words*, i.e. 32kb, and 20% of that is ~6.5kb. So all objects are allocated in PLABs as expected >> >> On x86 the PLAB size of 4096 words is only 16kb, and 20% of that is ~3.2kb. This threshold is less than these 3500 bytes, so the test fails. >> >> It does not fail always (but very often) because of the broken calculation for meeting the threshold: unless really *all* objects copied are of that 3500 byte size (and hence directly allocated), the current checking using integer calculation results in 0% waste, which is below the expected 20%. >> >> The suggested fix is to lower the size of that array to 3250 bytes, which meets the criteria on both 32 and 64 bit platforms (and fix the broken calculations). >> >> Note that we should not change this array size to much lower, because there is another test that fails otherwise. >> >> Testing: >> 100 successful test runs on x86 and x64 linux each >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > kbarrett review Marked as reviewed by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/1842 From rkennke at openjdk.java.net Mon Dec 21 11:42:01 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 21 Dec 2020 11:42:01 GMT Subject: RFR: 8258714: Shenandoah: Process references before evacuation during degen Message-ID: Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. 
However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. (Note that we already prevent recycle-assist in concurrent phase, except JDK-8258706) The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. Testing: 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before Ok? ------------- Commit messages: - 8258714: Shenandoah: Process references before evacuation during degen Changes: https://git.openjdk.java.net/jdk/pull/1859/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1859&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258714 Stats: 20 lines in 4 files changed: 13 ins; 7 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/1859.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1859/head:pull/1859 PR: https://git.openjdk.java.net/jdk/pull/1859 From shade at openjdk.java.net Mon Dec 21 12:01:55 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 21 Dec 2020 12:01:55 GMT Subject: RFR: 8258714: Shenandoah: Process references before evacuation during degen In-Reply-To: References: Message-ID: On Mon, 21 Dec 2020 11:37:31 GMT, Roman Kennke wrote: > Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. 
It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. > (Note that we already prevent recycle-assist in concurrent phase) > > The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. > > Testing: 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before > > Ok? This looks good, but I think this PR should be against `openjdk/jdk16` to get it fixed in JDK 16 (where I think the bug is). ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/1859 From rkennke at openjdk.java.net Mon Dec 21 12:04:54 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 21 Dec 2020 12:04:54 GMT Subject: Withdrawn: 8258714: Shenandoah: Process references before evacuation during degen In-Reply-To: References: Message-ID: <5Anpd927uGEPn52Cl1C3zFW8Ar4jwV1_ga7lx2eiuRI=.18038174-1f75-4c61-bb31-002ae2198bed@github.com> On Mon, 21 Dec 2020 11:37:31 GMT, Roman Kennke wrote: > Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. > (Note that we already prevent recycle-assist in concurrent phase) > > The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. > > Testing: 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before > > Ok? 
This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/1859 From rkennke at openjdk.java.net Mon Dec 21 12:13:08 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 21 Dec 2020 12:13:08 GMT Subject: [jdk16] RFR: 8258714: Shenandoah: Process references before evacuation during degen Message-ID: Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. (Note that we already prevent recycle-assist in concurrent phase) The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. Testing: - [x] 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before Ok? 
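The ordering invariant described in the RFR above can be sketched in a few lines. This is illustrative only, not HotSpot code; the phase names are hypothetical, and the point is simply that reference processing must come before the earliest phase that can recycle immediate garbage.

```java
import java.util.List;

// Sketch of the degen-cycle phase-ordering invariant (hypothetical names).
public class DegenOrderSketch {
    enum Phase { MARK, PROCESS_WEAK_REFS, EVACUATE, UPDATE_REFS, CLEANUP }

    // The old ordering processed references only just before CLEANUP; the fix
    // moves them before EVACUATE, the earliest point at which immediate
    // garbage may be recycled (e.g. by recycle-assist).
    static boolean refsBeforeRecycling(List<Phase> order) {
        return order.indexOf(Phase.PROCESS_WEAK_REFS) < order.indexOf(Phase.EVACUATE);
    }

    public static void main(String[] args) {
        List<Phase> fixed = List.of(Phase.MARK, Phase.PROCESS_WEAK_REFS,
                Phase.EVACUATE, Phase.UPDATE_REFS, Phase.CLEANUP);
        System.out.println(refsBeforeRecycling(fixed)); // true
    }
}
```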
------------- Commit messages: - 8258714: Shenandoah: Process references before evacuation during degen Changes: https://git.openjdk.java.net/jdk16/pull/55/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=55&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258714 Stats: 20 lines in 4 files changed: 13 ins; 7 del; 0 mod Patch: https://git.openjdk.java.net/jdk16/pull/55.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/55/head:pull/55 PR: https://git.openjdk.java.net/jdk16/pull/55 From shade at openjdk.java.net Mon Dec 21 12:13:08 2020 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 21 Dec 2020 12:13:08 GMT Subject: [jdk16] RFR: 8258714: Shenandoah: Process references before evacuation during degen In-Reply-To: References: Message-ID: <_YAUxikbT2fNmlwITba9aUt-hTgBhNG0RWQ3qrZOP0E=.6fc721d5-cc05-41cc-b5b5-4f997ceb4d5d@github.com> On Mon, 21 Dec 2020 12:05:23 GMT, Roman Kennke wrote: > Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. > (Note that we already prevent recycle-assist in concurrent phase) > > The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. > > Testing: > - [x] 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before > > Ok? Looks good! ------------- Marked as reviewed by shade (Reviewer). 
PR: https://git.openjdk.java.net/jdk16/pull/55 From rkennke at openjdk.java.net Mon Dec 21 12:32:13 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 21 Dec 2020 12:32:13 GMT Subject: [jdk16] RFR: 8258714: Shenandoah: Process references before evacuation during degen [v2] In-Reply-To: References: Message-ID: > Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. > (Note that we already prevent recycle-assist in concurrent phase) > > The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. > > Testing: > - [x] 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before > > Ok? 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Add new _weakrefs_process phases to ShPhaseTimings::is_worker_phase() ------------- Changes: - all: https://git.openjdk.java.net/jdk16/pull/55/files - new: https://git.openjdk.java.net/jdk16/pull/55/files/fbeb6f77..29b98531 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk16&pr=55&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk16&pr=55&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk16/pull/55.diff Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/55/head:pull/55 PR: https://git.openjdk.java.net/jdk16/pull/55 From rkennke at openjdk.java.net Mon Dec 21 12:46:00 2020 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 21 Dec 2020 12:46:00 GMT Subject: [jdk16] Integrated: 8258714: Shenandoah: Process references before evacuation during degen In-Reply-To: References: Message-ID: <3Omh7nUUlAnuq9UlVgI79K4NOt_-K-rZM_pI61V47GY=.6738ce78-e4df-49f3-af65-6113c69abf08@github.com> On Mon, 21 Dec 2020 12:05:23 GMT, Roman Kennke wrote: > Currently, when doing degen-cycle, we process references right before immediate-garbage cleanup. It is imperative that we process references before any immediate garbage gets recycled, or else we may end up with bad references during reference-processing. However, the trouble is that immediate garbage can be recycled even before cleanup phase by recycle-assist. For this reason, we must process references before any evacuation during degen GC. It is also more natural: we process refs before weak roots and class-unloading during concurrent cycle, and should do the same during degen cycle. > (Note that we already prevent recycle-assist in concurrent phase) > > The change also adds STW timing for the weak-refs-processing, rather than polluting the conc-weak-refs timings. 
> > Testing: > - [x] 30 good runs of hotspot_gc_shenandoah, which showed the crash fairly reliably before > > Ok? This pull request has now been integrated. Changeset: 2525f39d Author: Roman Kennke URL: https://git.openjdk.java.net/jdk16/commit/2525f39d Stats: 22 lines in 5 files changed: 15 ins; 7 del; 0 mod 8258714: Shenandoah: Process references before evacuation during degen Reviewed-by: shade ------------- PR: https://git.openjdk.java.net/jdk16/pull/55 From kbarrett at openjdk.java.net Tue Dec 22 05:10:08 2020 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 22 Dec 2020 05:10:08 GMT Subject: RFR: 8256814: WeakProcessorPhases may be redundant Message-ID: Please review this change which eliminates the WeakProcessorPhase class. The OopStorageSet class is changed to provide scoped enums for the different categories: StrongId, WeakId, and Id (for the union of strong and weak). An accessor is provided for obtaining the storage corresponding to a category value. Various other enumerator ranges, array sizes and indices, and iterations are derived directly from the corresponding OopStorageSet category's enum range. Iteration over a category of enumerators can be done via EnumIterator. The iteration over storage objects makes use of that enum iteration, rather than having a bespoke implementation. Some use-cases need iteration of the enumerators, with storage lookup from the enumerator; other use-cases just need the storage objects. Testing: mach5 tier1-6 Local (linux-x64) hotspot:tier1 with -XX:+UseShenandoahGC ------------- Commit messages: - Remove WeakProcessorPhase, adding scoped enum categories to OopStorageSet. 
Changes: https://git.openjdk.java.net/jdk/pull/1862/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1862&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8256814 Stats: 1034 lines in 25 files changed: 400 ins; 465 del; 169 mod Patch: https://git.openjdk.java.net/jdk/pull/1862.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1862/head:pull/1862 PR: https://git.openjdk.java.net/jdk/pull/1862 From yude.lyd at alibaba-inc.com Tue Dec 22 10:31:02 2020 From: yude.lyd at alibaba-inc.com (Yude Lin) Date: Tue, 22 Dec 2020 18:31:02 +0800 Subject: Re: State of "simplified barriers" for G1 In-Reply-To: References: Message-ID: Hi All, We are also interested in any follow-ups on this topic. If I recall correctly, when this was discussed in JDK-8226197, one of the TODOs was that the storeload fence can be skipped when Conc Refine is turned off. Regarding this, I'd like to share an idea we have been experimenting with over the last couple of months. We took "skipping the fence" a little further and tried to improve the throughput with less harm to pause time. This comes from the observation that many card dirtying operations can go away without concurrent refinement. More specifically, writes that produce a reference OldObj1.foo->OldObj2 need not dirty the card corresponding to OldObj1 during the young-gc-only phase. Currently, with Conc Refine, this operation will dirty that card, and then the card will be refined (thrown away) by the refinement thread, because it discovers that the reference points to an Old region, which is "untracked" during the young-gc-only phase. The refinement thread does this concurrently so that the GC doesn't have to do it during a pause. But we realized that we can use a flag to indicate whether a region is tracked, and discard the card dirtying operation immediately in the barrier (after testing against the flag). We can do it without any atomics/fences, just ~5 instructions in the barrier. 
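A rough sketch of such a filtering barrier, under stated assumptions: the names, region/card sizes, and flag layout below are invented for illustration and are not the actual Alibaba patch or HotSpot's G1 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of the idea. Each heap region carries a
// "tracked" flag; the post-write barrier discards the card-dirtying
// operation when the referenced object's region is untracked (e.g. Old
// regions during the young-gc-only phase). No atomics or fences needed.

constexpr size_t kRegionShift = 20;  // 1 MiB regions (assumed)
constexpr size_t kCardShift   = 9;   // 512-byte cards (assumed)

struct Heap {
  uintptr_t base;           // start of the heap, region-aligned
  uint8_t*  tracked_flags;  // one flag per region
  uint8_t*  card_table;     // one byte per card

  bool region_tracked(uintptr_t addr) const {
    return tracked_flags[(addr - base) >> kRegionShift] != 0;
  }

  // Post-write barrier: field at 'field_addr' now refers to 'new_val'.
  void post_write_barrier(uintptr_t field_addr, uintptr_t new_val) {
    // Skip same-region writes (analogous to G1's cross-region xor test).
    if (((field_addr ^ new_val) >> kRegionShift) == 0) return;
    if (new_val == 0) return;  // null stores need no card
    // The proposed cheap filter: if the referenced region is untracked,
    // drop the dirtying entirely instead of letting refinement throw
    // the card away later.
    if (!region_tracked(new_val)) return;
    card_table[(field_addr - base) >> kCardShift] = 1;  // dirty the card
  }
};
```

In this model, an Old->Old store during the young-gc-only phase hits the untracked-region test and never touches the card table, which is where the fence elimination and the reduced refinement load would come from.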
This way, we get rid of the storeload mem barrier, with Conc Refine turned off, while still getting the same pause time guarantee in the young-gc-only phase. But as you can see, Mixed GCs still suffer from having no concurrent refinement. We saw improvements on Alibaba JDK11u across the benchmarks we used (positive numbers mean better): Dacapo: cases vary from -3.3% to +5.1%, on average +0.3% specjbb2015 on 96x2.50GHz, 16 GC threads, 24g mem: critical-jOPS +1.9%, max-jOPS +2.8% specjbb2015 on 8x2.50GHz, 8 GC threads, 16g mem (observed more Mixed GCs): critical-jOPS +0.1%, max-jOPS +5.7% specjvm2008: cases vary from -0.7% to +23.4%, on average +3.1% Extremem: cases vary from -2.1% to +7.8%, on average +1.0% I'd love to hear any feedback or comments, what problems you can see in this approach, conceptually or practically, and, back to the topic, whether this idea can be incorporated into your future work/plan of creating a simplified barrier. Yude Lin ------------------------------------------------------------------ From: Gerhard Hueller Sent: 2020-12-21 (Mon) 03:19 To: hotspot-gc-dev at openjdk.java.net Subject: State of "simplified barriers" for G1 Hi, I remember a slide deck talking about the improvements to G1 since JDK8/9, and one bullet point on the todo-list was simplified barriers for G1. I wonder what happened to this improvement; has it already been implemented? Is this the non-concurrent refinement option implemented by Google some time ago? Improvements in this area would be really great. CMS still provides better throughput for most workloads, with the only real advantage G1 offers being that it avoids those degenerated STW full GCs. 
Thanks, Gerhard From github.com+16932759+shqking at openjdk.java.net Fri Dec 25 10:32:09 2020 From: github.com+16932759+shqking at openjdk.java.net (Hao Sun) Date: Fri, 25 Dec 2020 10:32:09 GMT Subject: RFR: 8258382: Fix optimization-unstable code involving pointer overflow [v2] In-Reply-To: References: Message-ID: > Optimization-unstable code refers to code that is unexpectedly discarded > by compiler optimizations due to undefined behavior in the program. > > We applied a static checker called STACK (a prototype from the SOSP'13 paper > [1]) to the OpenJDK source code and found the following two sites of > potential unstable code involving pointer overflow. > > Removing the undefined behaviors would make the code stable. > > [1] https://css.csail.mit.edu/stack/ > > -------- > Note that the jtreg tests (tier1 and jdk::tier3) passed locally on Linux x86/aarch64 machines after applying this patch. Hao Sun has updated the pull request incrementally with one additional commit since the last revision: Fix unstable code involving pointer overflow only Move the patch involving signed integer overflow into another PR. In this patch we only fix optimization-unstable code involving pointer overflow. Update the patch based on feedback from upstream. 1) Remove unnecessary comment. 2) Remove unnecessary check between end() and top(). 3) Use pointer_delta() to compute the offset between two addresses. 
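The class of bug being fixed above can be illustrated with a small example. The code and function names below are invented for illustration; they are not the actual HotSpot sites touched by the patch.

```cpp
#include <cassert>
#include <cstddef>

// Pointer arithmetic that wraps past the end of an object is undefined
// behavior, so the optimizer may assume "p + n >= p" always holds and
// silently delete such an "unstable" overflow check.

// Unstable: the second test relies on pointer overflow, which is UB,
// so the compiler is allowed to drop it entirely.
bool fits_unstable(const char* p, const char* end, size_t n) {
  return p + n <= end && p + n >= p;
}

// Stable: compare sizes instead of possibly-overflowed pointers
// (the same spirit as HotSpot's pointer_delta(), which subtracts
// pointers first and compares the resulting offset).
bool fits_stable(const char* p, const char* end, size_t n) {
  return p <= end && n <= static_cast<size_t>(end - p);
}
```

The stable form never forms an out-of-bounds pointer, so no optimization can legally discard the check.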
Change-Id: Icade8e1a4b684081036c85fd2a2b65b5c3b27f54 CustomizedGitHooks: yes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1886/files - new: https://git.openjdk.java.net/jdk/pull/1886/files/ca1fbaee..4fc79491 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1886&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1886&range=00-01 Stats: 12 lines in 5 files changed: 0 ins; 2 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/1886.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1886/head:pull/1886 PR: https://git.openjdk.java.net/jdk/pull/1886 From ysuenaga at openjdk.java.net Thu Dec 31 00:19:09 2020 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 31 Dec 2020 00:19:09 GMT Subject: RFR: 8259009: G1 heap summary should be shown in "Heap Parameters" window on HSDB Message-ID: <0H5ICCOuS2CMe6xjvEKSAi_T3qGsTsamid4pzp7EV18=.b7219f71-9be3-40e0-8605-3d0b7edd9e99@github.com> The G1 heap summary (G1 Heap, plus summaries for each space) is shown on the console even though I chose the "Heap Parameters" menu on HSDB. It should be shown in the "Heap Parameters" window on HSDB. ------------- Commit messages: - G1 heap summary should be shown in "Heap Parameters" window on HSDB Changes: https://git.openjdk.java.net/jdk/pull/1911/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1911&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8259009 Stats: 31 lines in 2 files changed: 14 ins; 1 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/1911.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1911/head:pull/1911 PR: https://git.openjdk.java.net/jdk/pull/1911 From xliu at openjdk.java.net Thu Dec 31 08:17:07 2020 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 31 Dec 2020 08:17:07 GMT Subject: RFR: 8259020: null-check of g1 write_ref_field_pre_entry is not necessary Message-ID: orig is not null, because G1BarrierSetC2 won't invoke write_ref_field_pre_entry if pre_val is NULL. 
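The general cleanup pattern here can be sketched as follows. This is an illustrative example with made-up names (Queue, enqueue_*), not the actual HotSpot diff: when every caller already guarantees a non-null argument, a defensive runtime null-check can be replaced by an assert that documents the invariant.

```cpp
#include <cassert>

// Toy stand-in for a dirty-value queue; only tracks how many entries
// were pushed, which is enough to exercise the pattern.
struct Queue {
  int depth = 0;
  void push(void*) { ++depth; }
};

// Before: a defensive check that duplicates the callers' filtering.
void enqueue_checked(Queue& q, void* pre_val) {
  if (pre_val != nullptr) q.push(pre_val);
}

// After: the caller contract is asserted (debug builds only), not
// re-checked on every call in product code.
void enqueue_asserted(Queue& q, void* pre_val) {
  assert(pre_val != nullptr && "caller must filter out null pre_val");
  q.push(pre_val);
}
```

The asserted version keeps the invariant visible to readers and debug builds while removing a branch from the hot path.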
------------- Commit messages: - 8259020: null-check of g1 write_ref_field_pre_entry is not necessary Changes: https://git.openjdk.java.net/jdk/pull/1913/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1913&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8259020 Stats: 5 lines in 1 file changed: 0 ins; 3 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1913.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1913/head:pull/1913 PR: https://git.openjdk.java.net/jdk/pull/1913