Stacks, safepoints, snapshotting and GC

Erik Österlund erik.osterlund at
Thu Jan 9 23:02:09 UTC 2020

Hi Stuart,

The plan is to utilize something I call a stack watermark barrier. It involves rewriting the thread-local handshakes of today to use a conditional branch. For the polls we have today on returns, the conditional branch will compare the stack pointer (rsp) to a thread-local value. This improved polling scheme can trap thread execution for 1) safepoints, 2) handshakes, 3) returns to unprocessed frames. This type of barrier allows the GC to concurrently slide the watermark (disarm frames) without having to poke at it.
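
To make the idea concrete, here is a rough single-threaded model of such a poll. It is only a sketch under stated assumptions, not the HotSpot implementation: the names (SimThread, poll_word, gc_watermark, slide_watermark), the sentinel values, and the choice of a downward-growing stack where "unprocessed" means "above the watermark" are all made up for illustration. A real version would need atomic accesses and would emit the compare-and-branch in compiled code.

```python
from dataclasses import dataclass, field

# Assumed sentinels: stack addresses lie strictly between these two values.
NO_TRAP = 2**64 - 1   # above any stack address: no return ever traps
ALWAYS_TRAP = 0       # below any stack address: every poll traps


@dataclass
class SimThread:
    """Hypothetical model of one thread; the stack grows downward."""
    poll_word: int = NO_TRAP       # thread-local value the poll compares against
    gc_watermark: int = NO_TRAP    # frames above this address are unprocessed
    safepoint_pending: bool = False
    handshake_pending: bool = False
    log: list = field(default_factory=list)

    def return_poll(self, caller_sp: int) -> None:
        # Fast path: one compare plus one conditional branch.
        if caller_sp > self.poll_word:
            self._slow_path(caller_sp)

    def _slow_path(self, caller_sp: int) -> None:
        # The same trap serves all three purposes named in the mail.
        if self.safepoint_pending:
            self.log.append("safepoint")
            self.safepoint_pending = False
        if self.handshake_pending:
            self.log.append("handshake")
            self.handshake_pending = False
        if caller_sp > self.gc_watermark:
            # Returning into a frame the GC has not processed yet.
            self.log.append(f"process frame at {caller_sp:#x}")
            self.gc_watermark = caller_sp
        self._rearm()

    def _rearm(self) -> None:
        pending = self.safepoint_pending or self.handshake_pending
        self.poll_word = ALWAYS_TRAP if pending else self.gc_watermark

    def slide_watermark(self, new_watermark: int) -> None:
        # GC side: frames up to new_watermark are now processed (disarmed).
        self.gc_watermark = new_watermark
        self._rearm()

    def request_safepoint(self) -> None:
        self.safepoint_pending = True
        self._rearm()

    def request_handshake(self) -> None:
        self.handshake_pending = True
        self._rearm()
```

In this model the GC only ever updates the thread-local value; the running thread notices on its next return, which is the sense in which the watermark can slide without the GC having to stop or otherwise disturb the thread.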

There won’t be any foreseeable issues with weak memory ordering here. You are gonna love it!


> On 9 Jan 2020, at 18:06, Stuart Monteith <stuart.monteith at> wrote:
> Hi Per,
>   I'm curious as to how we'll implement this on Aarch64, the concern
> being how this would be made efficient with a weak memory model. Are
> you looking at using return barriers?
> BR,
>   Stuart
>> On Tue, 7 Jan 2020 at 12:59, Per Liden <per.liden at> wrote:
>> While we're on this topic I thought I could mention that part of the
>> plans to make ZGC a true sub-millisecond max pause time GC includes
>> removing thread stacks completely from the GC root set. I.e. in such a
>> world ZGC will not scan any thread stacks (virtual or not) during STW,
>> instead they will be scanned concurrently.
>> But we're not quite there yet...
>> cheers,
>> Per
>>> On 12/19/19 1:52 PM, Ron Pressler wrote:
>>> This is a very good question. Virtual thread stacks (which are actually
>>> continuation stacks from the VM’s perspective) are not GC roots, and so are
>>> not scanned as part of the STW root-scanning. How and when they are scanned
>>> is one of the core differences between the default implementation and the
>>> new one, enabled with -XX:+UseContinuationChunks.
>>> Virtual threads shouldn’t make any impact on time-to-safepoint, and,
>>> depending on the implementation, they may or may not make an impact
>>> on STW young-generation collection. How the different implementations
>>> impact ZGC/Shenandoah, the non-generational low-pause collectors, is yet
>>> to be explored and addressed. I would assume that their current impact
>>> is that they simply crash them :)
>>> - Ron
>>>> On 19 December 2019 at 11:40:03, Holger Hoffstätte (holger at) wrote:
>>>> Hi,
>>>> Quick question - not sure if this is an actual issue or something that has
>>>> been addressed yet; pointers to docs welcome.
>>>> How does (or will) Loom impact stack snapshotting and TTSP latency?
>>>> There have been some amazing advances in GC with Shenandoah and ZGC recently,
>>>> but their low pause times directly depend on the ability to quickly reach
>>>> safepoints and take stack snapshots for liveness analysis.
>>>> How will this work with potentially one or two orders of magnitude more
>>>> virtual thread stacks? If I understand correctly TTSP should only depend
>>>> on the number of carrier threads (which fortunately should be much lower
>>>> than in legacy designs), but somehow the virtual stacks still need to be
>>>> scraped... right?
>>>> thanks,
>>>> Holger

More information about the zgc-dev mailing list