Jetty and Loom

Ron Pressler ron.pressler at oracle.com
Tue Jan 5 14:57:12 UTC 2021


BTW, just to help me understand whether there’s something of interest in the
deep-stacks case: when the Loom test had a 1.5 s GC pause and the platform threads
had none, how many actual GC collections happened in the platform-thread case?

The reason I’m asking is that it’s possible the GC does the same amount of work
in both cases, and it’s just that a GC was never triggered in the platform-thread
case, whereas in a real application it would be.

If a GC is triggered in both cases, then the two cases *should* require a similar
amount of work from the GC, but due to a bug in Loom, virtual threads *may* require
more. If that is the case, that’s another thing to fix.
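
A minimal sketch of one way to check that, using the standard
GarbageCollectorMXBean API (the class name here is illustrative; running with
-Xlog:gc and counting the collections in the log works just as well):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounts {
    // Print per-collector collection counts and accumulated pause time;
    // run this at the end of both the platform-thread and virtual-thread tests.
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}

Comparing the per-collector counts at the end of each run would show whether the
platform-thread case simply never triggered a collection.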

— Ron


On 5 January 2021 at 14:47:22, Ron Pressler (ron.pressler at oracle.com) wrote:

> Both the 4% CPU increase and the GC pauses (I’ll get to what I mean later) are
> bugs that we’ll try to fix. The GC interaction, especially, uses code that is
> currently in constant flux and is known to be suboptimal. I’m genuinely happy
> that you’ve reported those bugs, but bugs are not limitations of the model.
>
> Having said that, the interesting thing I see in the GC behaviour may not be
> what you think is interesting. I don’t think the deep-stack test actually
> exposed a problem of any kind, because when two things have slightly different
> kinds of overhead, you can easily reduce the actual work to zero, and
> make the impact of overhead as high as you like, but that’s not interesting
> for real work. I could be wrong, but I’ll give it another look.
>
> The one thing in the posts — and thank you for them! — that immediately flashed
> blinking red at me as a serious issue is the following:
>
> Platform:
>
> Elapsed Time: 10568 ms
> Time in Young GC: 5 ms (2 collections)
> Time in Old GC: 0 ms (0 collections)
>
> Virtual:
>
> Elapsed Time: 10560 ms
> Time in Young GC: 23 ms (8 collections)
> Time in Old GC: 0 ms (0 collections)
>
> See that increase in young-collection pause time? That is the one thing that
> actually touches on a core issue in the virtual-threads design (they interact
> with the GC and GC barriers in a particular way that could affect young
> collections), and it might signify a potentially serious bug.
>
> And no, it is not only not obvious but downright wrong that moving stacks from
> C stacks to the Java heap increases GC work, assuming there is actual real
> work in the system, too. GCs generally don’t work the way people imagine they do.
> The reason I said that GC work might be reduced comes down to some internal
> details: virtual thread stacks are mutated in a special way and at a special
> time so that they don’t require GC barriers; this is not true for Java objects
> in general.
>
> I’m reminded that about a year ago, I think, I saw a post about some product
> written in Java. The product appears to be very good, but the post said
> something specific that induced a face-palm. They said that their product is
> GC “transparent” because they do all their allocations upfront. I suggested
> that instead of just using Parallel GC they try G1, and they immediately
> came back, surprised, to report a 15% performance hit. The reason
> is that allocations and mutations (and even reads) cause different work at
> different times in different GCs, and mutating one object at one specific time
> might be more or less costly than allocating a new one, depending on the
> GC and on the precise usage and timing of that particular object.
>
> The lesson is that trying to reverse-engineer and out-think the VM is
> futile — not only because there are too many variables but also because
> the implementation is constantly changing — and it can result in downright bad
> advice that results from overfitting to very particular circumstances.
>
> Instead, it’s important to focus on generalities. The goal of project Loom
> is to make resource management around scheduling easy and efficient. When it
> doesn’t do that, it’s a bug. I don’t agree at all with your characterisation
> of what’s a limitation and what isn’t, but I don’t care: think of them however
> you like. If you find bugs, we all win! But try harder, because I think you’ve
> just scratched the surface.
>
> — Ron
>
>
> On 5 January 2021 at 13:58:40, Greg Wilkins (gregw at webtide.com) wrote:
>
> >
> > Ron,
> >
> > On Tue, 5 Jan 2021 at 13:19, Ron Pressler wrote:
> > > If the listener might think it means that virtual
> > > threads somehow *harm* the execution of CPU bound tasks, then it’s misleading.
> > I've demonstrated (https://github.com/webtide/loom-trial/blob/main/src/main/java/org/webtide/loom/CPUBound.java) that using virtual threads can defer CPU-bound tasks.
> > I've demonstrated (https://webtide.com/do-looms-claims-stack-up-part-2/) that using virtual threads can double the CPU usage compared with pooled kernel threads. Even their best usage in my tests has a 4% CPU usage increase.
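> >
> > For illustration, a minimal sketch of the general shape of such a test (not the linked CPUBound.java; it uses the current Loom API, which has changed across builds, and the sizes are arbitrary): saturate the carrier threads with CPU-bound work and see how long a later task waits before it even starts, since virtual threads are not time-sliced by the scheduler.
> >
> > import java.util.concurrent.ExecutorService;
> > import java.util.concurrent.Executors;
> >
> > public class DeferralSketch {
> >     // Busy-spin so the task never parks and is never descheduled by the JVM.
> >     static void spin(long millis) {
> >         long end = System.nanoTime() + millis * 1_000_000;
> >         while (System.nanoTime() < end) { /* burn CPU */ }
> >     }
> >
> >     public static void main(String[] args) throws Exception {
> >         int carriers = Runtime.getRuntime().availableProcessors();
> >         long submitted = System.nanoTime();
> >         // Swap in Executors.newThreadPerTaskExecutor(Thread.ofPlatform().factory())
> >         // to compare against OS-scheduled platform threads.
> >         try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
> >             for (int i = 0; i < carriers; i++) {
> >                 exec.submit(() -> spin(5_000));   // occupy every carrier thread
> >             }
> >             exec.submit(() -> System.out.printf(
> >                     "latecomer started after %d ms%n",
> >                     (System.nanoTime() - submitted) / 1_000_000));
> >         }
> >     }
> > }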
> >
> > > The “additional load on GC” statement is not, I believe, demonstrated.
> >
> > I've demonstrated (https://webtide.com/do-looms-claims-stack-up-part-1/) 1.5s GC pauses when using virtual threads at levels that kernel threads handle without pause.
> >
> > Besides, isn't it self-evident that moving stacks from static kernel memory to the dynamic heap is going to add GC load? You've even described how recycling virtual threads will not help reduce that additional load on the GC, as a reason not to pool virtual threads!
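> >
> > As a minimal sketch of that deep-stack scenario (not the benchmark from the post; the depth and thread count are arbitrary): each virtual thread recurses and then parks, so its frames are captured into heap objects that stay live, and visible to the GC, until it is unparked.
> >
> > import java.util.concurrent.locks.LockSupport;
> >
> > public class DeepStackSketch {
> >     // Build a deep stack, then park: a parked virtual thread's frames live
> >     // on the Java heap rather than in a fixed native stack.
> >     static void recurse(int depth) {
> >         if (depth == 0) {
> >             LockSupport.parkNanos(10_000_000_000L);   // hold the deep stack for ~10s
> >         } else {
> >             recurse(depth - 1);
> >         }
> >     }
> >
> >     public static void main(String[] args) throws Exception {
> >         Thread[] threads = new Thread[1_000];
> >         for (int i = 0; i < threads.length; i++) {
> >             threads[i] = Thread.ofVirtual().start(() -> recurse(1_000));
> >         }
> >         for (Thread t : threads) {
> >             t.join();
> >         }
> >     }
> > }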
> >
> > > It is tautologically true that if your use case does not benefit from virtual
> > > threads then it does not benefit from virtual threads.
> >
> > Indeed, but half the time it is not clear that you acknowledge that there are use cases that are not suitable for virtual threads. Just paragraphs above you are implying that there is "no *harm*" in using virtual threads for CPU-bound tasks!
> >
> > > > Totally confused by the messaging from this project.
> > > I’m confused by what you find confusing.
> >
> > This is not just a hive-mind issue, as the messaging just from you is inconsistent. One moment you are happy to describe limitations of virtual threads and to agree that there are use cases that do not benefit. Then the next moment we are back to "what limitations", "no harm", "not demonstrated", etc.
> >
> > None of the issues I've demonstrated are fatal flaws. Some may well be fixable, whilst others are just things to note when making a thread choice. But to deny them just encourages dwelling on the negatives rather than the positives!
> >
> > cheers
> >
> > --
> > Greg Wilkins CTO http://webtide.com


