RFR(M): 8231460: Performance issue (CodeHeap) with large free blocks
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Nov 4 17:04:41 UTC 2019
Hi Andrew, Lutz,
I agree with Lutz in this case. I think the patch complexity could be
reduced (see my mail to Lutz) if complexity is a concern, but I like most
of these changes, and the comment improvements are nice.
Just my five cents.
Cheers, Thomas
On Mon, Nov 4, 2019 at 4:36 PM Schmidt, Lutz <lutz.schmidt at sap.com> wrote:
> Hi Andrew,
>
> thank you for your thoughts. I do not agree with your conclusion, though.
>
> There are two bottlenecks in the CodeHeap management code. One is in
> CodeHeap::mark_segmap_as_used(), uncovered by OverflowCodeCacheTest.java.
> The other is in CodeHeap::add_to_freelist(), uncovered by
> StressCodeCacheTest.java.
>
> Both bottlenecks are tackled by the recommended changeset.
>
> CodeHeap::mark_segmap_as_used() is no longer O(n*n) for the critical
> "FreeBlock-join" case. It actually is O(1) now. The time reduction from
> more than 80 seconds to just a few milliseconds proves that point.
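>
> To illustrate (a deliberately simplified model, not the actual HotSpot
> sources; names and the chaining constant are made up): each segment map
> entry stores a backward step towards the start of its block, so marking
> a whole block costs O(len), while the FreeBlock-join case only needs to
> patch the first entry of the right-hand block:
>
>   #include <cstddef>
>   #include <cstdint>
>   #include <vector>
>
>   // Largest backward step a single segment map entry can encode.
>   static const uint8_t max_step = 0xFE;
>
>   // Unoptimized marking: write a backward step into every entry of
>   // the block -- O(len) stores.
>   void mark_segmap_full(std::vector<uint8_t>& segmap,
>                         size_t beg, size_t len) {
>     for (size_t i = 0; i < len; i++) {
>       segmap[beg + i] = (uint8_t)(i < max_step ? i : max_step);
>     }
>   }
>
>   // Join case: the left block's entries already chain back to its
>   // start, so only the first entry of the right-hand block must be
>   // redirected into the left block -- O(1), independent of block size.
>   void mark_segmap_join(std::vector<uint8_t>& segmap, size_t right_beg) {
>     segmap[right_beg] = 1;
>   }
>
>   // Block lookup walks the chain of backward steps to the entry
>   // holding 0, i.e. the block start.
>   size_t block_start(const std::vector<uint8_t>& segmap, size_t ix) {
>     while (segmap[ix] > 0) ix -= segmap[ix];
>     return ix;
>   }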
>
> CodeHeap::add_to_freelist() is still O(n*n), with n being the free list
> length. But the point where the non-linearity kicks in has been shifted
> significantly towards larger n. The time reduction from approx. 8 seconds
> to 160 milliseconds supports this statement.
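>
> As a toy model of where the quadratic behavior comes from (simplified,
> not the actual CodeHeap sources): the free list is kept sorted by
> address in a singly linked list, so each insert scans from the head,
> O(n) per call and O(n*n) over n inserts:
>
>   #include <cstddef>
>
>   struct FreeBlock {
>     size_t     addr;  // block start, used as the sort key
>     size_t     len;   // block length in segments
>     FreeBlock* next;
>   };
>
>   // Insert fb, keeping the list sorted by address; returns the head.
>   FreeBlock* add_to_freelist(FreeBlock* head, FreeBlock* fb) {
>     if (head == nullptr || fb->addr < head->addr) {
>       fb->next = head;
>       return fb;
>     }
>     FreeBlock* cur = head;
>     while (cur->next != nullptr && cur->next->addr < fb->addr) {
>       cur = cur->next;  // the linear scan that dominates the runtime
>     }
>     fb->next = cur->next;
>     cur->next = fb;
>     return head;
>   }
>
> Keeping the list address-ordered is presumably what makes merging
> adjacent free blocks cheap, which is why a simple sorted insert is
> attractive in the first place.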
>
> I agree it would be helpful to have a "real-world" example showing some
> improvement. Providing such evidence is hard, though. I could instrument
> the code and print some values from time to time. It's certain this
> additional output will mess up success/failure decisions in our test
> environment. Not sure everybody likes that. But I will give it a try and
> take the hits. This will be a multi-day effort.
>
> On a general note, I am always uncomfortable knowing of an O(n*n) effort,
> in particular when it could be removed or at least tamed considerably.
> Experience tells (at least to me) that, at some point in time, n will be
> large enough to hurt.
>
> I'll be back.
>
> Thanks,
> Lutz
>
>
> On 04.11.19, 11:08, "Andrew Dinn" <adinn at redhat.com> wrote:
>
> Hi Lutz,
>
> I'll summarize my thoughts here rather than answer point by point.
>
> The patch successfully addresses the worst-case performance, but it seems
> to me extremely unlikely that we will see anything that approaches that
> case in real applications. So, that doesn't argue for pushing the patch.
>
> The patch does not seem to make a significant difference to the stress
> test. This test is also not necessarily 'representative' of real cases,
> but it is much more likely to be so than the worst-case test. That
> suggests to me that the current patch is perhaps not worth pursuing (it
> ain't really broke, so ...). Especially given that it is not possible to
> distinguish any benefit when running the SPEC benchmark apps. One could
> argue that the patch looks like it will do no harm and may do good in
> pathological cases, but that's not really a good enough reason to make a
> change. We really need evidence that this is worth doing.
>
> The free list 'search bottleneck' certainly looks like a more promising
> problem to tackle than the 'merge problem'. However, once again this
> 'problem' may just be an artefact of running this specific test rather
> than anything that might happen in real life.
>
> I think the only way to find out for sure whether the current patch, or
> a patch that addresses the 'search bottleneck', is going to be
> beneficial is to instrument the JVM to record traces of code-cache use
> from real apps and then replay allocations/frees based on those traces
> to see what difference a patch makes and how much this might help the
> overall execution time.
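>
> A sketch of what I have in mind (purely hypothetical; the record layout
> and the malloc stand-in for the code heap are my own invention): record
> one (op, id, size) tuple per code-cache event during a real run, then
> replay the sequence against the allocator under test and time it:
>
>   #include <cstdint>
>   #include <cstdlib>
>   #include <map>
>
>   enum class Op : uint8_t { Alloc, Free };
>
>   struct TraceRec {
>     Op       op;    // allocation or free
>     uint64_t id;    // correlates a Free with its earlier Alloc
>     uint32_t size;  // requested size in bytes (Alloc only)
>   };
>
>   // Replay a recorded trace; timing this loop with and without a
>   // patch measures the allocator on a realistic allocation pattern.
>   void replay(const TraceRec* trace, size_t n) {
>     std::map<uint64_t, void*> live;  // id -> currently live block
>     for (size_t i = 0; i < n; i++) {
>       const TraceRec& r = trace[i];
>       if (r.op == Op::Alloc) {
>         live[r.id] = std::malloc(r.size);
>       } else {
>         std::free(live[r.id]);
>         live.erase(r.id);
>       }
>     }
>   }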
>
> regards,
>
>
> Andrew Dinn
> -----------
>
> On 31/10/2019 16:55, Schmidt, Lutz wrote:
> > Hi Andrew, (and hi to the interested crowd),
> >
> > Please accept my apologies for taking so long to get back.
> >
> > These tests (OverflowCodeCacheTest and StressCodeCacheTest) were
> > causing me quite some headaches. Some layer between me and the test
> > prevents the VM (in particular: the VMThread) from terminating
> > normally. The final output from my time measurements is therefore not
> > generated, or is thrown away. Adding to that were some test machine
> > unavailabilities and a bug in my measurement code that caused crashes.
> >
> > Anyway, I added some on-the-fly output, printing the timer values
> > after 10k measurement intervals. This reveals some interesting
> > additional facts about the tests and the CodeHeap management methods.
> > For detailed numbers, refer to the files attached to the bug
> > (https://bugs.openjdk.java.net/browse/JDK-8231460). For even more
> > detail, I can provide the jtr files on request.
> >
> >
> > OverflowCodeCacheTest
> > =====================
> > This test runs (in my setup) with a 1GB CodeCache.
> >
> > For this test, CodeHeap::mark_segmap_as_used() is THE performance
> > hog. 40% of all calls have to mark more than 16k segment map entries
> > (in the non-optimized case). Basically all of these calls convert to
> > len=1 calls with the optimization turned on. Note that during
> > FreeBlock joining, the segment count is forced to 1 (one). No wonder
> > the time spent in CodeHeap::mark_segmap_as_used() collapses from more
> > than 80 sec (half of the test runtime) to less than 100 msec.
> >
> > CodeHeap::add_to_freelist(), on the other hand, is barely observable.
> > The average free list length is two elements, making even a linear
> > search really quick.
> >
> >
> > StressCodeCacheTest
> > ===================
> > With a 1GB CodeCache, this test runs into a 12 min timeout set by our
> > internal test environment. Scaling back to 300MB prevents the test
> > from timing out.
> >
> > For this test, CodeHeap::mark_segmap_as_used() is not a factor. Of
> > 200,000 calls, only a few (less than 3%) had to process a block
> > consisting of more than 16 segments. Note that during FreeBlock
> > joining, the segment count is forced to 1 (one).
> >
> > Another method pops up as the performance hog instead:
> > CodeHeap::add_to_freelist(). More than 8 out of 40 seconds of test
> > runtime (before optimization) are spent in this method, for just
> > 160,000 calls. The test seems to create a long list of non-contiguous
> > free blocks (around 5,500 elements on average). This list is scanned
> > linearly to find the insert point for the free block at hand.
> >
> > CodeHeap::search_freelist() suffers from the long free block list as
> > well, using another 2.7 seconds for its 270,000 calls.
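> >
> > For illustration, a first-fit lookup over such a list (a simplified
> > model, not the actual CodeHeap::search_freelist): with ~5,500
> > entries, every lookup that matches late, or not at all, pays for a
> > walk across most of the list:
> >
> >   #include <cstddef>
> >
> >   struct FreeBlock {   // toy free-list node for the sketch
> >     size_t     addr;   // block start address
> >     size_t     len;    // block length in segments
> >     FreeBlock* next;
> >   };
> >
> >   // Return the first block with at least 'needed' segments, or null.
> >   FreeBlock* search_freelist(FreeBlock* head, size_t needed) {
> >     for (FreeBlock* cur = head; cur != nullptr; cur = cur->next) {
> >       if (cur->len >= needed) {
> >         return cur;  // scan cost grows with the list length
> >       }
> >     }
> >     return nullptr;  // no fit in the free list
> >   }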
> >
> >
> > SPECjvm2008 suite
> > =================
> > With respect to the task at hand, this is a well-behaved test suite.
> > Timing shows some before/after difference, but nothing spectacular.
> > The measurements do not provide evidence of a performance bottleneck.
> >
> >
> > There were some minor adjustments to the code. Unused code blocks
> > have been removed as well. I have therefore created a new webrev. You
> > can find it here:
> > http://cr.openjdk.java.net/~lucy/webrevs/8231460.01/
> >
> > Thanks for investing your time!
> > Lutz
> >
> >
> > On 21.10.19, 15:06, "Andrew Dinn" <adinn at redhat.com> wrote:
> >
> > Hi Lutz,
> >
> > On 21/10/2019 13:37, Schmidt, Lutz wrote:
> > > I understand what you are interested in. And I was hoping to be
> > > able to provide some (first) numbers by today. Unfortunately, the
> > > measurement code I activated last Friday was buggy and blew up most
> > > of the tests I had hoped to run over the weekend.
> > >
> > > I will take your modified test and run it with and without my
> > > optimization. In parallel, I will try to generate some (non-random)
> > > numbers for other tests.
> > >
> > > I'll be back as soon as I have results.
> >
> > Thanks for trying the test and also for deriving some call stats from
> > a real example. I'm keen to see how much your patch improves things.
> >
> > regards,
> >
> >
> > Andrew Dinn
> > -----------
> > Senior Principal Software Engineer
> > Red Hat UK Ltd
> > Registered in England and Wales under Company Registration No. 03798903
> > Directors: Michael Cunningham, Michael ("Mike") O'Neill
> >
> >
> >
> >
>
>
>
>