JDK-8229517: Buffered Logging?
Ramki Ramakrishna
rramakrishna at twitter.com
Thu Jan 30 23:11:50 UTC 2020
I sent a few follow up emails earlier today but because I was not (at that
point) subscribed to the mailing list, the emails are awaiting approval
(but direct addressees would have received them). May be Colleen can
moderate/approve those messages through. Will attach patches to ticket
soon. Sorry for my tardiness.
-- ramki
On Thu, Jan 30, 2020 at 2:28 PM David Holmes <david.holmes at oracle.com>
wrote:
> Hi,
>
> On 26/01/2020 9:37 am, Liu, Xin wrote:
> > Dear hotspot developers,
> >
> > I think runtime might be a better forum to continue this discussion.
> > The JBS: https://bugs.openjdk.java.net/browse/JDK-8229517
> >
> > Now our latest patch is here:
> https://cr.openjdk.java.net/~xliu/8229517/webrev02/webrev/
> > I know it’s based on jdk8u. I really appreciate if someone reviews
> it and gives me advice. If the community is still interested, we can keep
> developing using better approach or on the tip of jdk.
>
> When this was previously raised by Ramki it was stated that the 8u
> solution had already been ported to Unified Logging in JDK 11. That is
> the proposal that we would need to see to consider inclusion of this
> feature in mainline JDK.
>
> This proposal seems to be trying to solve a very specific problem in a
> very specific context. It is hard for me to see how this may work as a
> general facility or all the potential bad interactions that may occur.
> It would need to be well isolated so that if you don't ask for it it has
> zero affect. It would also have to be disabled by default.
>
> I cannot really speak to the merits of the proposal in general as I
> don't work in such environments. Ramki originally solicited feedback to
> see if there "was an appetite for this in openjdk". I'm unclear if that
> feedback was obtained elsewhere or whether this is rebooting the
> feedback process?
>
> Thanks,
> David
> -----
>
> > I’d like to present our experiments here. I try to post data on the
> JDK-8229517, but It doesn't look good in comment area.
> >
> > Experiments
> > We can’t reproduce the problem in [1] because harddisks are not
> accessible for us. To simulate IO blocking in JVM, we introduce a bogus
> delay. It spins DelayInGCLogging milliseconds for every fwrite.
> > notproduct(intx, DelayInGCLogging, 0,
> > "Delay milliseconds to simulate IO blocking for GC log testing")
> >
> > It spins in gcLogFileStream::write_blocking.
> https://cr.openjdk.java.net/~xliu/8229517/webrev02/webrev/src/share/vm/utilities/ostream.cpp.cdiff.html
> >
> > We observe GC pause time using -XX:+PrintGCApplicationStoppedTime
> for JavaWorkload with 2g heap. The unit is second. The program is on
> github. https://github.com/zhenyun/JavaGCworkload.
> > $java -Xmx2g -Xms2g -Xloggc:gc.log -XX:+UseG1GC
> -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails
> -XX:+UnlockExperimentalVMOptions Test
> > sync means jdk8u-dev. async means jdk8u-dev with nonblocking gclog
> change. delayXXX denotes an artifactual delay given by
> -XX:DelayInGCLogging=XXX.
> >
> > median p95 p99
> p100 losses
> > -------------------------------------------------------------------
> > async 0.01357 0.36187 0.48691 0.61792
> > sync 0.01339 0.3618 0.48646 0.63539
> > -------------------------------------------------------------------
> > delay5ms-async 0.01516 0.40564 0.486 0.61053 148156
> > delay5ms-sync 0.20444 0.4428 0.56733 0.69522
> > -------------------------------------------------------------------
> > delay10ms-async0.0123 0.47451 0.48623 0.61083 195460
> > delay10ms-sync 0.38732 0.52424 0.64783 0.78087
> > -------------------------------------------------------------------
> >
> > async and sync perform equally in happy case. JavaWorkload didn’t
> lose any GC log even though the GC activities are intensive and the outputs
> are verbose. If we introduce 5 ms delay for each file-writing,
> delay5ms-sync starts to show longer stop time for p99 and p100. eg. p99
> increases from 0.486s to 0.567s, which is 16,7% longer. delay5ms-async
> remains the same p99 and p100 at expense of losing 148k logging events.
> Because the outputs GC logs are contiguous, they are still readable and
> helpful for problem analysis. After we increase delay to 10ms,
> delay10ms-async still have the same p99 and p100 stop time while
> delay10ms-sync has 22.9% longer stop time.
> >
> > [1]Eliminating Large JVM GC Pauses Caused by Background IO Traffic
> >
> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
> >
> > Thanks,
> > --lx
> >
> >
> >
> >
> >
> > From: Ramki Ramakrishna <rramakrishna at twitter.com>
> > Date: Tuesday, August 13, 2019 at 4:16 PM
> > To: "hotspot-dev at openjdk.java.net" <hotspot-dev at openjdk.java.net>
> > Cc: "Liu, Xin" <xxinliu at amazon.com>, "Mathiske, Bernd" <
> mathiske at amazon.com>, "Hohensee, Paul" <hohensee at amazon.com>, John Coomes
> <jcoomes at twitter.com>, Tony Printezis <tprintezis at twitter.com>
> > Subject: Re: Buffered Logging?
> >
> >
> > I filed https://bugs.openjdk.java.net/browse/JDK-8229517. Please
> correct/complete if I mangled or neglected to fill any specific required
> fields.
> >
> > Patches forthcoming in the next few days.
> > -- ramki
> >
> > On Tue, Aug 13, 2019 at 2:53 PM Ramki Ramakrishna <mailto:
> rramakrishna at twitter.com> wrote:
> >
> > I'll go ahead and open a JBS ticket to keep the ball rolling on
> this one, and so we have a place to hang some patches and discussions off
> of. Scream if I shouldn't.
> >
> > -- ramki
> >
> > On Mon, Aug 12, 2019 at 6:19 PM Ramki Ramakrishna <mailto:
> rramakrishna at twitter.com> wrote:
> >
> > (Resent... Apologies to anyone who might see duplicates. There was
> an issue with my membership in the list which, I am guessing, has now been
> resolved.)
> >
> > ---------- Forwarded message ----------
> > From: Ramki Ramakrishna <mailto:rramakrishna at twitter.com>
> > To: mailto:hotspot-dev at openjdk.java.net
> > Cc: "Liu, Xin" <mailto:xxinliu at amazon.com>, "Mathiske, Bernd"
> <mailto:mathiske at amazon.com>, "Hohensee, Paul" <mailto:hohensee at amazon.com>,
> John Coomes <mailto:jcoomes at twitter.com>, Tony Printezis <mailto:
> tprintezis at twitter.com>
> > Bcc:
> > Date: Mon, 12 Aug 2019 15:48:17 -0700
> > Subject: Buffered Logging?
> >
> > In the JDK 7 time-frame, some folks noticed much longer than
> expected pauses that seemed to coincide with GC logging in the midst of a
> GC safepoint. In that setup, the GC logs were going to a disk file (these
> were often useful for post-mortem analyses) rather than to a RAM-based
> tmpfs which had been the original design center assumption. The
> vicissitudes of the dirty page flushing policy in Linux when
> > IO load on the machine (not necessarily the JVM process doing the
> logging)
> > could affect the length and duration of these inline logging stalls.
> >
> > A buffered logging scheme was then implemented by us (and
> independently by
> > others) which we have used successfully to date to avoid these
> pauses in high i/o
> > multi-tenant environments.
> >
> > We have recently ported the OpenJDK 8u based scheme past the
> Unified Logging related changes and have been using it on OpenJDK 11u.
> >
> > In various fora, some of us have heard people express an interest
> in such an implementation.
> >
> > Would there be appetite for this in openjdk (there are some
> differences between our scheme in OpenJDK 8u and their subsequent
> post-Unified-Logging port to OpenJDK 11)?
> >
> > If so, we'd like to open a JBS ticket, attach some patches rebased
> to current OpenJDK tip, and start preparing a webrev for discussion with a
> view to possible upstreaming if it converges past review/discussion/debate.
> >
> > PS: In recent discussion at the JVMLS, mention was made of
> something that might be similar in support of JFR streaming, so our rebase
> to tip and familiarization with any JFR-associated buffered logging might
> well make this easier or, possibly, moot.
> >
> > Thoughts, questions, comments, feedback?
> > -- ramki
> >
> > --
> > JVM Team, Platform Engineering, Twitter (San Francisco)
> >
> >
> >
> > --
> > JVM Team, Infrastructure Engineering, Twitter (San Francisco)
> >
> >
> >
> > --
> > JVM Team, Infrastructure Engineering, Twitter (San Francisco)
> >
> >
> >
>
--
JVM Team, Infrastructure Engineering, Twitter (San Francisco)
More information about the hotspot-runtime-dev
mailing list