Garbage Collection Pauses & Non-interruptable System Calls

Wed Apr 15 17:16:40 UTC 2009

A good page to read.  Take a look at the Thread Management section too, esp.
VM Operations and Safepoints.

On x86/x64/SPARC the polling scheme described there (a thread asking
'should I block for a safepoint?') works by issuing a read from a normally
readable page called the 'polling page'.  When the VM thread calls a
safepoint, it sets the polling page protection to no access, which causes
a thread issuing the read (we call the read a 'safepoint poll') to fault,
trap and block.  This scheme would work on Itanium as well, though one
could also use a branch on one of the condition code bits instead of a load.
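
As a rough illustration only (this is not HotSpot code), the sketch below
mimics the poll-and-block protocol in Python.  A boolean flag stands in for
the polling page's protection, and a condition variable stands in for the
fault/trap path.  The names Safepoint, poll, run_at_safepoint and demo are
made up for the example.

```python
import threading
import time


class Safepoint:
    """Cooperative safepoint: workers poll, the coordinator waits for them to park."""

    def __init__(self, n_workers):
        self._cond = threading.Condition()
        self._requested = False   # stands in for "polling page is set to no access"
        self._parked = 0          # number of workers currently blocked in poll()
        self._n_workers = n_workers

    def poll(self):
        # Fast path: a cheap check, analogous to the read from the polling page.
        if not self._requested:
            return
        # Slow path: analogous to the fault/trap handler blocking the thread.
        with self._cond:
            self._parked += 1
            self._cond.notify_all()
            while self._requested:
                self._cond.wait()
            self._parked -= 1

    def run_at_safepoint(self, operation):
        with self._cond:
            self._requested = True                    # "protect the polling page"
            while self._parked < self._n_workers:
                self._cond.wait()                     # wait for every worker to park
            result = operation()                      # all workers are blocked here
            self._requested = False                   # "unprotect the polling page"
            self._cond.notify_all()
        return result


def demo(n_workers=3):
    sp = Safepoint(n_workers)
    stop = threading.Event()
    counts = [0] * n_workers

    def worker(i):
        while not stop.is_set():
            counts[i] += 1    # ordinary work making progress
            sp.poll()         # safepoint poll, e.g. at a loop back-edge

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()

    def operation():
        before = list(counts)
        time.sleep(0.05)
        return before == list(counts)   # True if no worker ran during the operation

    frozen = sp.run_at_safepoint(operation)
    stop.set()
    for t in threads:
        t.join()
    return frozen
```

Calling demo() reports whether the workers' counters stayed unchanged while
the operation ran.  The real mechanism avoids even the flag check on the
fast path, since the poll is just a load that normally succeeds.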

Paul

Mark R Maxey wrote:
>
> Here is the URL I tried to include:
>
> http://openjdk.java.net/groups/hotspot/docs/RuntimeOverview.html#Java%20Native%20Interface%20(JNI)|outline 
>
>
>
> We are using the concurrent GC with a single generation.  After we
> work around this immediate issue, we plan to migrate to multiple
> generations and tweak the sizes of the generations.
>
> It is encouraging to hear that HotSpot uses a different algorithm.
> I'll see what I can do to accelerate its usage.
>
> I'd like to understand HotSpot's behavior a little better.  It sounds
> like your algorithm can do the mark & sweep while a Java thread is in
> native code, without pausing that thread?
>
> Thanks for all the feedback.  I'll continue reading ...
>
>
> Mark Maxey
> Raytheon, Garland
> 580/2/P22-1
> (972)205-5760
> Mark_R_Maxey at Raytheon.com
>
>
> *Paul Hohensee <Paul.Hohensee at Sun.COM>*
> Sent by: Paul.Hohensee at Sun.COM
> 04/14/2009 05:06 PM
>
> To: Mark R Maxey <Mark_R_Maxey at RAYTHEON.COM>
> cc: "Y. Srinivas Ramakrishna" <Y.S.Ramakrishna at Sun.COM>,
>     hotspot-gc-dev at openjdk.java.net,
>     Andrew M Dungan <Andrew_M_Dungan at RAYTHEON.COM>,
>     David A Lilly <David_A_Lilly at RAYTHEON.COM>
> Subject: Re: Garbage Collection Pauses & Non-interruptable System Calls
>
> The url for "Hotspot Runtime Overview of JNI" didn't come through for me.
>
> In any case, as Ramki noted, Hotspot lets native code called from Java
> run free during GC pauses, unless said native code calls back into the
> JVM for some reason, including just returning back to Java from native.
> Hotspot does _not_ try to pause threads executing native code, nor does
> Hotspot use signal mechanisms to block threads executing Java code so GC
> can happen.  Hotspot uses a polling mechanism instead because signal
> delivery mechanisms on Unix and Linux are unreliable.  I doubt you'll be
> able to reproduce the JRockit issue with Hotspot.  You might have other
> problems, but not that one.
>
> I suggest forwarding your questions to the JRockit team at Oracle,
> though I suspect some of them are on this list too. :)
>
> Please try Hotspot before you give up on Java.  From your description,
> you should use the CMS (concurrent mark-sweep) collector.  See also the
> GC performance info accessible from
>
> http://java.sun.com/performance
>
> and Jon Masamitsu's blog
>
> http://blogs.sun.com/jonthecollector/
>
> Paul
>
> Mark R Maxey wrote:
> >
> > After reading the last paragraph of HotSpot Runtime Overview of JNI
> > more closely, I understand more.  I think we're almost on the same
> > page.  The problem seems to be that all threads are suspended until
> > the Java thread returns from the native call.
> >
> >
> > Mark Maxey
> > Raytheon, Garland
> > 580/2/P22-1
> > (972)205-5760
> > Mark_R_Maxey at Raytheon.com
> >
> >
> > *Mark R Maxey/US/Raytheon*
> > 04/14/2009 12:37 PM
> >
> > To: "Y. Srinivas Ramakrishna" <Y.S.Ramakrishna at Sun.COM>
> > cc: hotspot-gc-dev at openjdk.java.net, Y.S.Ramakrishna at Sun.COM,
> >     Andrew M Dungan/US/Raytheon at MAIL,
> >     David A Lilly/RCS/Raytheon/US at MAIL,
> >     Mark R Maxey/US/Raytheon at MAIL
> > Subject: Re: Garbage Collection Pauses & Non-interruptable System Calls
> >
> > Thank you for your reply.  The speed and depth of your response is
> > encouraging.
> >
> > Let me confess something I should have said up front.  The behavior
> > we're seeing is using JDK 5 via JRockit R27.6.  We're in the process
> > of reproducing these problems under HotSpot JDK 6 Update 12, though
> > it'll be a few days before we can do so.  The reason I'm pinging this
> > forum is to research in advance what differences we might expect
> > between the two JVMs.
> >
> > Let me describe exactly what we're seeing, as shown by running
> > strace on the process:
> >
> >    1. A Java thread calls native C code that ultimately calls
> >       pwrite().  We suspect that the device driver ultimately makes a
> >       non-interruptable system call to transfer the data directly from
> >       our mem-aligned 128 MB buffer to disk.
> >    2. The GC thread sends a tgkill(SIGUSR1) to all threads.
> >    3. The GC thread waits on mutex #1 (presumably waiting on all the
> >       threads to signal it that it can begin GC)
> >    4. The Java thread wakes mutex #1 (presumably signaling the GC it
> >       is ready to go)
> >    5. The Java thread waits on mutex #2 (presumably waiting on GC to
> >       finish)
> >    6. The GC thread wakes mutex #2 (presumably telling the Java thread
> >       it can resume processing)
> >
> >
> > We're seeing times between #3 & #4 that are proportional to the amount
> > of time spent in the pwrite().  We also see some overhead between #5
> > & #6 that is proportional to the number of Java threads we have
> > (currently between 30 & 40 that we've created, not counting the JVM's).
> >
> > Unfortunately, the JRockit logging only reveals the actual time GC
> > takes (#4 - #5).  Hopefully, HotSpot's logging includes the total time
> > (#2 - #6).
> >
> > I'm pursuing these questions with Oracle/BEA.  Again, I'm just trying
> > to get a feel for HotSpot's behavior in comparison.  While we're using
> > JRockit today, HotSpot will be our ultimate platform.
> >
> >
> > One alternate solution that has been suggested is infrequently calling
> > GC explicitly within our code during special times when we know we can
> > afford to take the hit.  We would even accept a greater hit than
> > normal if we could avoid being impacted during critical times.  
> > Everything I've ever read says to not do this, but I'm curious why in
> > this case this is a bad idea.  Note that we're using the concurrent
> > GC, so I'm not even sure if System.gc() supports this.
> >
> >
> > Thanks again!
> >
> >
> > Mark Maxey
> > Raytheon, Garland
> > 580/2/P22-1
> > (972)205-5760
> > Mark_R_Maxey at Raytheon.com
> >
> >
> > *"Y. Srinivas Ramakrishna" <Y.S.Ramakrishna at Sun.COM>*
> > Sent by: Y.S.Ramakrishna at Sun.COM
> > 04/14/2009 10:19 AM
> >
> > To: Mark R Maxey <Mark_R_Maxey at raytheon.com>
> > cc: hotspot-gc-dev at openjdk.java.net
> > Subject: Re: Garbage Collection Pauses & Non-interruptable System Calls
> >
> > Hello Mark --
> >
> > I am assuming your threads doing DMA are actually executing native
> > code (or waiting for signals in native code).  Threads in native code
> > do not need to synchronize in any manner with GC while they are
> > executing native code.  It is only the transitions to and from native
> > mode (from Java code) that require synchronization.  Roughly speaking,
> > the JVM fences off those native threads so that, in the event that they
> > need to re-enter the JVM or access the Java heap, they will be suspended
> > until a GC/safepoint that is in progress is completed.
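
A minimal sketch of that fencing idea, written in Python purely as a model
and not as a description of HotSpot internals: a thread that has made the
Java-to-native transition is not waited for, and only blocks if it tries to
come back while the safepoint operation is still running.  The Runtime
class, its method names and demo are hypothetical.

```python
import threading
import time


class Runtime:
    """Toy model: only threads 'running Java' can hold up a safepoint."""

    def __init__(self):
        self._cond = threading.Condition()
        self._safepoint = False
        self._running_java = 0

    def enter_java(self):
        # Native -> Java transition: blocks while a safepoint is in progress.
        with self._cond:
            while self._safepoint:
                self._cond.wait()
            self._running_java += 1

    def enter_native(self):
        # Java -> native transition: this thread no longer delays a safepoint.
        with self._cond:
            self._running_java -= 1
            self._cond.notify_all()

    def safepoint(self, operation):
        with self._cond:
            self._safepoint = True
            while self._running_java > 0:
                self._cond.wait()
            try:
                return operation()
            finally:
                self._safepoint = False
                self._cond.notify_all()


def demo():
    rt = Runtime()
    in_native = threading.Event()
    io_done = threading.Event()
    log = []

    def java_thread():
        rt.enter_java()
        rt.enter_native()        # the JNI call begins
        in_native.set()
        io_done.wait()           # stands in for a long, uninterruptible pwrite()
        rt.enter_java()          # returning from JNI: may have to wait here
        log.append("back in Java")

    t = threading.Thread(target=java_thread)
    t.start()
    in_native.wait()

    def gc():
        log.append("gc start")   # started without waiting for the I/O
        io_done.set()            # the I/O happens to finish during the GC
        time.sleep(0.05)
        log.append("gc end")

    rt.safepoint(gc)
    t.join()
    return log
```

In this model demo() returns the events in the order gc start, gc end,
back in Java: the collection begins while the native call is still
blocked, and the thread is held at the transition until the collection
finishes.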
> >
> > Thus, I do not believe you need to fear that a long-running DMA call
> > would cause GCs to be delayed (which I understand is your main concern
> > below).
> >
> > Have you actually seen cases where this is happening?  If so, what
> > version of the JDK are you running?
> >
> > thanks.
> > -- ramki
> >
> > Mark R Maxey wrote:
> > > Hello,
> > >
> > > I have a problem with which I need some advice.
> > >
> > > We wrote a custom JNI library for file I/O that sits underneath the
> > > Java NIO FileChannel.  One of our driving requirements is highly
> > > performant file I/O.  We achieved this by doing DMA I/O from large,
> > > direct, memory-aligned buffers.  The JNI code is trivial: it just takes
> > > a buffer and performs the appropriate system call based on the
> > > parameters given to it.  100% of the logic for calculating offsets,
> > > buffer management, etc. is in our implementation of
> > > java.nio.FileChannel.
> > >
> > > Here's our problem:  We have requirements to respond to some messages
> > > in as little as 250 ms.  During this time, we're doing file writes of
> > > 128 MB that take around 200 ms.  When GC kicks in, it tries to pause
> > > all threads.  Because the DMA write is non-interruptable, GC waits for
> > > the I/O to complete before being able to pause the thread & run.  That
> > > means that GC can take well over 200 ms, putting us in grave danger of
> > > missing our timelines.  Worse, there is always the chance the write
> > > will hang due to a bad filesystem.  We've seen this cause the JVM to
> > > hang indefinitely, forcing us to cycle the process.
> > >
> > > Unless we find a solution that allows GC to continue while doing this
> > > I/O, we will convert all the code to C++.  While that might solve our
> > > timeline for that particular process, we have many less
> > > performance-critical processes that use our JNI FileChannel libraries
> > > that would hang if a filesystem goes bad.
> > >
> > > We've tweaked the file system device timeouts down to a minimum, but
> > > they are still very high (on the order of several seconds to minutes).
> > > It would be nice if the JVM had a similar timeout for pausing threads,
> > > i.e., where the pause times out after X milliseconds.  We'd be willing
> > > to accept a larger heap size and postpone GC in the hopes that the
> > > next time GC ran, we wouldn't be in the middle of a non-interruptable
> > > system call.
> > >
> > > The only solution being batted around here is pushing the system calls
> > > out of Java threads and into native threads.  The JNI call would push
> > > the info for the I/O call onto a native C++ queue where a small number
> > > of native threads (3?) would pull the data off the queue and perform
> > > the actual system call.  The trick is finding an implementation where
> > > the Java thread blocked waiting on a response from the native thread is
> > > interruptible.  All this assumes GC doesn't try to pause native
> > > threads.  We thought about using pthreads, but were concerned about its
> > > signal interaction with the JVM.  So, we're leaning towards using pipes
> > > to push data from one thread to another.
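
The shape of that hand-off can be sketched as follows.  This is only a
Python model of the idea, not the C++/JNI design itself: a pool of three
worker threads performs the blocking pwrite(), and the caller waits for the
result with a timeout so that it is never stuck for longer than it chooses.
The names io_pool, submit_write and write_with_deadline are invented for
the example, and os.pwrite is assumed to be available (Unix only).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# A small, fixed number of I/O threads, as suggested in the message ("3?").
io_pool = ThreadPoolExecutor(max_workers=3)


def submit_write(fd, data, offset):
    # Queue the request; one of the worker threads performs the system call.
    return io_pool.submit(os.pwrite, fd, data, offset)


def write_with_deadline(fd, data, offset, timeout_s):
    """Return the number of bytes written, or None if the deadline passed."""
    future = submit_write(fd, data, offset)
    try:
        return future.result(timeout=timeout_s)
    except TimeoutError:
        # The caller stops waiting; the write itself continues on the worker.
        return None
```

A caller with a 250 ms budget might use write_with_deadline(fd, buf, 0, 0.25)
and treat None as "still in flight".  Note that giving up on the wait does
not cancel the write, so the buffer must stay valid until the worker
finishes with it.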
> > >
> > > If you have any suggestions or advice, we are desperate for your
> > > wisdom.
> > >
> > > Thanks!
> > >
> > >
> > > Mark Maxey
> > > Raytheon, Garland
> > > 580/2/P22-1
> > > (972)205-5760
> > > Mark_R_Maxey at Raytheon.com
> > >