From brianfromoregon at gmail.com  Tue May  5 03:17:08 2015
From: brianfromoregon at gmail.com (Brian Harris)
Date: Mon, 4 May 2015 20:17:08 -0700
Subject: java8 metaspace issue
Message-ID: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>

Hi,

I find that this code crashes in 8u40 after getting up to about 900 when
run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with
-XX:MaxPermSize=10m it does not crash.

Is that expected? It seems similar to
https://bugs.openjdk.java.net/browse/JDK-8025635

Thanks,
Brian

// uses Guava's CacheBuilder
public class Main {

    public static void main(String[] args) throws Exception {
        Cache<Integer, Object> cache = CacheBuilder.newBuilder()
                .softValues()
                .build();

        for (int i = 0; i < 50_000; i++) {
            URL[] dummyUrls = {new URL("file:" + i + ".jar")};
            URLClassLoader cl = new URLClassLoader(dummyUrls,
Thread.currentThread().getContextClassLoader());
            Object proxy = Proxy.newProxyInstance(cl, new
Class[]{Foo.class}, new InvocationHandler() {
                @Override
                public Object invoke(Object proxy, Method method, Object[]
args) throws Throwable {
                    return null;
                }
            });
            cache.put(i, proxy);
            System.out.println(i);
        }
    }

    public interface Foo {
            void x();
    }
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150504/6ab35ce2/attachment.html>

From bengt.rutisson at oracle.com  Tue May  5 09:05:11 2015
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Tue, 05 May 2015 11:05:11 +0200
Subject: java8 metaspace issue
In-Reply-To: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>
References: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>
Message-ID: <554887C7.7080506@oracle.com>


Hi Brian,

On 2015-05-05 05:17, Brian Harris wrote:
> Hi,
>
> I find that this code crashes in 8u40 after getting up to about 900 
> when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with 
> -XX:MaxPermSize=10m it does not crash.

Thanks for providing the example program. For me it does not crash but 
if I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does 
it crash for you?

The OutOfMemoryError can be explained by the fact that when you run with 
-XX:MaxPermSize=10m there is some aligning going on and in the end you 
actually end up with a perm gen that is 20m large. Here's what I get 
when I use

$ java -XX:+PrintGCDetails  -XX:MaxPermSize=10m

Heap
  PSYoungGen      total 150528K, used 10363K [0x0000000758c80000, 
0x0000000763400000, 0x0000000800000000)
   eden space 129536K, 8% used 
[0x0000000758c80000,0x000000075969ed58,0x0000000760b00000)
   from space 20992K, 0% used 
[0x0000000761f80000,0x0000000761f80000,0x0000000763400000)
   to   space 20992K, 0% used 
[0x0000000760b00000,0x0000000760b00000,0x0000000761f80000)
  ParOldGen       total 342016K, used 0K [0x000000060a600000, 
0x000000061f400000, 0x0000000758c80000)
   object space 342016K, 0% used 
[0x000000060a600000,0x000000060a600000,0x000000061f400000)
  PSPermGen       total 20480K, used 3382K [0x0000000609200000, 
0x000000060a600000, 0x000000060a600000)
   object space 20480K, 16% used 
[0x0000000609200000,0x000000060954dae0,0x000000060a600000)

As you can see the perm gen is 20 m even though I specified 10m on the 
command line.

If I run your program with -XX:MaxMetaspaceSize=20m it passes and does 
not run out of memory.


There are no guarantees that you can always just replace MaxPermSize 
with MaxMetaspaceSize. Often it works, but sometimes you have to adjust 
the values. Especially at boundary cases as low as 10m.

Hths,
Bengt


>
> Is that expected? It seems similar to 
> https://bugs.openjdk.java.net/browse/JDK-8025635
>
> Thanks,
> Brian
>
> // uses Guava's CacheBuilder
> public class Main {
>     public static void main(String[] args) throws Exception {
>         Cache<Integer, Object> cache = CacheBuilder.newBuilder()
>                 .softValues()
>                 .build();
>         for (int i = 0; i < 50_000; i++) {
>             URL[] dummyUrls = {new URL("file:" + i + ".jar")};
>             URLClassLoader cl = new URLClassLoader(dummyUrls, 
> Thread.currentThread().getContextClassLoader());
>             Object proxy = Proxy.newProxyInstance(cl, new 
> Class[]{Foo.class}, new InvocationHandler() {
>                 @Override
>                 public Object invoke(Object proxy, Method method, 
> Object[] args) throws Throwable {
>                     return null;
>                 }
>             });
>             cache.put(i, proxy);
>             System.out.println(i);
>         }
>     }
>     public interface Foo {
>             void x();
>     }
> }
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150505/38ec9410/attachment.html>

From brianfromoregon at gmail.com  Fri May  1 21:10:05 2015
From: brianfromoregon at gmail.com (Brian Harris)
Date: Fri, 1 May 2015 14:10:05 -0700
Subject: java8 metaspace issue
Message-ID: <CAFtUM9YdNDxDGqDo3Ge9v9zjoPWV6zkv2D2P17dLUEOyO0fjwA@mail.gmail.com>

Hi,

I find that this code crashes in 8u40 after getting up to about 900 when
run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with
-XX:MaxPermSize=10m it does not crash.

Is that expected? It seems similar to
https://bugs.openjdk.java.net/browse/JDK-8025635

Thanks,
Brian

// uses Guava's CacheBuilder
public class Main {

    public static void main(String[] args) throws Exception {
        Cache<Integer, Object> cache = CacheBuilder.newBuilder()
                .softValues()
                .build();

        for (int i = 0; i < 50_000; i++) {
            URL[] dummyUrls = {new URL("file:" + i + ".jar")};
            URLClassLoader cl = new URLClassLoader(dummyUrls,
Thread.currentThread().getContextClassLoader());
            Object proxy = Proxy.newProxyInstance(cl, new
Class[]{Foo.class}, new InvocationHandler() {
                @Override
                public Object invoke(Object proxy, Method method, Object[]
args) throws Throwable {
                    return null;
                }
            });
            cache.put(i, proxy);
            System.out.println(i);
        }
    }

    public interface Foo {
            void x();
    }
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150501/ea8138a5/attachment.html>

From simone.bordet at gmail.com  Fri May  8 15:11:00 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Fri, 8 May 2015 17:11:00 +0200
Subject: G1, Remembered Sets & Refinement
Message-ID: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>

Hi,

I would like to ask some clarification about remembered sets (RS),
buffers and refinement in G1.

My understanding is that G1 installs a write barrier to record old to
young pointers.
Let's assume that A is the object in old generation, and B is the
object in young generation.

I understand that when the barrier triggers, the card correspondent to
the place where the A resides is marked.
This card is then enqueued into a queue (the dirty card queue).

I understand that until the number of entries in the dirty card queue
does not enter the yellow zone, then nothing is done.
When the yellow zone is entered, a refinement thread is started to
poll items out of the dirty card queue and update the RS for the young
region.

I understand that, when a young GC happens, the refinement threads are
stopped (if running), and "Update RS" phase takes care of processing
the dirty card queue.

Provided my understanding is correct, what is the meaning of the word
"buffer" in this scenario ?
PrintGCDetails prints out a "Processed Buffers" subphase for "Update
RS", but what is a "buffer" ?

Another question: given that the write barrier knows exactly A's oop,
what is the reason for card marking, rather than just recording the
oop ?

Last question: would setting the yellow zone to zero always reduce the
"Update RS" to (almost) zero ? What problem could this setting
possibly generate ?

Thanks !

-- 
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From yiyeguhu at gmail.com  Fri May  8 22:33:44 2015
From: yiyeguhu at gmail.com (Tao Mao)
Date: Fri, 8 May 2015 15:33:44 -0700
Subject: How to find classes FinalReference references to?
Message-ID: <CANrGW1zRAtfU06ALiU4N8miFbu4noV+29uYb7OQ6nq9Jj=bFmA@mail.gmail.com>

Hi,

I find one of our applications using G1GC takes a comparably long time to
do [Ref Proc]. Tried ParallelRefProcEnabled, with no noticeable
improvement. Turning on PrintReferenceGC further finds all of processed
references are FinalReference. We want to know which code is using a
finalizer at this point. Our own code base does not use finalize() at all.
So, I suspect finalize() is being used by some external Java libraries but
they are not easy to find.

Is there any way to find and profile classes/objects FinalReference
references to?

(I guess I'm still able to hack into OpenJDK code but I'd rather not do
that :)

Thanks.
Tao
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150508/1ed8ba4b/attachment.html>

From vitalyd at gmail.com  Fri May  8 22:43:05 2015
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 8 May 2015 18:43:05 -0400
Subject: How to find classes FinalReference references to?
In-Reply-To: <CANrGW1zRAtfU06ALiU4N8miFbu4noV+29uYb7OQ6nq9Jj=bFmA@mail.gmail.com>
References: <CANrGW1zRAtfU06ALiU4N8miFbu4noV+29uYb7OQ6nq9Jj=bFmA@mail.gmail.com>
Message-ID: <CAHjP37H_+XuQmeT1u6Fr=C9Xyoh4ARR27Z8Le+HL9Lq=HqU6vg@mail.gmail.com>

Hi Tao,

Have you tried taking a heap dump (jmap) and then using a heap
viewer/analyzer, such as MAT <http://eclipse.org/mat/>? You should be able
to find instances of FinalReference, and what they point at.

HTH

On Fri, May 8, 2015 at 6:33 PM, Tao Mao <yiyeguhu at gmail.com> wrote:

> Hi,
>
> I find one of our applications using G1GC takes a comparably long time to
> do [Ref Proc]. Tried ParallelRefProcEnabled, with no noticeable
> improvement. Turning on PrintReferenceGC further finds all of processed
> references are FinalReference. We want to know which code is using a
> finalizer at this point. Our own code base does not use finalize() at all.
> So, I suspect finalize() is being used by some external Java libraries but
> they are not easy to find.
>
> Is there any way to find and profile classes/objects FinalReference
> references to?
>
> (I guess I'm still able to hack into OpenJDK code but I'd rather not do
> that :)
>
> Thanks.
> Tao
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150508/430584ac/attachment-0001.html>

From yu.zhang at oracle.com  Mon May 11 18:46:58 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 11 May 2015 11:46:58 -0700
Subject: G1, Remembered Sets & Refinement
In-Reply-To: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>
References: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>
Message-ID: <5550F922.8040703@oracle.com>

Simon,

I will try to answer your questions.

Thanks,
Jenny

On 5/8/2015 8:11 AM, Simone Bordet wrote:
> Hi,
>
> I would like to ask some clarification about remembered sets (RS),
> buffers and refinement in G1.
>
> My understanding is that G1 installs a write barrier to record old to
> young pointers.
> Let's assume that A is the object in old generation, and B is the
> object in young generation.
Yes.
>
> I understand that when the barrier triggers, the card correspondent to
> the place where the A resides is marked.
> This card is then enqueued into a queue (the dirty card queue).
yes
>
> I understand that until the number of entries in the dirty card queue
> does not enter the yellow zone, then nothing is done.
> When the yellow zone is entered, a refinement thread is started to
> poll items out of the dirty card queue and update the RS for the young
> region.
yes. The number of refinement threads activated is decided by 
G1ConcRefinementThresholdStep.
>
> I understand that, when a young GC happens, the refinement threads are
> stopped (if running), and "Update RS" phase takes care of processing
> the dirty card queue.
yes.
>
> Provided my understanding is correct, what is the meaning of the word
> "buffer" in this scenario ?
> PrintGCDetails prints out a "Processed Buffers" subphase for "Update
> RS", but what is a "buffer" ?
as you mentioned, the buffer is a set of the dirty card 
queues(DirtyCardQueueSet).  The dirty cards are processed by concurrent 
refinement threads or at STW phase( update RS).
>
> Another question: given that the write barrier knows exactly A's oop,
> what is the reason for card marking, rather than just recording the
> oop ?
I hope others can chime in on this.  My guess it is related to memory 
footprint and performance.
>
> Last question: would setting the yellow zone to zero always reduce the
> "Update RS" to (almost) zero ? What problem could this setting
> possibly generate ?
Probably not.  It will push more work to concurrent refinement threads, 
but still could leave some work for STW phase.
>
> Thanks !
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150511/2c205cc0/attachment.html>

From simone.bordet at gmail.com  Mon May 11 19:15:18 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Mon, 11 May 2015 21:15:18 +0200
Subject: G1, Remembered Sets & Refinement
In-Reply-To: <5550F922.8040703@oracle.com>
References: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>
	<5550F922.8040703@oracle.com>
Message-ID: <CAFWmRJ2AcAR4rY+0uZH-+aFODFPP-0PE-dE+07roqFpwxUJXRQ@mail.gmail.com>

Jenny,

On Mon, May 11, 2015 at 8:46 PM, Yu Zhang <yu.zhang at oracle.com> wrote:
> Simon,
>
> I will try to answer your questions.

Thank you for your answers !

-- 
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From thomas.schatzl at oracle.com  Tue May 12 08:06:48 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 May 2015 10:06:48 +0200
Subject: G1, Remembered Sets & Refinement
In-Reply-To: <5550F922.8040703@oracle.com>
References: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>
	<5550F922.8040703@oracle.com>
Message-ID: <1431418008.3356.6.camel@oracle.com>

Hi,

On Mon, 2015-05-11 at 11:46 -0700, Yu Zhang wrote:
> Simon,
> 
> I will try to answer your questions.  
> Thanks,
> Jenny
> On 5/8/2015 8:11 AM, Simone Bordet wrote:
> 
> > Hi,
> > 
> > I would like to ask some clarification about remembered sets (RS),
> > buffers and refinement in G1.
> > 
> > My understanding is that G1 installs a write barrier to record old to
> > young pointers.

Actually all inter-region pointers, where old->young are special cases.

> > Let's assume that A is the object in old generation, and B is the
> > object in young generation.
> Yes.
> > 
> > I understand that when the barrier triggers, the card correspondent to
> > the place where the A resides is marked.
> > This card is then enqueued into a queue (the dirty card queue).
> yes
> > 
> > I understand that until the number of entries in the dirty card queue
> > does not enter the yellow zone, then nothing is done.
> > When the yellow zone is entered, a refinement thread is started to
> > poll items out of the dirty card queue and update the RS for the young
> > region.
> yes. The number of refinement threads activated is decided by
> G1ConcRefinementThresholdStep.

Actually refinement starts at the green threshold, until all refinement
threads are running at the yellow one. At the red threshold, mutator
threads start helping.

> > I understand that, when a young GC happens, the refinement threads are
> > stopped (if running), and "Update RS" phase takes care of processing
> > the dirty card queue.
> yes.
> > 
> > Provided my understanding is correct, what is the meaning of the word
> > "buffer" in this scenario ?
> > PrintGCDetails prints out a "Processed Buffers" subphase for "Update
> > RS", but what is a "buffer" ?
> as you mentioned, the buffer is a set of the dirty card
> queues(DirtyCardQueueSet).  The dirty cards are processed by
> concurrent refinement threads or at STW phase( update RS).
> > 
> > Another question: given that the write barrier knows exactly A's oop,
> > what is the reason for card marking, rather than just recording the
> > oop ?
> I hope others can chime in on this.  My guess it is related to memory
> footprint and performance.

- memory usage: a card covers a set of references which are often
changed together.
- performance: while a card is marked, that card is not re-enqueued
again. This kind of duplicate detection is simple using cards (or any
range of memory backed by an array), while hard otherwise.

> > Last question: would setting the yellow zone to zero always reduce the
> > "Update RS" to (almost) zero ? What problem could this setting
> > possibly generate ?
> Probably not.  It will push more work to concurrent refinement
> threads, but still could leave some work for STW phase.

- there will always be some work remaining to be done during the stw
pause
- this will effectively disable duplicate detection leading to much
higher cpu-usage as cards are more frequently re-processed. Often it is
actually advantageous to keep the cards longer in the queue.

Thanks,
  Thomas


From thomas.schatzl at oracle.com  Tue May 12 09:05:28 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 May 2015 11:05:28 +0200
Subject: G1, Remembered Sets & Refinement
In-Reply-To: <1431418008.3356.6.camel@oracle.com>
References: <CAFWmRJ0BQQvgjPZqzfZxGtDAtr8C=DGUobaYtUmBB5Nbj_Ys7w@mail.gmail.com>
	<5550F922.8040703@oracle.com> <1431418008.3356.6.camel@oracle.com>
Message-ID: <1431421528.3356.28.camel@oracle.com>

Hi all,

On Tue, 2015-05-12 at 10:06 +0200, Thomas Schatzl wrote:
> Hi,
> 
> On Mon, 2015-05-11 at 11:46 -0700, Yu Zhang wrote:
> > Simon,
> > 
> > I will try to answer your questions.  
> > Thanks,
> > Jenny
> > On 5/8/2015 8:11 AM, Simone Bordet wrote:
> > 
> > > Hi,
> > > 
> > > I would like to ask some clarification about remembered sets (RS),
> > > buffers and refinement in G1.
> > > 
> > > My understanding is that G1 installs a write barrier to record old to
> > > young pointers.
> 
> Actually all inter-region pointers, where old->young are special cases.

  just to clear up any misunderstandings: the barrier is installed for
all writes (except ones that the compiler can prove to be uninteresting
as below, I think only guaranteed NULL-writes).

Summarizing that, G1 is interested in any old->whatever references that

- are non-NULL
- cross a region
- originate from old

(in that order btw)

For refinement, the corresponding card also needs to be non-dirty to be
enqueued, otherwise it knows that the card is already somewhere in some
queue and will eventually be looked at anyway.

Thanks,
  Thomas


From brianfromoregon at gmail.com  Thu May 14 23:35:04 2015
From: brianfromoregon at gmail.com (Brian Harris)
Date: Thu, 14 May 2015 16:35:04 -0700
Subject: java8 metaspace issue
In-Reply-To: <554887C7.7080506@oracle.com>
References: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>
	<554887C7.7080506@oracle.com>
Message-ID: <CAFtUM9YnE7xNxtx_ERRtgen12AuR41E0S=AbBt6zNEe6OkkWvw@mail.gmail.com>

Yes I should have said OOME instead of 'crash'. Indeed when setting
-XX:MaxMetaspaceSize=20m the program does not throw OOME.

Appears to be a boundary case jvm bug that this will throw OOME when
-XX:MaxMetaspaceSize=10m
after going through the loop 890 times. Otherwise, how else can the OOME be
explained given we're using soft refs?

On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson <bengt.rutisson at oracle.com>
wrote:

>
> Hi Brian,
>
> On 2015-05-05 05:17, Brian Harris wrote:
>
> Hi,
>
>  I find that this code crashes in 8u40 after getting up to about 900 when
> run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with
> -XX:MaxPermSize=10m it does not crash.
>
>
> Thanks for providing the example program. For me it does not crash but if
> I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it
> crash for you?
>
> The OutOfMemoryError can be explained by the fact that when you run with
> -XX:MaxPermSize=10m there is some aligning going on and in the end you
> actually end up with a perm gen that is 20m large. Here's what I get when I
> use
>
> $ java -XX:+PrintGCDetails  -XX:MaxPermSize=10m
>
> Heap
>  PSYoungGen      total 150528K, used 10363K [0x0000000758c80000,
> 0x0000000763400000, 0x0000000800000000)
>   eden space 129536K, 8% used
> [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000)
>   from space 20992K, 0% used
> [0x0000000761f80000,0x0000000761f80000,0x0000000763400000)
>   to   space 20992K, 0% used
> [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000)
>  ParOldGen       total 342016K, used 0K [0x000000060a600000,
> 0x000000061f400000, 0x0000000758c80000)
>   object space 342016K, 0% used
> [0x000000060a600000,0x000000060a600000,0x000000061f400000)
>  PSPermGen       total 20480K, used 3382K [0x0000000609200000,
> 0x000000060a600000, 0x000000060a600000)
>   object space 20480K, 16% used
> [0x0000000609200000,0x000000060954dae0,0x000000060a600000)
>
> As you can see the perm gen is 20 m even though I specified 10m on the
> command line.
>
> If I run your program with -XX:MaxMetaspaceSize=20m it passes and does not
> run out of memory.
>
>
> There are no guarantees that you can always just replace MaxPermSize with
> MaxMetaspaceSize. Often it works, but sometimes you have to adjust the
> values. Especially at boundary cases as low as 10m.
>
> Hths,
> Bengt
>
>
>
>  Is that expected? It seems similar to
> https://bugs.openjdk.java.net/browse/JDK-8025635
>
>  Thanks,
> Brian
>
>  // uses Guava's CacheBuilder
>  public class Main {
>
>     public static void main(String[] args) throws Exception {
>         Cache<Integer, Object> cache = CacheBuilder.newBuilder()
>                 .softValues()
>                 .build();
>
>         for (int i = 0; i < 50_000; i++) {
>             URL[] dummyUrls = {new URL("file:" + i + ".jar")};
>             URLClassLoader cl = new URLClassLoader(dummyUrls,
> Thread.currentThread().getContextClassLoader());
>             Object proxy = Proxy.newProxyInstance(cl, new
> Class[]{Foo.class}, new InvocationHandler() {
>                 @Override
>                 public Object invoke(Object proxy, Method method, Object[]
> args) throws Throwable {
>                     return null;
>                 }
>             });
>             cache.put(i, proxy);
>             System.out.println(i);
>         }
>     }
>
>     public interface Foo {
>             void x();
>     }
> }
>
>
> _______________________________________________
> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150514/119147ee/attachment.html>

From ecki at zusammenkunft.net  Fri May 15 00:41:29 2015
From: ecki at zusammenkunft.net (Bernd)
Date: Fri, 15 May 2015 02:41:29 +0200
Subject: java8 metaspace issue
In-Reply-To: <CAFtUM9YnE7xNxtx_ERRtgen12AuR41E0S=AbBt6zNEe6OkkWvw@mail.gmail.com>
References: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>
	<554887C7.7080506@oracle.com>
	<CAFtUM9YnE7xNxtx_ERRtgen12AuR41E0S=AbBt6zNEe6OkkWvw@mail.gmail.com>
Message-ID: <CABOR3+yOXS15-k603TTv+gjiQk3VyBbsx5qw8cLeYrVwLH9sTw@mail.gmail.com>

You can try to turn on tracing of inlining and compilation. The JITed code
can need space in the meta generation and can be caused by repeating code
till it becomes hot enough. And 10mb is really small, anyway.

Gruss
Bernd
Am 15.05.2015 02:35 schrieb "Brian Harris" <brianfromoregon at gmail.com>:

> Yes I should have said OOME instead of 'crash'. Indeed when setting
> -XX:MaxMetaspaceSize=20m the program does not throw OOME.
>
> Appears to be a boundary case jvm bug that this will throw OOME when -XX:MaxMetaspaceSize=10m
> after going through the loop 890 times. Otherwise, how else can the OOME be
> explained given we're using soft refs?
>
> On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson <bengt.rutisson at oracle.com>
> wrote:
>
>>
>> Hi Brian,
>>
>> On 2015-05-05 05:17, Brian Harris wrote:
>>
>> Hi,
>>
>>  I find that this code crashes in 8u40 after getting up to about 900
>> when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with
>> -XX:MaxPermSize=10m it does not crash.
>>
>>
>> Thanks for providing the example program. For me it does not crash but if
>> I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it
>> crash for you?
>>
>> The OutOfMemoryError can be explained by the fact that when you run with
>> -XX:MaxPermSize=10m there is some aligning going on and in the end you
>> actually end up with a perm gen that is 20m large. Here's what I get when I
>> use
>>
>> $ java -XX:+PrintGCDetails  -XX:MaxPermSize=10m
>>
>> Heap
>>  PSYoungGen      total 150528K, used 10363K [0x0000000758c80000,
>> 0x0000000763400000, 0x0000000800000000)
>>   eden space 129536K, 8% used
>> [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000)
>>   from space 20992K, 0% used
>> [0x0000000761f80000,0x0000000761f80000,0x0000000763400000)
>>   to   space 20992K, 0% used
>> [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000)
>>  ParOldGen       total 342016K, used 0K [0x000000060a600000,
>> 0x000000061f400000, 0x0000000758c80000)
>>   object space 342016K, 0% used
>> [0x000000060a600000,0x000000060a600000,0x000000061f400000)
>>  PSPermGen       total 20480K, used 3382K [0x0000000609200000,
>> 0x000000060a600000, 0x000000060a600000)
>>   object space 20480K, 16% used
>> [0x0000000609200000,0x000000060954dae0,0x000000060a600000)
>>
>> As you can see the perm gen is 20 m even though I specified 10m on the
>> command line.
>>
>> If I run your program with -XX:MaxMetaspaceSize=20m it passes and does
>> not run out of memory.
>>
>>
>> There are no guarantees that you can always just replace MaxPermSize with
>> MaxMetaspaceSize. Often it works, but sometimes you have to adjust the
>> values. Especially at boundary cases as low as 10m.
>>
>> Hths,
>> Bengt
>>
>>
>>
>>  Is that expected? It seems similar to
>> https://bugs.openjdk.java.net/browse/JDK-8025635
>>
>>  Thanks,
>> Brian
>>
>>  // uses Guava's CacheBuilder
>>  public class Main {
>>
>>     public static void main(String[] args) throws Exception {
>>         Cache<Integer, Object> cache = CacheBuilder.newBuilder()
>>                 .softValues()
>>                 .build();
>>
>>         for (int i = 0; i < 50_000; i++) {
>>             URL[] dummyUrls = {new URL("file:" + i + ".jar")};
>>             URLClassLoader cl = new URLClassLoader(dummyUrls,
>> Thread.currentThread().getContextClassLoader());
>>             Object proxy = Proxy.newProxyInstance(cl, new
>> Class[]{Foo.class}, new InvocationHandler() {
>>                 @Override
>>                 public Object invoke(Object proxy, Method method,
>> Object[] args) throws Throwable {
>>                     return null;
>>                 }
>>             });
>>             cache.put(i, proxy);
>>             System.out.println(i);
>>         }
>>     }
>>
>>     public interface Foo {
>>             void x();
>>     }
>> }
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150515/c6b60e9e/attachment-0001.html>

From brianfromoregon at gmail.com  Fri May 15 15:40:07 2015
From: brianfromoregon at gmail.com (Brian Harris)
Date: Fri, 15 May 2015 08:40:07 -0700
Subject: java8 metaspace issue
In-Reply-To: <CABOR3+yOXS15-k603TTv+gjiQk3VyBbsx5qw8cLeYrVwLH9sTw@mail.gmail.com>
References: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>
	<554887C7.7080506@oracle.com>
	<CAFtUM9YnE7xNxtx_ERRtgen12AuR41E0S=AbBt6zNEe6OkkWvw@mail.gmail.com>
	<CABOR3+yOXS15-k603TTv+gjiQk3VyBbsx5qw8cLeYrVwLH9sTw@mail.gmail.com>
Message-ID: <CAFtUM9Z0nOVu7+=65zML+ULVZxNGA86Dnv9WdpO+2bN0-z2E5w@mail.gmail.com>

This toy example was meant to be a reproduction of real OOME we're getting
in JVM where MaxMetaspaceSize is not set and heap dumps suggest uncleared
soft references to objects in metaspace as being the cause.
If you're right and this really only happens when metaspace is capped super
low, then it's a false reproduction. But if there's a deeper problem and
the metaspace cap simply reveals in the toy example the same underlying
issue we're hitting in prod, I'd hope that could be investigated.

On Thu, May 14, 2015 at 5:41 PM, Bernd <ecki at zusammenkunft.net> wrote:

> You can try to turn on tracing of inlining and compilation. The JITed code
> can need space in the meta generation and can be caused by repeating code
> till it becomes hot enough. And 10mb is really small, anyway.
>
> Gruss
> Bernd
> Am 15.05.2015 02:35 schrieb "Brian Harris" <brianfromoregon at gmail.com>:
>
>> Yes I should have said OOME instead of 'crash'. Indeed when setting
>> -XX:MaxMetaspaceSize=20m the program does not throw OOME.
>>
>> Appears to be a boundary case jvm bug that this will throw OOME when -XX:MaxMetaspaceSize=10m
>> after going through the loop 890 times. Otherwise, how else can the OOME be
>> explained given we're using soft refs?
>>
>> On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson <bengt.rutisson at oracle.com
>> > wrote:
>>
>>>
>>> Hi Brian,
>>>
>>> On 2015-05-05 05:17, Brian Harris wrote:
>>>
>>> Hi,
>>>
>>>  I find that this code crashes in 8u40 after getting up to about 900
>>> when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with
>>> -XX:MaxPermSize=10m it does not crash.
>>>
>>>
>>> Thanks for providing the example program. For me it does not crash but
>>> if I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it
>>> crash for you?
>>>
>>> The OutOfMemoryError can be explained by the fact that when you run with
>>> -XX:MaxPermSize=10m there is some aligning going on and in the end you
>>> actually end up with a perm gen that is 20m large. Here's what I get when I
>>> use
>>>
>>> $ java -XX:+PrintGCDetails  -XX:MaxPermSize=10m
>>>
>>> Heap
>>>  PSYoungGen      total 150528K, used 10363K [0x0000000758c80000,
>>> 0x0000000763400000, 0x0000000800000000)
>>>   eden space 129536K, 8% used
>>> [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000)
>>>   from space 20992K, 0% used
>>> [0x0000000761f80000,0x0000000761f80000,0x0000000763400000)
>>>   to   space 20992K, 0% used
>>> [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000)
>>>  ParOldGen       total 342016K, used 0K [0x000000060a600000,
>>> 0x000000061f400000, 0x0000000758c80000)
>>>   object space 342016K, 0% used
>>> [0x000000060a600000,0x000000060a600000,0x000000061f400000)
>>>  PSPermGen       total 20480K, used 3382K [0x0000000609200000,
>>> 0x000000060a600000, 0x000000060a600000)
>>>   object space 20480K, 16% used
>>> [0x0000000609200000,0x000000060954dae0,0x000000060a600000)
>>>
>>> As you can see the perm gen is 20 m even though I specified 10m on the
>>> command line.
>>>
>>> If I run your program with -XX:MaxMetaspaceSize=20m it passes and does
>>> not run out of memory.
>>>
>>>
>>> There are no guarantees that you can always just replace MaxPermSize
>>> with MaxMetaspaceSize. Often it works, but sometimes you have to adjust the
>>> values. Especially at boundary cases as low as 10m.
>>>
>>> Hths,
>>> Bengt
>>>
>>>
>>>
>>>  Is that expected? It seems similar to
>>> https://bugs.openjdk.java.net/browse/JDK-8025635
>>>
>>>  Thanks,
>>> Brian
>>>
>>>  // uses Guava's CacheBuilder
>>>  public class Main {
>>>
>>>     public static void main(String[] args) throws Exception {
>>>         Cache<Integer, Object> cache = CacheBuilder.newBuilder()
>>>                 .softValues()
>>>                 .build();
>>>
>>>         for (int i = 0; i < 50_000; i++) {
>>>             URL[] dummyUrls = {new URL("file:" + i + ".jar")};
>>>             URLClassLoader cl = new URLClassLoader(dummyUrls,
>>> Thread.currentThread().getContextClassLoader());
>>>             Object proxy = Proxy.newProxyInstance(cl, new
>>> Class[]{Foo.class}, new InvocationHandler() {
>>>                 @Override
>>>                 public Object invoke(Object proxy, Method method,
>>> Object[] args) throws Throwable {
>>>                     return null;
>>>                 }
>>>             });
>>>             cache.put(i, proxy);
>>>             System.out.println(i);
>>>         }
>>>     }
>>>
>>>     public interface Foo {
>>>             void x();
>>>     }
>>> }
>>>
>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>>
>>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150515/ff362994/attachment.html>

From jon.masamitsu at oracle.com  Fri May 15 18:20:47 2015
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 15 May 2015 11:20:47 -0700
Subject: java8 metaspace issue
In-Reply-To: <CAFtUM9Z0nOVu7+=65zML+ULVZxNGA86Dnv9WdpO+2bN0-z2E5w@mail.gmail.com>
References: <CAFtUM9bn6On41pYdt+BCiKUG5HbL-fGPpiEjo-n1nOhxN7B2hw@mail.gmail.com>	<554887C7.7080506@oracle.com>	<CAFtUM9YnE7xNxtx_ERRtgen12AuR41E0S=AbBt6zNEe6OkkWvw@mail.gmail.com>	<CABOR3+yOXS15-k603TTv+gjiQk3VyBbsx5qw8cLeYrVwLH9sTw@mail.gmail.com>
	<CAFtUM9Z0nOVu7+=65zML+ULVZxNGA86Dnv9WdpO+2bN0-z2E5w@mail.gmail.com>
Message-ID: <555638FF.6010705@oracle.com>

Brian,

In your first mail you mention 7u60 and 8u40.  With 7u60 you must be setting
MaxPermSize.  What value are you using and you are not seeing any problems.
A sample GC log with -XX:+PrintGCDetails and -XX:+PrintHeapAtGC would
help us understand the 7u60 behavior.

With 8u40 you are not setting MaxMetaspaceSize yet you are seeing an OOME
with Metaspace as the cause?  A GC log would be most helpful there also

Jon

On 5/15/2015 8:40 AM, Brian Harris wrote:
> This toy example was meant to be a reproduction of real OOME we're 
> getting in JVM where MaxMetaspaceSize is not set and heap dumps 
> suggest uncleared soft references to objects in metaspace as being the 
> cause.
> If you're right and this really only happens when metaspace is capped 
> super low, then it's a false reproduction. But if there's a deeper 
> problem and the metaspace cap simply reveals in the toy example the 
> same underlying issue we're hitting in prod, I'd hope that could be 
> investigated.
>
> On Thu, May 14, 2015 at 5:41 PM, Bernd <ecki at zusammenkunft.net 
> <mailto:ecki at zusammenkunft.net>> wrote:
>
>     You can try to turn on tracing of inlining and compilation. The
>     JITed code can need space in the meta generation and can be caused
>     by repeating code till it becomes hot enough. And 10mb is really
>     small, anyway.
>
>     Gruss
>     Bernd
>
>     Am 15.05.2015 02:35 schrieb "Brian Harris"
>     <brianfromoregon at gmail.com <mailto:brianfromoregon at gmail.com>>:
>
>         Yes I should have said OOME instead of 'crash'. Indeed when
>         setting -XX:MaxMetaspaceSize=20m the program does not throw OOME.
>
>         Appears to be a boundary case jvm bug that this will throw
>         OOME when -XX:MaxMetaspaceSize=10m after going through the
>         loop 890 times. Otherwise, how else can the OOME be explained
>         given we're using soft refs?
>
>         On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson
>         <bengt.rutisson at oracle.com <mailto:bengt.rutisson at oracle.com>>
>         wrote:
>
>
>             Hi Brian,
>
>             On 2015-05-05 05:17, Brian Harris wrote:
>>             Hi,
>>
>>             I find that this code crashes in 8u40 after getting up to
>>             about 900 when run with -XX:MaxMetaspaceSize=10m. When
>>             run in 7u60 with -XX:MaxPermSize=10m it does not crash.
>
>             Thanks for providing the example program. For me it does
>             not crash but if I run with -XX:MaxMetaspaceSize=10m I get
>             an OutOfMemoryError. Does it crash for you?
>
>             The OutOfMemoryError can be explained by the fact that
>             when you run with -XX:MaxPermSize=10m there is some
>             aligning going on and in the end you actually end up with
>             a perm gen that is 20m large. Here's what I get when I use
>
>             $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m
>
>             Heap
>              PSYoungGen      total 150528K, used 10363K
>             [0x0000000758c80000, 0x0000000763400000, 0x0000000800000000)
>               eden space 129536K, 8% used
>             [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000)
>               from space 20992K, 0% used
>             [0x0000000761f80000,0x0000000761f80000,0x0000000763400000)
>               to   space 20992K, 0% used
>             [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000)
>              ParOldGen       total 342016K, used 0K
>             [0x000000060a600000, 0x000000061f400000, 0x0000000758c80000)
>               object space 342016K, 0% used
>             [0x000000060a600000,0x000000060a600000,0x000000061f400000)
>              PSPermGen       total 20480K, used 3382K
>             [0x0000000609200000, 0x000000060a600000, 0x000000060a600000)
>               object space 20480K, 16% used
>             [0x0000000609200000,0x000000060954dae0,0x000000060a600000)
>
>             As you can see the perm gen is 20 m even though I
>             specified 10m on the command line.
>
>             If I run your program with -XX:MaxMetaspaceSize=20m it
>             passes and does not run out of memory.
>
>
>             There are no guarantees that you can always just replace
>             MaxPermSize with MaxMetaspaceSize. Often it works, but
>             sometimes you have to adjust the values. Especially at
>             boundary cases as low as 10m.
>
>             Hths,
>             Bengt
>
>
>>
>>             Is that expected? It seems similar to
>>             https://bugs.openjdk.java.net/browse/JDK-8025635
>>
>>             Thanks,
>>             Brian
>>
>>             // uses Guava's CacheBuilder
>>             public class Main {
>>                 public static void main(String[] args) throws Exception {
>>                     Cache<Integer, Object> cache =
>>             CacheBuilder.newBuilder()
>>                             .softValues()
>>                             .build();
>>                     for (int i = 0; i < 50_000; i++) {
>>                         URL[] dummyUrls = {new URL("file:" + i +
>>             ".jar")};
>>                         URLClassLoader cl = new
>>             URLClassLoader(dummyUrls,
>>             Thread.currentThread().getContextClassLoader());
>>                         Object proxy = Proxy.newProxyInstance(cl, new
>>             Class[]{Foo.class}, new InvocationHandler() {
>>                             @Override
>>                             public Object invoke(Object proxy, Method
>>             method, Object[] args) throws Throwable {
>>                                 return null;
>>                             }
>>                         });
>>                         cache.put(i, proxy);
>>             System.out.println(i);
>>                     }
>>                 }
>>                 public interface Foo {
>>                         void x();
>>                 }
>>             }
>>
>>
>>             _______________________________________________
>>             hotspot-gc-use mailing list
>>             hotspot-gc-use at openjdk.java.net  <mailto:hotspot-gc-use at openjdk.java.net>
>>             http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
>         _______________________________________________
>         hotspot-gc-use mailing list
>         hotspot-gc-use at openjdk.java.net
>         <mailto:hotspot-gc-use at openjdk.java.net>
>         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>     _______________________________________________
>     hotspot-gc-use mailing list
>     hotspot-gc-use at openjdk.java.net
>     <mailto:hotspot-gc-use at openjdk.java.net>
>     http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150515/d086e793/attachment.html>

From joyxiong at yahoo.com  Wed May 20 17:35:24 2015
From: joyxiong at yahoo.com (Joy Xiong)
Date: Wed, 20 May 2015 17:35:24 +0000 (UTC)
Subject: Long Reference Processing Time
Message-ID: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com>


Hi All,
I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application:-XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC?
/export/apps/jdk/JDK-1_8_0_5/bin/java

With G1, I observe long time processing references. The long reference processing time has two types:?1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references?2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS.
Detailed log record is shown as below:1. Processing soft reference takes long time. However, we only have 0 soft reference2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young)Desired survivor size 201326592 bytes, new threshold 15 (max 15)- age ? 1: ? ?6197672 bytes, ? ?6197672 total- age ? 2: ? ? 553864 bytes, ? ?6751536 total- age ? 3: ? ? 321216 bytes, ? ?7072752 total- age ? 4: ? ? 563120 bytes, ? ?7635872 total- age ? 5: ? ? 261920 bytes, ? ?7897792 total- age ? 6: ? ? 265768 bytes, ? ?8163560 total- age ? 7: ? ? 319856 bytes, ? ?8483416 total- age ? 8: ? ? 132328 bytes, ? ?8615744 total- age ? 9: ? ? 153768 bytes, ? ?8769512 total- age ?10: ? ? 194256 bytes, ? ?8963768 total- age ?11: ? ? ?64600 bytes, ? ?9028368 total- age ?12: ? ? 160208 bytes, ? ?9188576 total- age ?13: ? ? ?69376 bytes, ? ?9257952 total- age ?14: ? ? 151832 bytes, ? ?9409784 total- age ?15: ? ? 186920 bytes, ? ?9596704 total?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms]?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms]?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms]30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]? ?[Parallel Time: 14.4 ms, GC Workers: 22]? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5]? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1]? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8]? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181]? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3]? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3]? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0]? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7]? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2]? ?[Code Root Fixup: 0.2 ms]? ?[Code Root Migration: 0.1 ms]? ?[Clear CT: 1.0 ms]? ?[Other: 914.8 ms]? ? ? [Choose CSet: 0.0 ms]? ? ? [Ref Proc: 910.3 ms]? ? ? [Ref Enq: 0.5 ms]? ? ? [Free CSet: 1.7 ms]? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)]?[Times: user=0.29 sys=0.00, real=0.93 secs]
2. Processing weak reference takes long time93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs]? ?[Parallel Time: 33.1 ms, GC Workers: 22]? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7]? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9]? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2]? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3]? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173]? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0]? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9]? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3]? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5]? ?[Code Root Fixup: 0.2 ms]? ?[Code Root Migration: 0.1 ms]? ?[Clear CT: 0.5 ms]? ?[Other: 11.7 ms]? ? ? [Choose CSet: 0.0 ms]? ? ? [Ref Proc: 7.8 ms]? ? ? [Ref Enq: 0.5 ms]? ? ? [Free CSet: 0.8 ms]? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]?[Times: user=0.66 sys=0.00, real=0.04 secs]2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start]2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs]2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs]2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs]?[Times: user=15.10 sys=0.19, real=1.08 secs]
Appreciate your help,-Joy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/397e31ec/attachment-0001.html>

From poonam.bajaj at oracle.com  Wed May 20 18:17:58 2015
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Wed, 20 May 2015 11:17:58 -0700
Subject: Long Reference Processing Time
In-Reply-To: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com>
References: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <555CCFD6.50400@oracle.com>

Hello Joy,

Could you try running with the latest JDK8 update release (8u45). Looks 
like you are trying out G1 with 8u5. There have been many 
improvements/fixes in G1GC since 8u5. Please test with the latest 8u and 
let us know the results.

Thanks,
Poonam

On 5/20/2015 10:35 AM, Joy Xiong wrote:
> *
> *
> Hi All,
>
> I recently moved our application from CMS to G1 due to heap 
> fragmentation. Here are the JVM tunable used for the application:
> -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled 
> -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g 
> -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC
> /export/apps/jdk/JDK-1_8_0_5/bin/java
>
> With G1, I observe long time processing references. The long reference 
> processing time has two types:
> 1) Occur in Young GC phase. The processing time does not make sense to 
> me, as the majority time is spent on processing soft reference, whose 
> number is 0. Is there some hidden time contributing to processing soft 
> references?
> 2) Occur in the remark phase during the concurrent phase. Our 
> application has a large number of weak references, but I don't quite 
> understand why the processing time is much larger with G1 than with CMS.
>
> Detailed log record is shown as below:
> *1. Processing soft reference takes long time*. However, we only have 
> 0 soft reference
> 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation 
> Pause) (young)
> Desired survivor size 201326592 bytes, new threshold 15 (max 15)
> - age   1:    6197672 bytes,    6197672 total
> - age   2:     553864 bytes,    6751536 total
> - age   3:     321216 bytes,    7072752 total
> - age   4:     563120 bytes,    7635872 total
> - age   5:     261920 bytes,    7897792 total
> - age   6:     265768 bytes,    8163560 total
> - age   7:     319856 bytes,    8483416 total
> - age   8:     132328 bytes,    8615744 total
> - age   9:     153768 bytes,    8769512 total
> - age  10:     194256 bytes,    8963768 total
> - age  11:      64600 bytes,    9028368 total
> - age  12:     160208 bytes,    9188576 total
> - age  13:      69376 bytes,    9257952 total
> - age  14:     151832 bytes,    9409784 total
> - age  15:     186920 bytes,    9596704 total
>  30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, 
> _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 
> 17.67 ms, target pause time: 40.00 ms]
>  30271.429: [G1Ergonomics (CSet Construction) add young regions to 
> CSet, eden: 380 regions, survivors: 2 regions, predicted young region 
> time: 5.51 ms]
>  30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, 
> eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted 
> pause time: 27.83 ms, target pause time: 40.00 ms]
> 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: 
> [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 
> refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 
> secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]
>    [Parallel Time: 14.4 ms, GC Workers: 22]
>       [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 
> 30271429.9, Diff: 0.5]
>       [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 
> 6.4, Sum: 120.1]
>       [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8]
>          [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, 
> Sum: 1181]
>       [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
> 0.0, Sum: 0.1]
>       [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 
> 55.3]
>       [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 
> 7.3]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, 
> Sum: 2.0]
>       [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 
> 0.6, Sum: 300.7]
>       [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 
> 30271443.5, Diff: 0.2]
>    [Code Root Fixup: 0.2 ms]
>    [Code Root Migration: 0.1 ms]
>    [Clear CT: 1.0 ms]
>    [Other: 914.8 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 910.3 ms]
>       [Ref Enq: 0.5 ms]
>       [Free CSet: 1.7 ms]
>    [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 
> 4588.4M(5120.0M)->1551.4M(5120.0M)]
>  [Times: user=0.29 sys=0.00, real=0.93 secs]
>
> *2. Processing weak reference takes long time*
> 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: 
> [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 
> refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 
> secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs]
>  [Parallel Time: 33.1 ms, GC Workers: 22]
>   [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 
> 93967013.6, Diff: 0.7]
>   [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, 
> Sum: 122.9]
>   [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, 
> Sum: 25.2]
>   [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3]
>      [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 
> 1173]
>   [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]
>   [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, 
> Sum: 0.1]
>   [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 
> 463.0]
>   [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]
>   [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 
> 4.9]
>   [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, 
> Sum: 709.3]
>   [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 
> 93967045.8, Diff: 0.5]
>  [Code Root Fixup: 0.2 ms]
>  [Code Root Migration: 0.1 ms]
>    [Clear CT: 0.5 ms]
>    [Other: 11.7 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 7.8 ms]
>       [Ref Enq: 0.5 ms]
>       [Free CSet: 0.8 ms]
>    [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M 
> Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]
>  [Times: user=0.66 sys=0.00, real=0.04 secs]
> 2015-05-16T13:21:33.478+0000: 93967.057: [GC 
> concurrent-root-region-scan-start]
> 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which 
> application threads were stopped: 0.0652331 seconds
> 2015-05-16T13:21:33.487+0000: 93967.066: [GC 
> concurrent-root-region-scan-end, 0.0082516 secs]
> 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]
> 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 
> 0.4016735 secs]
> 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC 
> ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: 
> [*WeakReference, 1430199 refs, 0.7913479 secs*]93968.281: 
> [FinalReference, 367 refs, 0.0036350 secs]93968.285: 
> [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak 
> Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs]
>  [Times: user=15.10 sys=0.19, real=1.08 secs]
>
> Appreciate your help,
> -Joy
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/354d3838/attachment.html>

From yu.zhang at oracle.com  Wed May 20 18:25:40 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Wed, 20 May 2015 11:25:40 -0700
Subject: Long Reference Processing Time
In-Reply-To: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com>
References: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <555CD1A4.5010507@oracle.com>

Joy,

For the 1st one, there is a bug 
https://bugs.openjdk.java.net/browse/JDK-8076462
You can try to reduce the ParallelGCThreads.  There is not much work anyway.

Thanks,
Jenny

On 5/20/2015 10:35 AM, Joy Xiong wrote:
> *
> *
> Hi All,
>
> I recently moved our application from CMS to G1 due to heap 
> fragmentation. Here are the JVM tunable used for the application:
> -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled 
> -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g 
> -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC
> /export/apps/jdk/JDK-1_8_0_5/bin/java
>
> With G1, I observe long time processing references. The long reference 
> processing time has two types:
> 1) Occur in Young GC phase. The processing time does not make sense to 
> me, as the majority time is spent on processing soft reference, whose 
> number is 0. Is there some hidden time contributing to processing soft 
> references?
> 2) Occur in the remark phase during the concurrent phase. Our 
> application has a large number of weak references, but I don't quite 
> understand why the processing time is much larger with G1 than with CMS.
>
> Detailed log record is shown as below:
> *1. Processing soft reference takes long time*. However, we only have 
> 0 soft reference
> 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation 
> Pause) (young)
> Desired survivor size 201326592 bytes, new threshold 15 (max 15)
> - age   1:    6197672 bytes,    6197672 total
> - age   2:     553864 bytes,    6751536 total
> - age   3:     321216 bytes,    7072752 total
> - age   4:     563120 bytes,    7635872 total
> - age   5:     261920 bytes,    7897792 total
> - age   6:     265768 bytes,    8163560 total
> - age   7:     319856 bytes,    8483416 total
> - age   8:     132328 bytes,    8615744 total
> - age   9:     153768 bytes,    8769512 total
> - age  10:     194256 bytes,    8963768 total
> - age  11:      64600 bytes,    9028368 total
> - age  12:     160208 bytes,    9188576 total
> - age  13:      69376 bytes,    9257952 total
> - age  14:     151832 bytes,    9409784 total
> - age  15:     186920 bytes,    9596704 total
>  30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, 
> _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 
> 17.67 ms, target pause time: 40.00 ms]
>  30271.429: [G1Ergonomics (CSet Construction) add young regions to 
> CSet, eden: 380 regions, survivors: 2 regions, predicted young region 
> time: 5.51 ms]
>  30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, 
> eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted 
> pause time: 27.83 ms, target pause time: 40.00 ms]
> 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: 
> [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 
> refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 
> secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]
>    [Parallel Time: 14.4 ms, GC Workers: 22]
>       [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 
> 30271429.9, Diff: 0.5]
>       [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 
> 6.4, Sum: 120.1]
>       [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8]
>          [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, 
> Sum: 1181]
>       [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
> 0.0, Sum: 0.1]
>       [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 
> 55.3]
>       [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 
> 7.3]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, 
> Sum: 2.0]
>       [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 
> 0.6, Sum: 300.7]
>       [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 
> 30271443.5, Diff: 0.2]
>    [Code Root Fixup: 0.2 ms]
>    [Code Root Migration: 0.1 ms]
>    [Clear CT: 1.0 ms]
>    [Other: 914.8 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 910.3 ms]
>       [Ref Enq: 0.5 ms]
>       [Free CSet: 1.7 ms]
>    [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 
> 4588.4M(5120.0M)->1551.4M(5120.0M)]
>  [Times: user=0.29 sys=0.00, real=0.93 secs]
>
> *2. Processing weak reference takes long time*
> 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: 
> [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 
> refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 
> secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs]
>  [Parallel Time: 33.1 ms, GC Workers: 22]
>   [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 
> 93967013.6, Diff: 0.7]
>   [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, 
> Sum: 122.9]
>   [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, 
> Sum: 25.2]
>   [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3]
>      [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 
> 1173]
>   [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]
>   [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, 
> Sum: 0.1]
>   [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 
> 463.0]
>   [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]
>   [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 
> 4.9]
>   [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, 
> Sum: 709.3]
>   [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 
> 93967045.8, Diff: 0.5]
>  [Code Root Fixup: 0.2 ms]
>  [Code Root Migration: 0.1 ms]
>    [Clear CT: 0.5 ms]
>    [Other: 11.7 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 7.8 ms]
>       [Ref Enq: 0.5 ms]
>       [Free CSet: 0.8 ms]
>    [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M 
> Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]
>  [Times: user=0.66 sys=0.00, real=0.04 secs]
> 2015-05-16T13:21:33.478+0000: 93967.057: [GC 
> concurrent-root-region-scan-start]
> 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which 
> application threads were stopped: 0.0652331 seconds
> 2015-05-16T13:21:33.487+0000: 93967.066: [GC 
> concurrent-root-region-scan-end, 0.0082516 secs]
> 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]
> 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 
> 0.4016735 secs]
> 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC 
> ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: 
> [*WeakReference, 1430199 refs, 0.7913479 secs*]93968.281: 
> [FinalReference, 367 refs, 0.0036350 secs]93968.285: 
> [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak 
> Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs]
>  [Times: user=15.10 sys=0.19, real=1.08 secs]
>
> Appreciate your help,
> -Joy
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/192f73ad/attachment-0001.html>

From joyxiong at yahoo.com  Wed May 20 20:17:05 2015
From: joyxiong at yahoo.com (Joy Xiong)
Date: Wed, 20 May 2015 20:17:05 +0000 (UTC)
Subject: Long Reference Processing Time
In-Reply-To: <555CCFD6.50400@oracle.com>
References: <555CCFD6.50400@oracle.com>
Message-ID: <723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com>

Yu and Poonam,
Thank you for your quick response.?In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45.
thanks,-Joy 


     On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
   

  Hello Joy,
 
 Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results.
 
 Thanks,
 Poonam
 
 On 5/20/2015 10:35 AM, Joy Xiong wrote:
  
  
  Hi All, 
  I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC?
  /export/apps/jdk/JDK-1_8_0_5/bin/java
  
  With G1, I observe long time processing references. The long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. 
  Detailed log record is shown as below: 1. Processing soft reference takes long time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total  - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]  ? ?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs]  
   2. Processing weak reference takes long time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms]  ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] 
  Appreciate your help, -Joy  
   
  
 _______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/ad731021/attachment.html>

From poonam.bajaj at oracle.com  Wed May 20 21:26:43 2015
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Wed, 20 May 2015 14:26:43 -0700
Subject: Long Reference Processing Time
In-Reply-To: <723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com>
References: <555CCFD6.50400@oracle.com>
	<723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <555CFC13.10900@oracle.com>

Hello Joy,

8u40 is the latest update release that contains new enhancements and bug 
fixes, and 8u45 is the latest security release that includes security 
fixes on top of 8u40.

So, for your test run I think you can try with 8u40.

regards,
Poonam

On 5/20/2015 1:17 PM, Joy Xiong wrote:
> Yu and Poonam,
>
> Thank you for your quick response.
> In terms of JDK version, we have 8u40 available, so want to check with 
> you how 8u40 differs from 8u45.
>
> thanks,
> -Joy
>
>
>
> On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar 
> <poonam.bajaj at oracle.com> wrote:
>
>
> Hello Joy,
>
> Could you try running with the latest JDK8 update release (8u45). 
> Looks like you are trying out G1 with 8u5. There have been many 
> improvements/fixes in G1GC since 8u5. Please test with the latest 8u 
> and let us know the results.
>
> Thanks,
> Poonam
>
> On 5/20/2015 10:35 AM, Joy Xiong wrote:
>> *
>> *
>> Hi All,
>>
>> I recently moved our application from CMS to G1 due to heap 
>> fragmentation. Here are the JVM tunable used for the application:
>> -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled 
>> -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g 
>> -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC
>> /export/apps/jdk/JDK-1_8_0_5/bin/java
>>
>> With G1, I observe long time processing references. The long 
>> reference processing time has two types:
>> 1) Occur in Young GC phase. The processing time does not make sense 
>> to me, as the majority time is spent on processing soft reference, 
>> whose number is 0. Is there some hidden time contributing to 
>> processing soft references?
>> 2) Occur in the remark phase during the concurrent phase. Our 
>> application has a large number of weak references, but I don't quite 
>> understand why the processing time is much larger with G1 than with CMS.
>>
>> Detailed log record is shown as below:
>> *1. Processing soft reference takes long time*. However, we only have 
>> 0 soft reference
>> 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation 
>> Pause) (young)
>> Desired survivor size 201326592 bytes, new threshold 15 (max 15)
>> - age   1:    6197672 bytes,  6197672 total
>> - age   2:     553864 bytes,  6751536 total
>> - age   3:     321216 bytes,  7072752 total
>> - age   4:     563120 bytes,  7635872 total
>> - age   5:     261920 bytes,  7897792 total
>> - age   6:     265768 bytes,  8163560 total
>> - age   7:     319856 bytes,  8483416 total
>> - age   8:     132328 bytes,  8615744 total
>> - age   9:     153768 bytes,  8769512 total
>> - age  10:     194256 bytes,  8963768 total
>> - age  11:      64600 bytes,  9028368 total
>> - age  12:     160208 bytes,  9188576 total
>> - age  13:      69376 bytes,  9257952 total
>> - age  14:     151832 bytes,  9409784 total
>> - age  15:     186920 bytes,  9596704 total
>>  30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, 
>> _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 
>> 17.67 ms, target pause time: 40.00 ms]
>>  30271.429: [G1Ergonomics (CSet Construction) add young regions to 
>> CSet, eden: 380 regions, survivors: 2 regions, predicted young region 
>> time: 5.51 ms]
>>  30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, 
>> eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted 
>> pause time: 27.83 ms, target pause time: 40.00 ms]
>> 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: 
>> [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 
>> refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 
>> 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 
>> 0.9305765 secs]
>>    [Parallel Time: 14.4 ms, GC Workers: 22]
>>       [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 
>> 30271429.9, Diff: 0.5]
>>       [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 
>> 6.4, Sum: 120.1]
>>       [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 
>> 80.8]
>>          [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, 
>> Sum: 1181]
>>       [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]
>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
>> 0.0, Sum: 0.1]
>>       [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, 
>> Sum: 55.3]
>>       [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, 
>> Sum: 7.3]
>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, 
>> Sum: 2.0]
>>       [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 
>> 0.6, Sum: 300.7]
>>       [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 
>> 30271443.5, Diff: 0.2]
>>    [Code Root Fixup: 0.2 ms]
>>    [Code Root Migration: 0.1 ms]
>>    [Clear CT: 1.0 ms]
>>    [Other: 914.8 ms]
>>       [Choose CSet: 0.0 ms]
>>       [Ref Proc: 910.3 ms]
>>       [Ref Enq: 0.5 ms]
>>       [Free CSet: 1.7 ms]
>>    [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M 
>> Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)]
>>  [Times: user=0.29 sys=0.00, real=0.93 secs]
>>
>> *2. Processing weak reference takes long time*
>> 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: 
>> [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 
>> refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 
>> secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs]
>>    [Parallel Time: 33.1 ms, GC Workers: 22]
>>       [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 
>> 93967013.6, Diff: 0.7]
>>       [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 
>> 10.4, Sum: 122.9]
>>       [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 
>> 9.2, Sum: 25.2]
>>       [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 
>> 67.3]
>>          [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, 
>> Sum: 1173]
>>       [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]
>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
>> 0.0, Sum: 0.1]
>>       [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, 
>> Sum: 463.0]
>>       [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, 
>> Sum: 1.4]
>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, 
>> Sum: 4.9]
>>       [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 
>> 0.9, Sum: 709.3]
>>       [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 
>> 93967045.8, Diff: 0.5]
>>    [Code Root Fixup: 0.2 ms]
>>    [Code Root Migration: 0.1 ms]
>>    [Clear CT: 0.5 ms]
>>    [Other: 11.7 ms]
>>       [Choose CSet: 0.0 ms]
>>       [Ref Proc: 7.8 ms]
>>       [Ref Enq: 0.5 ms]
>>       [Free CSet: 0.8 ms]
>>    [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M 
>> Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]
>>  [Times: user=0.66 sys=0.00, real=0.04 secs]
>> 2015-05-16T13:21:33.478+0000: 93967.057: [GC 
>> concurrent-root-region-scan-start]
>> 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which 
>> application threads were stopped: 0.0652331 seconds
>> 2015-05-16T13:21:33.487+0000: 93967.066: [GC 
>> concurrent-root-region-scan-end, 0.0082516 secs]
>> 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]
>> 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 
>> 0.4016735 secs]
>> 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC 
>> ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 
>> secs]93967.490: [*WeakReference, 1430199 refs, 0.7913479 
>> secs*]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: 
>> [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak 
>> Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs]
>>  [Times: user=15.10 sys=0.19, real=1.08 secs]
>>
>> Appreciate your help,
>> -Joy
>>
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net  <mailto:hotspot-gc-use at openjdk.java.net>
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/28eec993/attachment-0001.html>

From joyxiong at yahoo.com  Wed May 20 22:12:08 2015
From: joyxiong at yahoo.com (Joy Xiong)
Date: Wed, 20 May 2015 22:12:08 +0000 (UTC)
Subject: Long Reference Processing Time
In-Reply-To: <555CFC13.10900@oracle.com>
References: <555CFC13.10900@oracle.com>
Message-ID: <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com>

Thank you, Poonam.
Also is there a way to get more info on weak references, such as the reference name? Our application does not use weak references, so it's likely that the weak references come from the underneath library, and I'd like to know which library is using lots of weak references.
thanks,-Joy 


     On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
   

  Hello Joy,
 
 8u40 is the latest update release that contains new enhancements and bug fixes, and 8u45 is the latest security release that includes security fixes on top of 8u40.
 
 So, for your test run I think you can try with 8u40.
 
 regards,
 Poonam
 
 On 5/20/2015 1:17 PM, Joy Xiong wrote:
  
   Yu and Poonam, 
  Thank you for your quick response.? In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45. 
  thanks, -Joy 
 
 
       On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
   
 
    Hello Joy,
 
 Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results.
 
 Thanks,
 Poonam
 
  On 5/20/2015 10:35 AM, Joy Xiong wrote:
  
  
  Hi All, 
  I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the  application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC?
  /export/apps/jdk/JDK-1_8_0_5/bin/java
  
  With G1, I observe long time processing references. The long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing  soft reference, whose number is 0. Is there some hidden time contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. 
  Detailed log record is shown as below: 1. Processing soft reference takes long time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total  - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base  time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions,  predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0  regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference,  0.0000124 secs], 0.9305765 secs]  ? ?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs]  
   2. Processing weak reference takes long time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052:  [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms]  ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped:  0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725  refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] 
  Appreciate your help, -Joy  
   
  
 _______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
 
  
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/fae88390/attachment.html>

From yu.zhang at oracle.com  Wed May 20 22:35:21 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Wed, 20 May 2015 15:35:21 -0700
Subject: Long Reference Processing Time
In-Reply-To: <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com>
References: <555CFC13.10900@oracle.com>
	<1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <555D0C29.8000301@oracle.com>

can you dump the heap and examine it with eclipse mat or some similar tools?

Thanks,
Jenny

On 5/20/2015 3:12 PM, Joy Xiong wrote:
> Thank you, Poonam.
>
> Also is there a way to get more info on weak references, such as the 
> reference name? Our application does not use weak references, so it's 
> likely that the weak references come from the underneath library, and 
> I'd like to know which library is using lots of weak references.
>
> thanks,
> -Joy
>
>
>
> On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar 
> <poonam.bajaj at oracle.com> wrote:
>
>
> Hello Joy,
>
> 8u40 is the latest update release that contains new enhancements and 
> bug fixes, and 8u45 is the latest security release that includes 
> security fixes on top of 8u40.
>
> So, for your test run I think you can try with 8u40.
>
> regards,
> Poonam
>
> On 5/20/2015 1:17 PM, Joy Xiong wrote:
>> Yu and Poonam,
>>
>> Thank you for your quick response.
>> In terms of JDK version, we have 8u40 available, so want to check 
>> with you how 8u40 differs from 8u45.
>>
>> thanks,
>> -Joy
>>
>>
>>
>> On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar 
>> <poonam.bajaj at oracle.com> <mailto:poonam.bajaj at oracle.com> wrote:
>>
>>
>> Hello Joy,
>>
>> Could you try running with the latest JDK8 update release (8u45). 
>> Looks like you are trying out G1 with 8u5. There have been many 
>> improvements/fixes in G1GC since 8u5. Please test with the latest 8u 
>> and let us know the results.
>>
>> Thanks,
>> Poonam
>>
>> On 5/20/2015 10:35 AM, Joy Xiong wrote:
>>> *
>>> *
>>> Hi All,
>>>
>>> I recently moved our application from CMS to G1 due to heap 
>>> fragmentation. Here are the JVM tunable used for the application:
>>> -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled 
>>> -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g 
>>> -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC
>>> /export/apps/jdk/JDK-1_8_0_5/bin/java
>>>
>>> With G1, I observe long time processing references. The long 
>>> reference processing time has two types:
>>> 1) Occur in Young GC phase. The processing time does not make sense 
>>> to me, as the majority time is spent on processing soft reference, 
>>> whose number is 0. Is there some hidden time contributing to 
>>> processing soft references?
>>> 2) Occur in the remark phase during the concurrent phase. Our 
>>> application has a large number of weak references, but I don't quite 
>>> understand why the processing time is much larger with G1 than with CMS.
>>>
>>> Detailed log record is shown as below:
>>> *1. Processing soft reference takes long time*. However, we only 
>>> have 0 soft reference
>>> 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation 
>>> Pause) (young)
>>> Desired survivor size 201326592 bytes, new threshold 15 (max 15)
>>> - age   1:  6197672 bytes,    6197672 total
>>> - age   2: 553864 bytes,    6751536 total
>>> - age   3: 321216 bytes,    7072752 total
>>> - age   4: 563120 bytes,    7635872 total
>>> - age   5: 261920 bytes,    7897792 total
>>> - age   6: 265768 bytes,    8163560 total
>>> - age   7: 319856 bytes,    8483416 total
>>> - age   8: 132328 bytes,    8615744 total
>>> - age   9: 153768 bytes,    8769512 total
>>> - age  10: 194256 bytes,    8963768 total
>>> - age  11:  64600 bytes,    9028368 total
>>> - age  12: 160208 bytes,    9188576 total
>>> - age  13:  69376 bytes,    9257952 total
>>> - age  14: 151832 bytes,    9409784 total
>>> - age  15: 186920 bytes,    9596704 total
>>>  30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, 
>>> _pending_cards: 13708, predicted base time: 22.33 ms, remaining 
>>> time: 17.67 ms, target pause time: 40.00 ms]
>>>  30271.429: [G1Ergonomics (CSet Construction) add young regions to 
>>> CSet, eden: 380 regions, survivors: 2 regions, predicted young 
>>> region time: 5.51 ms]
>>>  30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, 
>>> eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted 
>>> pause time: 27.83 ms, target pause time: 40.00 ms]
>>> 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: 
>>> [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 
>>> refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 
>>> 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 
>>> 0.9305765 secs]
>>>    [Parallel Time: 14.4 ms, GC Workers: 22]
>>>       [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 
>>> 30271429.9, Diff: 0.5]
>>>       [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 
>>> 6.4, Sum: 120.1]
>>>       [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 
>>> 80.8]
>>>  [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181]
>>>       [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]
>>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
>>> 0.0, Sum: 0.1]
>>>       [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, 
>>> Sum: 55.3]
>>> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3]
>>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 
>>> 0.2, Sum: 2.0]
>>>       [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 
>>> 0.6, Sum: 300.7]
>>>       [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 
>>> 30271443.5, Diff: 0.2]
>>>    [Code Root Fixup: 0.2 ms]
>>>    [Code Root Migration: 0.1 ms]
>>>    [Clear CT: 1.0 ms]
>>>    [Other: 914.8 ms]
>>>       [Choose CSet: 0.0 ms]
>>>       [Ref Proc: 910.3 ms]
>>>       [Ref Enq: 0.5 ms]
>>>       [Free CSet: 1.7 ms]
>>>    [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M 
>>> Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)]
>>>  [Times: user=0.29 sys=0.00, real=0.93 secs]
>>>
>>> *2. Processing weak reference takes long time*
>>> 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: 
>>> [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 
>>> refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 
>>> 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 
>>> 0.0455414 secs]
>>>    [Parallel Time: 33.1 ms, GC Workers: 22]
>>>       [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 
>>> 93967013.6, Diff: 0.7]
>>>       [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 
>>> 10.4, Sum: 122.9]
>>>       [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 
>>> 9.2, Sum: 25.2]
>>>       [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 
>>> 67.3]
>>>  [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173]
>>>       [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]
>>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
>>> 0.0, Sum: 0.1]
>>>       [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, 
>>> Sum: 463.0]
>>> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]
>>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 
>>> 0.5, Sum: 4.9]
>>>       [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 
>>> 0.9, Sum: 709.3]
>>>       [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 
>>> 93967045.8, Diff: 0.5]
>>>    [Code Root Fixup: 0.2 ms]
>>>    [Code Root Migration: 0.1 ms]
>>>    [Clear CT: 0.5 ms]
>>>    [Other: 11.7 ms]
>>>       [Choose CSet: 0.0 ms]
>>>       [Ref Proc: 7.8 ms]
>>>       [Ref Enq: 0.5 ms]
>>>       [Free CSet: 0.8 ms]
>>>    [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M 
>>> Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]
>>>  [Times: user=0.66 sys=0.00, real=0.04 secs]
>>> 2015-05-16T13:21:33.478+0000: 93967.057: [GC 
>>> concurrent-root-region-scan-start]
>>> 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which 
>>> application threads were stopped: 0.0652331 seconds
>>> 2015-05-16T13:21:33.487+0000: 93967.066: [GC 
>>> concurrent-root-region-scan-end, 0.0082516 secs]
>>> 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]
>>> 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 
>>> 0.4016735 secs]
>>> 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC 
>>> ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 
>>> secs]93967.490: [*WeakReference, 1430199 refs, 0.7913479 
>>> secs*]93968.281: [FinalReference, 367 refs, 0.0036350 
>>> secs]93968.285: [PhantomReference, 221 refs, 0.0031875 
>>> secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 
>>> secs], 1.0832167 secs]
>>>  [Times: user=15.10 sys=0.19, real=1.08 secs]
>>>
>>> Appreciate your help,
>>> -Joy
>>>
>>>
>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net  <mailto:hotspot-gc-use at openjdk.java.net>
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/3d2ed998/attachment-0001.html>

From joyxiong at yahoo.com  Wed May 20 22:41:27 2015
From: joyxiong at yahoo.com (Joy Xiong)
Date: Wed, 20 May 2015 22:41:27 +0000 (UTC)
Subject: Long Reference Processing Time
In-Reply-To: <555D0C29.8000301@oracle.com>
References: <555D0C29.8000301@oracle.com>
Message-ID: <913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com>

Is there other ways for this? It's a prod environment and it would be too intrusive for a heap dump...
-Joy 


     On Wednesday, May 20, 2015 3:35 PM, Yu Zhang <yu.zhang at oracle.com> wrote:
   

  can you dump the heap and examine it with eclipse mat or some similar tools?
 Thanks,
Jenny On 5/20/2015 3:12 PM, Joy Xiong wrote:
  
  Thank you, Poonam. 
  Also is there a way to get more info on weak references, such as the reference name? Our application does not use weak references, so it's likely that the weak references come from the underneath library, and I'd like to know which library is using lots of weak references. 
  thanks, -Joy 
 
 
       On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
   
 
    Hello Joy,
 
 8u40 is the latest update release that contains new enhancements and bug fixes, and 8u45 is the latest security release that includes security fixes on top of 8u40.
 
 So, for your test run I think you can try with 8u40.
 
 regards,
 Poonam
 
  On 5/20/2015 1:17 PM, Joy Xiong wrote:
  
   Yu and Poonam, 
  Thank you for your quick response.? In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45. 
  thanks, -Joy 
 
 
       On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
   
 
    Hello Joy,
 
 Could you try running with the latest JDK8 update release (8u45).  Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since  8u5. Please test with the latest 8u and let us know the results.
 
 Thanks,
 Poonam
 
  On 5/20/2015 10:35 AM, Joy Xiong wrote:
  
  
  Hi All, 
  I recently moved our application from CMS to G1 due to heap  fragmentation. Here are the JVM tunable used for the application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M  -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC?
  /export/apps/jdk/JDK-1_8_0_5/bin/java
  
  With G1, I observe long time processing references. The  long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense  to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time  contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our  application has a large number of weak references, but I don't quite understand why the processing time is much  larger with G1 than with CMS. 
  Detailed log record is shown as below: 1. Processing soft reference takes long  time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total  - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start  choosing CSet, _pending_cards: 13708, predicted base time:  22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young  regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish  choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause  time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983  secs]30272.350: [FinalReference, 2 refs, 0.0019730  secs]30272.352: [PhantomReference, 102 refs, 0.0019032  secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]  ? ?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg:  30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5,  Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7,  Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109,  Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9,  Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0,  Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1,  Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3,  Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1,  Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7,  Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg:  30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap:4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs]  
   2. Processing weak reference takes long  time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051:  [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97  refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160  secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg:  93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff:  10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff:  9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum:  67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum:  24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:  0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff:  5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1,  Diff: 0.1, Sum: 1.4] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5,  Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7,  Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max:  93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms]  ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application  threads were stopped: 0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end,  0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end,  0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC  ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350  secs]93968.285: [PhantomReference, 221 refs, 0.0031875  secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] 
  Appreciate your help, -Joy  
   
  
 _______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
 
  
 _______________________________________________
 hotspot-gc-use mailing list
 hotspot-gc-use at openjdk.java.net
 http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
  
 
 _______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150520/1563c27d/attachment-0001.html>

From simone.bordet at gmail.com  Wed May 20 22:49:35 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Thu, 21 May 2015 00:49:35 +0200
Subject: Long Reference Processing Time
In-Reply-To: <555D0C29.8000301@oracle.com>
References: <555CFC13.10900@oracle.com>
	<1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com>
	<555D0C29.8000301@oracle.com>
Message-ID: <CAFWmRJ21prwsq4iYM5eai61rAq8swuYspomfXb4UHWEQ5ZcA-A@mail.gmail.com>

Hi,

On Thu, May 21, 2015 at 12:35 AM, Yu Zhang <yu.zhang at oracle.com> wrote:
> can you dump the heap and examine it with eclipse mat or some similar tools?

In our case this was not helpful, but your case may be different.

> On 5/20/2015 3:12 PM, Joy Xiong wrote:
> Also is there a way to get more info on weak references, such as the
> reference name? Our application does not use weak references, so it's likely
> that the weak references come from the underneath library, and I'd like to
> know which library is using lots of weak references.

We ended up with this "dirty" trick:
https://github.com/jetty-project/weakref-allocation

Basically, we modified WeakReference to keep track of allocations and
expose them via JMX.

Turned out that for our case, ThreadLocal and RMI usage were the most
important allocators of WeakReferences.

Hope it helps.

-- 
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From simone.bordet at gmail.com  Wed May 20 22:52:49 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Thu, 21 May 2015 00:52:49 +0200
Subject: Long Reference Processing Time
In-Reply-To: <913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com>
References: <555D0C29.8000301@oracle.com>
	<913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <CAFWmRJ2RzgjUW1F0MUxZ3hSJj2OG7Ljw=TPEEFk2A+5_bW+8gg@mail.gmail.com>

Hi,

On Thu, May 21, 2015 at 12:41 AM, Joy Xiong <joyxiong at yahoo.com> wrote:
> Is there other ways for this? It's a prod environment and it would be too
> intrusive for a heap dump...

We used our solution in production too, by enabling it for few minutes
to collect data (via JMX) and then disabling it until the next restart
(also via JMX), where it was removed.
Required 2 restarts: one to add the instrumentation, and one to remove it.

-- 
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From joyxiong at yahoo.com  Thu May 21 18:32:43 2015
From: joyxiong at yahoo.com (Joy Xiong)
Date: Thu, 21 May 2015 18:32:43 +0000 (UTC)
Subject: Long Reference Processing Time
In-Reply-To: <CAFWmRJ2RzgjUW1F0MUxZ3hSJj2OG7Ljw=TPEEFk2A+5_bW+8gg@mail.gmail.com>
References: <CAFWmRJ2RzgjUW1F0MUxZ3hSJj2OG7Ljw=TPEEFk2A+5_bW+8gg@mail.gmail.com>
Message-ID: <1091555704.4676086.1432233163367.JavaMail.yahoo@mail.yahoo.com>

Thank you Simone. 


     On Wednesday, May 20, 2015 3:52 PM, Simone Bordet <simone.bordet at gmail.com> wrote:
   

 Hi,

On Thu, May 21, 2015 at 12:41 AM, Joy Xiong <joyxiong at yahoo.com> wrote:
> Is there other ways for this? It's a prod environment and it would be too
> intrusive for a heap dump...

We used our solution in production too, by enabling it for few minutes
to collect data (via JMX) and then disabling it until the next restart
(also via JMX), where it was removed.
Required 2 restarts: one to add the instrumentation, and one to remove it.

-- 
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.? Victoria Livschitz


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150521/c5b41f77/attachment.html>

From jk at codearte.io  Mon May 25 08:56:08 2015
From: jk at codearte.io (Jakub Kubrynski)
Date: Mon, 25 May 2015 10:56:08 +0200
Subject: G1 young STW time in MBean
Message-ID: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>

Hi,

is there any possibility to get STW time for G1 young collection through
MBean? The duration reported here is IMHO total GC time (concurrent + stw).

-- 
Best regards,
Jakub Kubrynski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150525/8be3b60c/attachment.html>

From jesper.wilhelmsson at oracle.com  Mon May 25 15:37:51 2015
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Mon, 25 May 2015 17:37:51 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
Message-ID: <556341CF.9000705@oracle.com>

Hi Jakub,

The times reported in the MBeans are total GC times. There is currently no way 
to get STW times through the MBeans.

Best regards,
/Jesper


Jakub Kubrynski skrev den 25/5/15 10:56:
> Hi,
>
> is there any possibility to get STW time for G1 young collection through MBean?
> The duration reported here is IMHO total GC time (concurrent + stw).
>
> --
> Best regards,
> Jakub Kubrynski
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>

From jk at codearte.io  Mon May 25 15:43:14 2015
From: jk at codearte.io (Jakub Kubrynski)
Date: Mon, 25 May 2015 17:43:14 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <556341CF.9000705@oracle.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
Message-ID: <CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>

Do you know if there is any development planned in this area? Also there is
no avgPauseTime, promoted and survived mbean information for G1. The only
available substitution I see for now is to get safepoint timings.

Best,
Jakub Kubrynski
 25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com>
napisa?(a):

> Hi Jakub,
>
> The times reported in the MBeans are total GC times. There is currently no
> way to get STW times through the MBeans.
>
> Best regards,
> /Jesper
>
>
> Jakub Kubrynski skrev den 25/5/15 10:56:
>
>> Hi,
>>
>> is there any possibility to get STW time for G1 young collection through
>> MBean?
>> The duration reported here is IMHO total GC time (concurrent + stw).
>>
>> --
>> Best regards,
>> Jakub Kubrynski
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150525/2d6c07e7/attachment.html>

From jesper.wilhelmsson at oracle.com  Mon May 25 16:28:04 2015
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Mon, 25 May 2015 18:28:04 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
Message-ID: <55634D94.1080301@oracle.com>

As far as I know there hasn't been much development done to get G1 to play well 
with the MBeans. And I don't think there is anything planned in this area. Maybe 
the serviceability team has other plans for the MBeans.
/Jesper


Jakub Kubrynski skrev den 25/5/15 17:43:
> Do you know if there is any development planned in this area? Also there is no
> avgPauseTime, promoted and survived mbean information for G1. The only available
> substitution I see for now is to get safepoint timings.
>
> Best,
> Jakub Kubrynski
>
> 25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com
> <mailto:jesper.wilhelmsson at oracle.com>> napisa?(a):
>
>     Hi Jakub,
>
>     The times reported in the MBeans are total GC times. There is currently no
>     way to get STW times through the MBeans.
>
>     Best regards,
>     /Jesper
>
>
>     Jakub Kubrynski skrev den 25/5/15 10:56:
>
>         Hi,
>
>         is there any possibility to get STW time for G1 young collection through
>         MBean?
>         The duration reported here is IMHO total GC time (concurrent + stw).
>
>         --
>         Best regards,
>         Jakub Kubrynski
>
>
>         _______________________________________________
>         hotspot-gc-use mailing list
>         hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>

From jk at codearte.io  Mon May 25 20:42:52 2015
From: jk at codearte.io (Jakub Kubrynski)
Date: Mon, 25 May 2015 22:42:52 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <55634D94.1080301@oracle.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
	<55634D94.1080301@oracle.com>
Message-ID: <CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>

Maybe we could propose some JSR about that? Using G1 without proper
monitoring is like living on the edge :)

Cheers,
Jakub

2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com
>:

> As far as I know there hasn't been much development done to get G1 to play
> well with the MBeans. And I don't think there is anything planned in this
> area. Maybe the serviceability team has other plans for the MBeans.
> /Jesper
>
>
> Jakub Kubrynski skrev den 25/5/15 17:43:
>
>> Do you know if there is any development planned in this area? Also there
>> is no
>> avgPauseTime, promoted and survived mbean information for G1. The only
>> available
>> substitution I see for now is to get safepoint timings.
>>
>> Best,
>> Jakub Kubrynski
>>
>> 25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com
>> <mailto:jesper.wilhelmsson at oracle.com>> napisa?(a):
>>
>>     Hi Jakub,
>>
>>     The times reported in the MBeans are total GC times. There is
>> currently no
>>     way to get STW times through the MBeans.
>>
>>     Best regards,
>>     /Jesper
>>
>>
>>     Jakub Kubrynski skrev den 25/5/15 10:56:
>>
>>         Hi,
>>
>>         is there any possibility to get STW time for G1 young collection
>> through
>>         MBean?
>>         The duration reported here is IMHO total GC time (concurrent +
>> stw).
>>
>>         --
>>         Best regards,
>>         Jakub Kubrynski
>>
>>
>>         _______________________________________________
>>         hotspot-gc-use mailing list
>>         hotspot-gc-use at openjdk.java.net <mailto:
>> hotspot-gc-use at openjdk.java.net>
>>         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>


-- 
Best regards,
Jakub Kubrynski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150525/674481b6/attachment.html>

From jesper.wilhelmsson at oracle.com  Mon May 25 20:49:12 2015
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Mon, 25 May 2015 22:49:12 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
	<55634D94.1080301@oracle.com>
	<CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>
Message-ID: <55638AC8.4080401@oracle.com>

Including the serviceability team.

There might be plans for G1 monitoring in the future but I don't know too much 
about what and when. I know there are commercial products that does a pretty 
good job at G1 monitoring but I don't know if that is an option for you.

Best,
/Jesper


Jakub Kubrynski skrev den 25/5/15 22:42:
> Maybe we could propose some JSR about that? Using G1 without proper monitoring
> is like living on the edge :)
>
> Cheers,
> Jakub
>
> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com
> <mailto:jesper.wilhelmsson at oracle.com>>:
>
>     As far as I know there hasn't been much development done to get G1 to play
>     well with the MBeans. And I don't think there is anything planned in this
>     area. Maybe the serviceability team has other plans for the MBeans.
>     /Jesper
>
>
>     Jakub Kubrynski skrev den 25/5/15 17:43:
>
>         Do you know if there is any development planned in this area? Also there
>         is no
>         avgPauseTime, promoted and survived mbean information for G1. The only
>         available
>         substitution I see for now is to get safepoint timings.
>
>         Best,
>         Jakub Kubrynski
>
>         25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com
>         <mailto:jesper.wilhelmsson at oracle.com>
>         <mailto:jesper.wilhelmsson at oracle.com
>         <mailto:jesper.wilhelmsson at oracle.com>>> napisa?(a):
>
>              Hi Jakub,
>
>              The times reported in the MBeans are total GC times. There is
>         currently no
>              way to get STW times through the MBeans.
>
>              Best regards,
>              /Jesper
>
>
>              Jakub Kubrynski skrev den 25/5/15 10:56:
>
>                  Hi,
>
>                  is there any possibility to get STW time for G1 young
>         collection through
>                  MBean?
>                  The duration reported here is IMHO total GC time (concurrent +
>         stw).
>
>                  --
>                  Best regards,
>                  Jakub Kubrynski
>
>
>                  _______________________________________________
>                  hotspot-gc-use mailing list
>         hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>         <mailto:hotspot-gc-use at openjdk.java.net
>         <mailto:hotspot-gc-use at openjdk.java.net>>
>         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
>
> --
> Best regards,
> Jakub Kubrynski

From jk at codearte.io  Tue May 26 06:22:25 2015
From: jk at codearte.io (Jakub Kubrynski)
Date: Tue, 26 May 2015 08:22:25 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <D7AED88F-F8C2-4E54-9BEF-1ED0EF35CD8C@oracle.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
	<55634D94.1080301@oracle.com>
	<CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>
	<55638AC8.4080401@oracle.com>
	<D7AED88F-F8C2-4E54-9BEF-1ED0EF35CD8C@oracle.com>
Message-ID: <CAMfYRA6R+wV7o-hnTFGttzyOZTtVQ0r7PjZiHrzGQ84EbKYNSA@mail.gmail.com>

Jesper already pointed me about JMC. The reason we're not using it is that
it cannot be integrated with our production monitoring. It's more problem
solving tool than continuous APM. So the question is if Oracle is going to
implement some MBeans for G1, and if not maybe we could propose JSR about
it?

Cheers,
Jakub
26 maj 2015 08:17 "Staffan Larsen" <staffan.larsen at oracle.com> napisa?(a):

> Try out Java Flight Recorder - it has a lot more data in it.
>
> /Staffan
>
> > On 25 maj 2015, at 22:49, Jesper Wilhelmsson <
> jesper.wilhelmsson at oracle.com> wrote:
> >
> > Including the serviceability team.
> >
> > There might be plans for G1 monitoring in the future but I don't know
> too much about what and when. I know there are commercial products that
> does a pretty good job at G1 monitoring but I don't know if that is an
> option for you.
> >
> > Best,
> > /Jesper
> >
> >
> > Jakub Kubrynski skrev den 25/5/15 22:42:
> >> Maybe we could propose some JSR about that? Using G1 without proper
> monitoring
> >> is like living on the edge :)
> >>
> >> Cheers,
> >> Jakub
> >>
> >> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson <
> jesper.wilhelmsson at oracle.com
> >> <mailto:jesper.wilhelmsson at oracle.com>>:
> >>
> >>    As far as I know there hasn't been much development done to get G1
> to play
> >>    well with the MBeans. And I don't think there is anything planned in
> this
> >>    area. Maybe the serviceability team has other plans for the MBeans.
> >>    /Jesper
> >>
> >>
> >>    Jakub Kubrynski skrev den 25/5/15 17:43:
> >>
> >>        Do you know if there is any development planned in this area?
> Also there
> >>        is no
> >>        avgPauseTime, promoted and survived mbean information for G1.
> The only
> >>        available
> >>        substitution I see for now is to get safepoint timings.
> >>
> >>        Best,
> >>        Jakub Kubrynski
> >>
> >>        25 maj 2015 17:37 "Jesper Wilhelmsson" <
> jesper.wilhelmsson at oracle.com
> >>        <mailto:jesper.wilhelmsson at oracle.com>
> >>        <mailto:jesper.wilhelmsson at oracle.com
> >>        <mailto:jesper.wilhelmsson at oracle.com>>> napisa?(a):
> >>
> >>             Hi Jakub,
> >>
> >>             The times reported in the MBeans are total GC times. There
> is
> >>        currently no
> >>             way to get STW times through the MBeans.
> >>
> >>             Best regards,
> >>             /Jesper
> >>
> >>
> >>             Jakub Kubrynski skrev den 25/5/15 10:56:
> >>
> >>                 Hi,
> >>
> >>                 is there any possibility to get STW time for G1 young
> >>        collection through
> >>                 MBean?
> >>                 The duration reported here is IMHO total GC time
> (concurrent +
> >>        stw).
> >>
> >>                 --
> >>                 Best regards,
> >>                 Jakub Kubrynski
> >>
> >>
> >>                 _______________________________________________
> >>                 hotspot-gc-use mailing list
> >>        hotspot-gc-use at openjdk.java.net <mailto:
> hotspot-gc-use at openjdk.java.net>
> >>        <mailto:hotspot-gc-use at openjdk.java.net
> >>        <mailto:hotspot-gc-use at openjdk.java.net>>
> >>        http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> >>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Jakub Kubrynski
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150526/8bb5c094/attachment-0001.html>

From staffan.larsen at oracle.com  Tue May 26 06:17:39 2015
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 26 May 2015 08:17:39 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <55638AC8.4080401@oracle.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
	<55634D94.1080301@oracle.com>
	<CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>
	<55638AC8.4080401@oracle.com>
Message-ID: <D7AED88F-F8C2-4E54-9BEF-1ED0EF35CD8C@oracle.com>

Try out Java Flight Recorder - it has a lot more data in it.

/Staffan

> On 25 maj 2015, at 22:49, Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com> wrote:
> 
> Including the serviceability team.
> 
> There might be plans for G1 monitoring in the future but I don't know too much about what and when. I know there are commercial products that does a pretty good job at G1 monitoring but I don't know if that is an option for you.
> 
> Best,
> /Jesper
> 
> 
> Jakub Kubrynski skrev den 25/5/15 22:42:
>> Maybe we could propose some JSR about that? Using G1 without proper monitoring
>> is like living on the edge :)
>> 
>> Cheers,
>> Jakub
>> 
>> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com
>> <mailto:jesper.wilhelmsson at oracle.com>>:
>> 
>>    As far as I know there hasn't been much development done to get G1 to play
>>    well with the MBeans. And I don't think there is anything planned in this
>>    area. Maybe the serviceability team has other plans for the MBeans.
>>    /Jesper
>> 
>> 
>>    Jakub Kubrynski skrev den 25/5/15 17:43:
>> 
>>        Do you know if there is any development planned in this area? Also there
>>        is no
>>        avgPauseTime, promoted and survived mbean information for G1. The only
>>        available
>>        substitution I see for now is to get safepoint timings.
>> 
>>        Best,
>>        Jakub Kubrynski
>> 
>>        25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com
>>        <mailto:jesper.wilhelmsson at oracle.com>
>>        <mailto:jesper.wilhelmsson at oracle.com
>>        <mailto:jesper.wilhelmsson at oracle.com>>> napisa?(a):
>> 
>>             Hi Jakub,
>> 
>>             The times reported in the MBeans are total GC times. There is
>>        currently no
>>             way to get STW times through the MBeans.
>> 
>>             Best regards,
>>             /Jesper
>> 
>> 
>>             Jakub Kubrynski skrev den 25/5/15 10:56:
>> 
>>                 Hi,
>> 
>>                 is there any possibility to get STW time for G1 young
>>        collection through
>>                 MBean?
>>                 The duration reported here is IMHO total GC time (concurrent +
>>        stw).
>> 
>>                 --
>>                 Best regards,
>>                 Jakub Kubrynski
>> 
>> 
>>                 _______________________________________________
>>                 hotspot-gc-use mailing list
>>        hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>>        <mailto:hotspot-gc-use at openjdk.java.net
>>        <mailto:hotspot-gc-use at openjdk.java.net>>
>>        http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>> 
>> 
>> 
>> 
>> --
>> Best regards,
>> Jakub Kubrynski


From staffan.larsen at oracle.com  Tue May 26 06:26:31 2015
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 26 May 2015 08:26:31 +0200
Subject: G1 young STW time in MBean
In-Reply-To: <CAMfYRA6R+wV7o-hnTFGttzyOZTtVQ0r7PjZiHrzGQ84EbKYNSA@mail.gmail.com>
References: <CAMfYRA46CQmbrJ3Auw290BfuHV_sDsrOe1pbtcCYr693fPH2Uw@mail.gmail.com>
	<556341CF.9000705@oracle.com>
	<CAMfYRA7nReZfGRgHnt-QWsYQcmDs0G7+X+H+OQq1enEdxNMXjg@mail.gmail.com>
	<55634D94.1080301@oracle.com>
	<CAMfYRA6Zkf9zhMMSQvf6dHgFdBDTgiijYzqAJrVVtxfZSt3i4g@mail.gmail.com>
	<55638AC8.4080401@oracle.com>
	<D7AED88F-F8C2-4E54-9BEF-1ED0EF35CD8C@oracle.com>
	<CAMfYRA6R+wV7o-hnTFGttzyOZTtVQ0r7PjZiHrzGQ84EbKYNSA@mail.gmail.com>
Message-ID: <D682EE11-2201-424E-BFB8-767D7DE170AB@oracle.com>


> On 26 maj 2015, at 08:22, Jakub Kubrynski <jk at codearte.io> wrote:
> 
> Jesper already pointed me about JMC. The reason we're not using it is that it cannot be integrated with our production monitoring. It's more problem solving tool than continuous APM. So the question is if Oracle is going to implement some MBeans for G1, and if not maybe we could propose JSR about it?
> 
I don?t think there are any open issues or plans for this, so contributions are always welcome!

/Staffan


> Cheers, 
> Jakub
> 
> 26 maj 2015 08:17 "Staffan Larsen" <staffan.larsen at oracle.com <mailto:staffan.larsen at oracle.com>> napisa?(a):
> Try out Java Flight Recorder - it has a lot more data in it.
> 
> /Staffan
> 
> > On 25 maj 2015, at 22:49, Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>> wrote:
> >
> > Including the serviceability team.
> >
> > There might be plans for G1 monitoring in the future but I don't know too much about what and when. I know there are commercial products that does a pretty good job at G1 monitoring but I don't know if that is an option for you.
> >
> > Best,
> > /Jesper
> >
> >
> > Jakub Kubrynski skrev den 25/5/15 22:42:
> >> Maybe we could propose some JSR about that? Using G1 without proper monitoring
> >> is like living on the edge :)
> >>
> >> Cheers,
> >> Jakub
> >>
> >> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>
> >> <mailto:jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>>>:
> >>
> >>    As far as I know there hasn't been much development done to get G1 to play
> >>    well with the MBeans. And I don't think there is anything planned in this
> >>    area. Maybe the serviceability team has other plans for the MBeans.
> >>    /Jesper
> >>
> >>
> >>    Jakub Kubrynski skrev den 25/5/15 17:43:
> >>
> >>        Do you know if there is any development planned in this area? Also there
> >>        is no
> >>        avgPauseTime, promoted and survived mbean information for G1. The only
> >>        available
> >>        substitution I see for now is to get safepoint timings.
> >>
> >>        Best,
> >>        Jakub Kubrynski
> >>
> >>        25 maj 2015 17:37 "Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>
> >>        <mailto:jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>>
> >>        <mailto:jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>
> >>        <mailto:jesper.wilhelmsson at oracle.com <mailto:jesper.wilhelmsson at oracle.com>>>> napisa?(a):
> >>
> >>             Hi Jakub,
> >>
> >>             The times reported in the MBeans are total GC times. There is
> >>        currently no
> >>             way to get STW times through the MBeans.
> >>
> >>             Best regards,
> >>             /Jesper
> >>
> >>
> >>             Jakub Kubrynski skrev den 25/5/15 10:56:
> >>
> >>                 Hi,
> >>
> >>                 is there any possibility to get STW time for G1 young
> >>        collection through
> >>                 MBean?
> >>                 The duration reported here is IMHO total GC time (concurrent +
> >>        stw).
> >>
> >>                 --
> >>                 Best regards,
> >>                 Jakub Kubrynski
> >>
> >>
> >>                 _______________________________________________
> >>                 hotspot-gc-use mailing list
> >>        hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net> <mailto:hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>>
> >>        <mailto:hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
> >>        <mailto:hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>>>
> >>        http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use <http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use>
> >>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >> Jakub Kubrynski
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150526/a4540418/attachment.html>

From jason.goetz at gmail.com  Thu May 28 17:58:27 2015
From: jason.goetz at gmail.com (Jason Goetz)
Date: Thu, 28 May 2015 10:58:27 -0700
Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC
Message-ID: <D18CA553.42A6D%jason.goetz@gmail.com>

We're consistently seeing a situation where threads take a few seconds to
stop for a routine GC. For 20 straight minutes the GC will run right away
(it runs about every second). But then, during a 20-minute period, the
threads will take longer to stop for GC. See the GC output below.

2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application
threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908
seconds
2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application
threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106
seconds
2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370
secs]
2015-05-28T12:15:09.786-0500: 54815.395: [GC remark
2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs],
0.0709271 secs]
 [Times: user=0.73 sys=0.00, real=0.08 secs]
2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application
threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032
seconds
2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G),
0.0451098 secs]
 [Times: user=0.61 sys=0.00, real=0.05 secs]
2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application
threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950
seconds

Turning on safepoint logging reveals that these stopping threads times are
taken up by safepoint ?sync? time. Taking thread dumps every second around
these pauses fail to show anything of note happening during this time, but
it?s my understanding that native code won?t necessarily show up in thread
dumps anyway given that they exit before the JVM reaches a safepoint.

Enabling PrintJNIGCStalls fails to show any logging around the 3 second
pause seen above. I highly suspected JNI but was surprise that I didn?t see
any logging about JNI Weak References after turning that option on. Any
ideas for what I can try next? We?re using JDK 7u80. Here are the rest of my
JVM settings:

DisableExplicitGC true
FlightRecorder true
GCLogFileSize 52428800
ManagementServer true
MinHeapSize 25769803776
MaxHeapSize 25769803776
MaxPermSize 536870912
NumberOfGCLogFiles 10
PrintAdaptiveSizePolicy true
PrintGC true
PrintGCApplicationStoppedTime true
PrintGCCause true
PrintGCDateStamps true
PrintGCDetails true
PrintGCTimeStamps true
PrintSafepointStatistics true
PrintSafepointStatisticsCount 1
PrintTenuringDistribution true
ReservedCodeCacheSize 268435456
SafepointTimeout true
SafepointTimeoutDelay 4000
ThreadStackSize 4096
UnlockCommercialFeatures true
UseBiasedLocking false
UseGCLogFileRotation false
UseG1GC true
PrintJNIGCStalls true


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150528/462b6d22/attachment.html>

From vitalyd at gmail.com  Thu May 28 18:17:02 2015
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 28 May 2015 14:17:02 -0400
Subject: JVM taking a few seconds to reach a safepoint for routine young
	gen GC
In-Reply-To: <D18CA553.42A6D%jason.goetz@gmail.com>
References: <D18CA553.42A6D%jason.goetz@gmail.com>
Message-ID: <CAHjP37FNCvV2pGkeTDNTX4iS6BByot01K0g4-xz_edseL9wKAg@mail.gmail.com>

Jason,

How many java threads are active when these stalls happen? How many CPUs
are available to the jvm? How much physical memory on the machine? Is your
jvm sole occupant of the machine or do you have noisy neighbors? You
mentioned JNI - do you have a lot of JNI calls around these times? Do you
allocate and/or write to large arrays/memory regions? Is there something
different/interesting about these 20 min periods (e.g. workload increases,
same time of day, more disk activity, any paging/swap activity, etc).

sent from my phone
On May 28, 2015 1:58 PM, "Jason Goetz" <jason.goetz at gmail.com> wrote:

> We're consistently seeing a situation where threads take a few seconds to
> stop for a routine GC. For 20 straight minutes the GC will run right away
> (it runs about every second). But then, during a 20-minute period, the
> threads will take longer to stop for GC. See the GC output below.
>
> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application
> threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908
> seconds
> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application
> threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106
> seconds
> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end,
> 28.4067370 secs]
> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark
> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs],
> 0.0709271 secs]
>  [Times: user=0.73 sys=0.00, real=0.08 secs]
> *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application
> threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032
> seconds*
> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G),
> 0.0451098 secs]
>  [Times: user=0.61 sys=0.00, real=0.05 secs]
> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application
> threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950
> seconds
>
> Turning on safepoint logging reveals that these stopping threads times are
> taken up by safepoint ?sync? time. Taking thread dumps every second around
> these pauses fail to show anything of note happening during this time, but
> it?s my understanding that native code won?t necessarily show up in thread
> dumps anyway given that they exit before the JVM reaches a safepoint.
>
> Enabling PrintJNIGCStalls fails to show any logging around the 3 second
> pause seen above. I highly suspected JNI but was surprise that I didn?t see
> any logging about JNI Weak References after turning that option on. Any
> ideas for what I can try next? We?re using JDK 7u80. Here are the rest of
> my JVM settings:
>
> DisableExplicitGC true
> FlightRecorder true
> GCLogFileSize 52428800
> ManagementServer true
> MinHeapSize 25769803776
> MaxHeapSize 25769803776
> MaxPermSize 536870912
> NumberOfGCLogFiles 10
> PrintAdaptiveSizePolicy true
> PrintGC true
> PrintGCApplicationStoppedTime true
> PrintGCCause true
> PrintGCDateStamps true
> PrintGCDetails true
> PrintGCTimeStamps true
> PrintSafepointStatistics true
> PrintSafepointStatisticsCount 1
> PrintTenuringDistribution true
> ReservedCodeCacheSize 268435456
> SafepointTimeout true
> SafepointTimeoutDelay 4000
> ThreadStackSize 4096
> UnlockCommercialFeatures true
> UseBiasedLocking false
> UseGCLogFileRotation false
> UseG1GC true
> PrintJNIGCStalls true
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150528/cfe77e44/attachment.html>

From poonam.bajaj at oracle.com  Thu May 28 19:37:24 2015
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Thu, 28 May 2015 12:37:24 -0700
Subject: JVM taking a few seconds to reach a safepoint for routine young
	gen GC
In-Reply-To: <D18CA553.42A6D%jason.goetz@gmail.com>
References: <D18CA553.42A6D%jason.goetz@gmail.com>
Message-ID: <55676E74.8080301@oracle.com>

Hello Jason,

On 5/28/2015 10:58 AM, Jason Goetz wrote:
> We're consistently seeing a situation where threads take a few seconds 
> to stop for a routine GC. For 20 straight minutes the GC will run 
> right away (it runs about every second). But then, during a 20-minute 
> period, the threads will take longer to stop for GC. See the GC output 
> below.
>
> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which 
> application threads were stopped: 0.1121233 seconds, Stopping threads 
> took: 0.0000908 seconds
> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which 
> application threads were stopped: 0.0019384 seconds, Stopping threads 
> took: 0.0001106 seconds
> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 
> 28.4067370 secs]
> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark 
> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 
> secs], 0.0709271 secs]
>  [Times: user=0.73 sys=0.00, real=0.08 secs]
> *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which 
> application threads were stopped: 3.2916224 seconds, Stopping threads 
> took: 3.2188032 seconds*
> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), 
> 0.0451098 secs]
>  [Times: user=0.61 sys=0.00, real=0.05 secs]
> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which 
> application threads were stopped: 0.0459803 seconds, Stopping threads 
> took: 0.0001950 seconds
>
> Turning on safepoint logging reveals that these stopping threads times 
> are taken up by safepoint 'sync' time. Taking thread dumps every 
> second around these pauses fail to show anything of note happening 
> during this time, but it's my understanding that native code won't 
> necessarily show up in thread dumps anyway given that they exit before 
> the JVM reaches a safepoint.
>
Could you please share the output collected with 
PrintSafepointStatisticsoption. That may tell us which VM-operation is 
having trouble in stopping threads before starting the actual work.

> Enabling PrintJNIGCStalls fails to show any logging around the 3 
> second pause seen above.

PrintJNIGCStalls option logs the events when a GC invocation request is 
made by any of the application threads but GC can not be invoked at that 
time because one or more of the applications threads are running in a 
JNI Critical Section. So, the GC is stalled until threads come out of 
the JNI critical section, and as threads exit the JNI critical section 
GC request is honored.

If this option didn't print anything that means that the application 
didn't encounter any such situation.

Thanks,
Poonam

> I highly suspected JNI but was surprise that I didn't see any logging 
> about JNI Weak References after turning that option on. Any ideas for 
> what I can try next? We're using JDK 7u80. Here are the rest of my JVM 
> settings:
>
> DisableExplicitGCtrue
> FlightRecordertrue
> GCLogFileSize52428800
> ManagementServertrue
> MinHeapSize25769803776
> MaxHeapSize25769803776
> MaxPermSize536870912
> NumberOfGCLogFiles10
> PrintAdaptiveSizePolicytrue
> PrintGCtrue
> PrintGCApplicationStoppedTimetrue
> PrintGCCausetrue
> PrintGCDateStampstrue
> PrintGCDetailstrue
> PrintGCTimeStampstrue
> PrintSafepointStatisticstrue
> PrintSafepointStatisticsCount1
> PrintTenuringDistributiontrue
> ReservedCodeCacheSize268435456
> SafepointTimeouttrue
> SafepointTimeoutDelay4000
> ThreadStackSize4096
> UnlockCommercialFeaturestrue
> UseBiasedLockingfalse
> UseGCLogFileRotationfalse
> UseG1GCtrue
> PrintJNIGCStallstrue
>
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150528/c9822f11/attachment-0001.html>

From jason.goetz at gmail.com  Fri May 29 16:51:44 2015
From: jason.goetz at gmail.com (Jason Goetz)
Date: Fri, 29 May 2015 09:51:44 -0700
Subject: JVM taking a few seconds to reach a safepoint for routine young
	gen GC
In-Reply-To: <CAHjP37E+Zo5nMbC5TgoQN=sBKmgdhFx9_2Zk88nTjKREO6J1fg@mail.gmail.com>
References: <D18CA553.42A6D%jason.goetz@gmail.com>
	<CAHjP37FNCvV2pGkeTDNTX4iS6BByot01K0g4-xz_edseL9wKAg@mail.gmail.com>
	<D18CACB7.42A79%jason.goetz@gmail.com>
	<CAHjP37FmU_tKUkdTVzapR-ovSb66OgOxSO6KLs6DtZG=nm_MtA@mail.gmail.com>
	<D18CBCFB.42AA8%jason.goetz@gmail.com>
	<CAHjP37E+Zo5nMbC5TgoQN=sBKmgdhFx9_2Zk88nTjKREO6J1fg@mail.gmail.com>
Message-ID: <D18DE1C3.42B5A%jason.goetz@gmail.com>

Oops, I did not intend to remove this from the list. Re-added.

I?ll take a look at how many RUNNABLE threads are actually blocked in native
code. I?ll also look at VSphere to see if I can see anything unusual around
resource contention.

I?ve grepped the safepoint logs for GenCollectForAllocation, which, as I
mentioned before, happen about every second, but only show this long sync
times during the mysterious 20-minute period. I?ve taken an excerpt from one
of these 20-minute pauses. You can see that for most GC the only time is in
vmop and sync time is 0, but during these pauses the ?sync? time takes up
the majority of the time.

[threads: total initially_running wait_to_block]    [time: spin block sync
cleanup vmop] page_trap_count

116546.586: GenCollectForAllocation          [     146          5
7    ]      [     0     0     0     1    50    ]  0
116546.891: GenCollectForAllocation          [     146          2
7    ]      [     0     0     0     2    50    ]  0
116547.969: GenCollectForAllocation          [     145          0
2    ]      [     0     0     0     2   290    ]  0
116549.500: GenCollectForAllocation          [     145          0
0    ]      [     0     0     0     2    67    ]  0
116550.836: GenCollectForAllocation          [     142          0
1    ]      [     0     0     0     1    82    ]  0
116553.398: GenCollectForAllocation          [     142          0
2    ]      [     0     0     0     2    76    ]  0
116555.109: GenCollectForAllocation          [     142          0
0    ]      [     0     0     0     2    84    ]  0
116557.328: GenCollectForAllocation          [     142          0
0    ]      [     0     0     0     2    64    ]  0
116561.992: GenCollectForAllocation          [     143          2
1    ]      [     0     0   523     2    76    ]  1
116567.367: GenCollectForAllocation          [     143          1
0    ]      [     1     0    39     2   104    ]  0
116572.438: GenCollectForAllocation          [     143          4
3    ]      [     0     0     0     2    85    ]  0
116575.977: GenCollectForAllocation          [     144         76
1    ]      [    24     0154039     9   181    ]  0
116731.336: GenCollectForAllocation          [     353         41
5    ]      [     5     0     5     1   101    ]  0
116732.328: GenCollectForAllocation          [     354          5
16    ]      [  2080     0  2115     1     0    ]  1
116736.430: GenCollectForAllocation          [     354          5
9    ]      [     0     0     0     1    81    ]  2
116736.891: GenCollectForAllocation          [     354          0
4    ]      [     0     0     0     4    88    ]  0
116737.305: GenCollectForAllocation          [     354          2
9    ]      [     0     0     0     1    80    ]  0
116737.664: GenCollectForAllocation          [     354          1
8    ]      [     0     0     0     2    65    ]  0
116738.055: GenCollectForAllocation          [     355          1
8    ]      [     0     0     0     1   106    ]  0
116738.797: GenCollectForAllocation          [     354          0
5    ]      [     0     0  2116     2   125    ]  0
116741.523: GenCollectForAllocation          [     353          1
0    ]      [     5     0   502     1   195    ]  0
116743.219: GenCollectForAllocation          [     352          1
5    ]      [     0     0     0     1     0    ]  0
116743.719: GenCollectForAllocation          [     352          1
7    ]      [     0     0     0     1    67    ]  0
116744.266: GenCollectForAllocation          [     352        271
0    ]      [    28     0764563     4     0    ]  0
117509.914: GenCollectForAllocation          [     347          1
2    ]      [     0     0     0     2   166    ]  0
117510.609: GenCollectForAllocation          [     456         84
9    ]      [     8     0     8     2   103    ]  1
117511.305: GenCollectForAllocation          [     479          0
6    ]      [     0     0     0     7   199    ]  0
117512.086: GenCollectForAllocation          [     480          0
2    ]      [     0     0     0     2   192    ]  0
117829.000: GenCollectForAllocation          [     569          0
3    ]      [     0     0     0     2     0    ]  0
117829.000: GenCollectForAllocation          [     569          2
5    ]      [     0     0     0     0   128    ]  0
117829.523: GenCollectForAllocation          [     569          0
6    ]      [     0     0     0     2    84    ]  0
117830.039: GenCollectForAllocation          [     571          0
5    ]      [     0     0     0     2     0    ]  0
117830.781: GenCollectForAllocation          [     571          0
6    ]      [     0     0     0     6    72    ]  0
117831.461: GenCollectForAllocation          [     571          0
4    ]      [     0     0     0     1     0    ]  0
117831.469: GenCollectForAllocation          [     571          0
3    ]      [     0     0     0     0   113    ]  0

From:  Vitaly Davidovich <vitalyd at gmail.com>
Date:  Thursday, May 28, 2015 at 4:20 PM
To:  Jason Goetz <jason.goetz at gmail.com>
Subject:  Re: JVM taking a few seconds to reach a safepoint for routine
young gen GC

Jason,

Not sure if you meant to reply just to me, but you did :)

So I suspect the RUNNABLE you list is what jstack gives you, which is
slightly a lie since it'll show some threads blocked in native code as
RUNNABLE.

The fact that you're on a VM is biasing me towards looking at that angle.
If there's a spike in runnable (from kernel scheduler standpoint) threads
and/or contention for resources, and it's driven by hypervisor, I wonder if
there're any artifacts in that.  I don't have much experience running
servers on VMs (only bare metal), so hard to say.  You may want to reply to
the list again and see if anyone else has more insight into this type of
setup.

Also, Poonam asked for safepoint statistics for the vm ops that were
requested -- do you have those?

On Thu, May 28, 2015 at 4:20 PM, Jason Goetz <jason.goetz at gmail.com> wrote:
> I?m happy to answer whatever I can. Thanks for taking the time to help. It?s
> running on a VM, not bare metal. The exact OS is Windows Server 2008. The
> database is running on another machine. There is a very large Lucene index on
> the same machine as the application and commits to this index are frequent and
> often contended.
> 
> From the thread dumps I took during these pauses (there are several that
> happen around minor GCs during these 20-minute periods) I can see the
> following stats:
> 
> Dump 1:
> Threads: 147
> RUNNABLE: 42
> WAITING: 30
> TIMED_WAITING: 75
> BLOCKED: 0
> 
> Dump 2:
>  Threads: 259
> RUNNABLE: 143
> WAITING: 47
> TIMED_WAITING: 62
> BLOCKED: 7
> 
> The only reason I believe the thread count is higher than usual on the second
> dump is that the dump follows a very long pause (69 seconds, all spent in sync
> time stopping threads for a safepoint) so I think there were several web
> requests that gathered up during this pause and needed to be served.
> 
> As far as Unsafe operations, the only thing I see in thread dumps when I grep
> for Unsafe is Unsafe.park operations in threads that are TIMED_WAITING.
> 
> As far as memory allocation, I do have some good profiling of that from the
> flight recordings that are taken and have a listing of allocations by thread.
> I haven?t been able to see any abnormal allocations happening during the time
> of the pauses, and the total amount of memory being allocated is no different
> during these pauses. In fact, the amount of memory getting allocated (inside
> and outside TLABs) is less during these pauses as I imagine the time that
> threads are waiting for a safepoint are taking time away from running code
> that allocates memory.
> 
> From:  Vitaly Davidovich <vitalyd at gmail.com>
> Date:  Thursday, May 28, 2015 at 12:06 PM
> To:  Jason Goetz <jason.goetz at gmail.com>
> 
> Subject:  Re: JVM taking a few seconds to reach a safepoint for routine young
> gen GC
> 
> Thanks Jason.  Is this bare metal Windows or virtualized?  Of the 140-200
> active, how many are runnable at the time of the stalls?
> 
> Do you (or any used libs that you know of) use Unsafe for big memcpy style
> operations?
> 
> When these spikes occur, how many runnable procs are there on the machine? Is
> there scheduling contention perhaps (with Tomcat?)?
> 
> As for JNI, typically, java threads in JNI won't stall threads from sync'ing
> on a safepoint.
> 
> Sorry for the spanish inquisition, but may help us figure this out or at least
> get a lead.
> 
> On Thu, May 28, 2015 at 2:45 PM, Jason Goetz <jason.goetz at gmail.com> wrote:
>> Vitaly,
>> 
>> We?ve seen 140-200 active threads during the time of the stalls but that?s no
>> different than any other time period. There are 12 CPUs available on the JVM
>> and there is 24G in the heap, 64G on the machine.  This is the only JVM
>> running on the machine, which runs on a Windows server, and Tomcat is the
>> only application of note other than a few monitoring tools (Zabbix, HP Open
>> View, VMWare Tools), which I haven?t had the option of turning off).
>> 
>> I?m not sure that JNI is running. We don?t explicitly have any JNI calls
>> running, but I?m not sure about whether any of the 3rd-party libraries we use
>> have JNI code that I?m unaware of. I haven?t been able to figure out how to
>> identify if JNI calls are even running. We have taken several Java Flight
>> Recordings around these every-20-minute pauses, but haven?t seen any patterns
>> or unusual spikes in disk I/O, thread contention, or any thread activity.
>> There is no swapping at all either.
>> 
>> Any other information that I could provide in order to give a clearer picture
>> of the system?
>> 
>> Thanks,
>> Jason
>>  
>> From:  Vitaly Davidovich <vitalyd at gmail.com>
>> Date:  Thursday, May 28, 2015 at 11:17 AM
>> To:  Jason Goetz <jason.goetz at gmail.com>
>> Cc:  hotspot-gc-use <hotspot-gc-use at openjdk.java.net>
>> Subject:  Re: JVM taking a few seconds to reach a safepoint for routine young
>> gen GC
>> 
>> 
>> Jason,
>> 
>> How many java threads are active when these stalls happen? How many CPUs are
>> available to the jvm? How much physical memory on the machine? Is your jvm
>> sole occupant of the machine or do you have noisy neighbors? You mentioned
>> JNI - do you have a lot of JNI calls around these times? Do you allocate
>> and/or write to large arrays/memory regions? Is there something
>> different/interesting about these 20 min periods (e.g. workload increases,
>> same time of day, more disk activity, any paging/swap activity, etc).
>> 
>> sent from my phone
>> 
>> On May 28, 2015 1:58 PM, "Jason Goetz" <jason.goetz at gmail.com> wrote:
>>> We're consistently seeing a situation where threads take a few seconds to
>>> stop for a routine GC. For 20 straight minutes the GC will run right away
>>> (it runs about every second). But then, during a 20-minute period, the
>>> threads will take longer to stop for GC. See the GC output below.
>>> 
>>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application
>>> threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908
>>> seconds
>>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application
>>> threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106
>>> seconds
>>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370
>>> secs]
>>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark
>>> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs],
>>> 0.0709271 secs]
>>>  [Times: user=0.73 sys=0.00, real=0.08 secs]
>>> 2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application
>>> threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032
>>> seconds
>>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G),
>>> 0.0451098 secs]
>>>  [Times: user=0.61 sys=0.00, real=0.05 secs]
>>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application
>>> threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950
>>> seconds
>>> 
>>> Turning on safepoint logging reveals that these stopping threads times are
>>> taken up by safepoint ?sync? time. Taking thread dumps every second around
>>> these pauses fail to show anything of note happening during this time, but
>>> it?s my understanding that native code won?t necessarily show up in thread
>>> dumps anyway given that they exit before the JVM reaches a safepoint.
>>> 
>>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second
>>> pause seen above. I highly suspected JNI but was surprise that I didn?t see
>>> any logging about JNI Weak References after turning that option on. Any
>>> ideas for what I can try next? We?re using JDK 7u80. Here are the rest of my
>>> JVM settings:
>>> 
>>> DisableExplicitGC true
>>> FlightRecorder true
>>> GCLogFileSize 52428800
>>> ManagementServer true
>>> MinHeapSize 25769803776
>>> MaxHeapSize 25769803776
>>> MaxPermSize 536870912
>>> NumberOfGCLogFiles 10
>>> PrintAdaptiveSizePolicy true
>>> PrintGC true
>>> PrintGCApplicationStoppedTime true
>>> PrintGCCause true
>>> PrintGCDateStamps true
>>> PrintGCDetails true
>>> PrintGCTimeStamps true
>>> PrintSafepointStatistics true
>>> PrintSafepointStatisticsCount 1
>>> PrintTenuringDistribution true
>>> ReservedCodeCacheSize 268435456
>>> SafepointTimeout true
>>> SafepointTimeoutDelay 4000
>>> ThreadStackSize 4096
>>> UnlockCommercialFeatures true
>>> UseBiasedLocking false
>>> UseGCLogFileRotation false
>>> UseG1GC true
>>> PrintJNIGCStalls true
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>> 
> 


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150529/8d6f75d6/attachment.html>

From ysr1729 at gmail.com  Sat May 30 02:14:10 2015
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Fri, 29 May 2015 19:14:10 -0700
Subject: JVM taking a few seconds to reach a safepoint for routine young
	gen GC
In-Reply-To: <D18DE1C3.42B5A%jason.goetz@gmail.com>
References: <D18CA553.42A6D%jason.goetz@gmail.com>
	<CAHjP37FNCvV2pGkeTDNTX4iS6BByot01K0g4-xz_edseL9wKAg@mail.gmail.com>
	<D18CACB7.42A79%jason.goetz@gmail.com>
	<CAHjP37FmU_tKUkdTVzapR-ovSb66OgOxSO6KLs6DtZG=nm_MtA@mail.gmail.com>
	<D18CBCFB.42AA8%jason.goetz@gmail.com>
	<CAHjP37E+Zo5nMbC5TgoQN=sBKmgdhFx9_2Zk88nTjKREO6J1fg@mail.gmail.com>
	<D18DE1C3.42B5A%jason.goetz@gmail.com>
Message-ID: <DCCF0E38-0E45-44F1-95AF-7DC1925B3ED5@gmail.com>

Hi Jason --

You mentioned a lucene indexer on the same box. Can you check for correlation between the indexing activity, paging behavior and the incidence of the long safe points?

-- ramki

ysr1729

> On May 29, 2015, at 09:51, Jason Goetz <jason.goetz at gmail.com> wrote:
> 
> Oops, I did not intend to remove this from the list. Re-added.
> 
> I?ll take a look at how many RUNNABLE threads are actually blocked in native code. I?ll also look at VSphere to see if I can see anything unusual around resource contention.
> 
> I?ve grepped the safepoint logs for GenCollectForAllocation, which, as I mentioned before, happen about every second, but only show this long sync times during the mysterious 20-minute period. I?ve taken an excerpt from one of these 20-minute pauses. You can see that for most GC the only time is in vmop and sync time is 0, but during these pauses the ?sync? time takes up the majority of the time.
> 
> [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
> 
> 116546.586: GenCollectForAllocation          [     146          5              7    ]      [     0     0     0     1    50    ]  0
> 116546.891: GenCollectForAllocation          [     146          2              7    ]      [     0     0     0     2    50    ]  0
> 116547.969: GenCollectForAllocation          [     145          0              2    ]      [     0     0     0     2   290    ]  0
> 116549.500: GenCollectForAllocation          [     145          0              0    ]      [     0     0     0     2    67    ]  0
> 116550.836: GenCollectForAllocation          [     142          0              1    ]      [     0     0     0     1    82    ]  0
> 116553.398: GenCollectForAllocation          [     142          0              2    ]      [     0     0     0     2    76    ]  0
> 116555.109: GenCollectForAllocation          [     142          0              0    ]      [     0     0     0     2    84    ]  0
> 116557.328: GenCollectForAllocation          [     142          0              0    ]      [     0     0     0     2    64    ]  0
> 116561.992: GenCollectForAllocation          [     143          2              1    ]      [     0     0   523     2    76    ]  1
> 116567.367: GenCollectForAllocation          [     143          1              0    ]      [     1     0    39     2   104    ]  0
> 116572.438: GenCollectForAllocation          [     143          4              3    ]      [     0     0     0     2    85    ]  0
> 116575.977: GenCollectForAllocation          [     144         76              1    ]      [    24     0154039     9   181    ]  0
> 116731.336: GenCollectForAllocation          [     353         41              5    ]      [     5     0     5     1   101    ]  0
> 116732.328: GenCollectForAllocation          [     354          5             16    ]      [  2080     0  2115     1     0    ]  1
> 116736.430: GenCollectForAllocation          [     354          5              9    ]      [     0     0     0     1    81    ]  2
> 116736.891: GenCollectForAllocation          [     354          0              4    ]      [     0     0     0     4    88    ]  0
> 116737.305: GenCollectForAllocation          [     354          2              9    ]      [     0     0     0     1    80    ]  0
> 116737.664: GenCollectForAllocation          [     354          1              8    ]      [     0     0     0     2    65    ]  0
> 116738.055: GenCollectForAllocation          [     355          1              8    ]      [     0     0     0     1   106    ]  0
> 116738.797: GenCollectForAllocation          [     354          0              5    ]      [     0     0  2116     2   125    ]  0
> 116741.523: GenCollectForAllocation          [     353          1              0    ]      [     5     0   502     1   195    ]  0
> 116743.219: GenCollectForAllocation          [     352          1              5    ]      [     0     0     0     1     0    ]  0
> 116743.719: GenCollectForAllocation          [     352          1              7    ]      [     0     0     0     1    67    ]  0
> 116744.266: GenCollectForAllocation          [     352        271              0    ]      [    28     0764563     4     0    ]  0
> 117509.914: GenCollectForAllocation          [     347          1              2    ]      [     0     0     0     2   166    ]  0
> 117510.609: GenCollectForAllocation          [     456         84              9    ]      [     8     0     8     2   103    ]  1
> 117511.305: GenCollectForAllocation          [     479          0              6    ]      [     0     0     0     7   199    ]  0
> 117512.086: GenCollectForAllocation          [     480          0              2    ]      [     0     0     0     2   192    ]  0
> 117829.000: GenCollectForAllocation          [     569          0              3    ]      [     0     0     0     2     0    ]  0
> 117829.000: GenCollectForAllocation          [     569          2              5    ]      [     0     0     0     0   128    ]  0
> 117829.523: GenCollectForAllocation          [     569          0              6    ]      [     0     0     0     2    84    ]  0
> 117830.039: GenCollectForAllocation          [     571          0              5    ]      [     0     0     0     2     0    ]  0
> 117830.781: GenCollectForAllocation          [     571          0              6    ]      [     0     0     0     6    72    ]  0
> 117831.461: GenCollectForAllocation          [     571          0              4    ]      [     0     0     0     1     0    ]  0
> 117831.469: GenCollectForAllocation          [     571          0              3    ]      [     0     0     0     0   113    ]  0
> 
> From: Vitaly Davidovich <vitalyd at gmail.com>
> Date: Thursday, May 28, 2015 at 4:20 PM
> To: Jason Goetz <jason.goetz at gmail.com>
> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC
> 
> Jason,
> 
> Not sure if you meant to reply just to me, but you did :)
> 
> So I suspect the RUNNABLE you list is what jstack gives you, which is slightly a lie since it'll show some threads blocked in native code as RUNNABLE.
> 
> The fact that you're on a VM is biasing me towards looking at that angle.  If there's a spike in runnable (from kernel scheduler standpoint) threads and/or contention for resources, and it's driven by hypervisor, I wonder if there're any artifacts in that.  I don't have much experience running servers on VMs (only bare metal), so hard to say.  You may want to reply to the list again and see if anyone else has more insight into this type of setup.
> 
> Also, Poonam asked for safepoint statistics for the vm ops that were requested -- do you have those?
> 
>> On Thu, May 28, 2015 at 4:20 PM, Jason Goetz <jason.goetz at gmail.com> wrote:
>> I?m happy to answer whatever I can. Thanks for taking the time to help. It?s running on a VM, not bare metal. The exact OS is Windows Server 2008. The database is running on another machine. There is a very large Lucene index on the same machine as the application and commits to this index are frequent and often contended.
>> 
>> From the thread dumps I took during these pauses (there are several that happen around minor GCs during these 20-minute periods) I can see the following stats:
>> 
>> Dump 1:
>> 	Threads: 147
>> 	RUNNABLE: 42
>> 	WAITING: 30
>> 	TIMED_WAITING: 75
>> 	BLOCKED: 0
>> 
>> Dump 2:
>> 	 Threads: 259
>> 	RUNNABLE: 143
>> 	WAITING: 47
>> 	TIMED_WAITING: 62
>> 	BLOCKED: 7
>> 
>> The only reason I believe the thread count is higher than usual on the second dump is that the dump follows a very long pause (69 seconds, all spent in sync time stopping threads for a safepoint) so I think there were several web requests that gathered up during this pause and needed to be served. 
>> 
>> As far as Unsafe operations, the only thing I see in thread dumps when I grep for Unsafe is Unsafe.park operations in threads that are TIMED_WAITING. 
>> 
>> As far as memory allocation, I do have some good profiling of that from the flight recordings that are taken and have a listing of allocations by thread. I haven?t been able to see any abnormal allocations happening during the time of the pauses, and the total amount of memory being allocated is no different during these pauses. In fact, the amount of memory getting allocated (inside and outside TLABs) is less during these pauses as I imagine the time that threads are waiting for a safepoint are taking time away from running code that allocates memory.
>> 
>> From: Vitaly Davidovich <vitalyd at gmail.com>
>> Date: Thursday, May 28, 2015 at 12:06 PM
>> To: Jason Goetz <jason.goetz at gmail.com>
>> 
>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC
>> 
>> Thanks Jason.  Is this bare metal Windows or virtualized?  Of the 140-200 active, how many are runnable at the time of the stalls?
>> 
>> Do you (or any used libs that you know of) use Unsafe for big memcpy style operations?
>> 
>> When these spikes occur, how many runnable procs are there on the machine? Is there scheduling contention perhaps (with Tomcat?)?
>> 
>> As for JNI, typically, java threads in JNI won't stall threads from sync'ing on a safepoint.
>> 
>> Sorry for the spanish inquisition, but may help us figure this out or at least get a lead.
>> 
>>> On Thu, May 28, 2015 at 2:45 PM, Jason Goetz <jason.goetz at gmail.com> wrote:
>>> Vitaly,
>>> 
>>> We?ve seen 140-200 active threads during the time of the stalls but that?s no different than any other time period. There are 12 CPUs available on the JVM and there is 24G in the heap, 64G on the machine.  This is the only JVM running on the machine, which runs on a Windows server, and Tomcat is the only application of note other than a few monitoring tools (Zabbix, HP Open View, VMWare Tools), which I haven?t had the option of turning off).
>>> 
>>> I?m not sure that JNI is running. We don?t explicitly have any JNI calls running, but I?m not sure about whether any of the 3rd-party libraries we use have JNI code that I?m unaware of. I haven?t been able to figure out how to identify if JNI calls are even running. We have taken several Java Flight Recordings around these every-20-minute pauses, but haven?t seen any patterns or unusual spikes in disk I/O, thread contention, or any thread activity. There is no swapping at all either.
>>> 
>>> Any other information that I could provide in order to give a clearer picture of the system?
>>> 
>>> Thanks,
>>> Jason
>>>  
>>> From: Vitaly Davidovich <vitalyd at gmail.com>
>>> Date: Thursday, May 28, 2015 at 11:17 AM
>>> To: Jason Goetz <jason.goetz at gmail.com>
>>> Cc: hotspot-gc-use <hotspot-gc-use at openjdk.java.net>
>>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC
>>> 
>>> Jason,
>>> 
>>> How many java threads are active when these stalls happen? How many CPUs are available to the jvm? How much physical memory on the machine? Is your jvm sole occupant of the machine or do you have noisy neighbors? You mentioned JNI - do you have a lot of JNI calls around these times? Do you allocate and/or write to large arrays/memory regions? Is there something different/interesting about these 20 min periods (e.g. workload increases, same time of day, more disk activity, any paging/swap activity, etc).
>>> 
>>> sent from my phone
>>> 
>>>> On May 28, 2015 1:58 PM, "Jason Goetz" <jason.goetz at gmail.com> wrote:
>>>> We're consistently seeing a situation where threads take a few seconds to stop for a routine GC. For 20 straight minutes the GC will run right away (it runs about every second). But then, during a 20-minute period, the threads will take longer to stop for GC. See the GC output below.
>>>> 
>>>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 seconds
>>>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 seconds
>>>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370 secs]
>>>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], 0.0709271 secs]
>>>>  [Times: user=0.73 sys=0.00, real=0.08 secs] 
>>>> 2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 seconds
>>>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), 0.0451098 secs]
>>>>  [Times: user=0.61 sys=0.00, real=0.05 secs] 
>>>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 seconds
>>>> 
>>>> Turning on safepoint logging reveals that these stopping threads times are taken up by safepoint ?sync? time. Taking thread dumps every second around these pauses fail to show anything of note happening during this time, but it?s my understanding that native code won?t necessarily show up in thread dumps anyway given that they exit before the JVM reaches a safepoint.
>>>> 
>>>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second pause seen above. I highly suspected JNI but was surprise that I didn?t see any logging about JNI Weak References after turning that option on. Any ideas for what I can try next? We?re using JDK 7u80. Here are the rest of my JVM settings:
>>>> 
>>>> DisableExplicitGC					true
>>>> FlightRecorder					true
>>>> GCLogFileSize					52428800
>>>> ManagementServer				true
>>>> MinHeapSize						25769803776
>>>> MaxHeapSize					25769803776
>>>> MaxPermSize					536870912
>>>> NumberOfGCLogFiles				10
>>>> PrintAdaptiveSizePolicy			true
>>>> PrintGC							true
>>>> PrintGCApplicationStoppedTime	true
>>>> PrintGCCause					true
>>>> PrintGCDateStamps				true
>>>> PrintGCDetails					true
>>>> PrintGCTimeStamps				true
>>>> PrintSafepointStatistics			true
>>>> PrintSafepointStatisticsCount		1
>>>> PrintTenuringDistribution			true
>>>> ReservedCodeCacheSize			268435456
>>>> SafepointTimeout				true
>>>> SafepointTimeoutDelay			4000
>>>> ThreadStackSize					4096
>>>> UnlockCommercialFeatures		true
>>>> UseBiasedLocking				false
>>>> UseGCLogFileRotation			false
>>>> UseG1GC						true
>>>> PrintJNIGCStalls					true
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> 
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150529/761f785a/attachment-0001.html>

From gustav.r.akesson at gmail.com  Sat May 30 17:05:08 2015
From: gustav.r.akesson at gmail.com (=?UTF-8?Q?Gustav_=C3=85kesson?=)
Date: Sat, 30 May 2015 19:05:08 +0200
Subject: JVM taking a few seconds to reach a safepoint for routine young
	gen GC
In-Reply-To: <DCCF0E38-0E45-44F1-95AF-7DC1925B3ED5@gmail.com>
References: <D18CA553.42A6D%jason.goetz@gmail.com>
	<CAHjP37FNCvV2pGkeTDNTX4iS6BByot01K0g4-xz_edseL9wKAg@mail.gmail.com>
	<D18CACB7.42A79%jason.goetz@gmail.com>
	<CAHjP37FmU_tKUkdTVzapR-ovSb66OgOxSO6KLs6DtZG=nm_MtA@mail.gmail.com>
	<D18CBCFB.42AA8%jason.goetz@gmail.com>
	<CAHjP37E+Zo5nMbC5TgoQN=sBKmgdhFx9_2Zk88nTjKREO6J1fg@mail.gmail.com>
	<D18DE1C3.42B5A%jason.goetz@gmail.com>
	<DCCF0E38-0E45-44F1-95AF-7DC1925B3ED5@gmail.com>
Message-ID: <CAKEw5+6_oiR+25BNNh8Y1v-DAaHCRUah91cULCOatYtx3Rt8wA@mail.gmail.com>

Hi,

I thought that threads (blocked) in native was not a problem for
safe-pointing? Meaning that these threads could continue to execute/block
during an attempt to reach safe point, but denied to return back to
Java-land until safe-ponting is over.

Best Regards,
Gustav ?kesson
Den 30 maj 2015 04:14 skrev "Srinivas Ramakrishna" <ysr1729 at gmail.com>:

> Hi Jason --
>
> You mentioned a lucene indexer on the same box. Can you check for
> correlation between the indexing activity, paging behavior and the
> incidence of the long safe points?
>
> -- ramki
>
> ysr1729
>
> On May 29, 2015, at 09:51, Jason Goetz <jason.goetz at gmail.com> wrote:
>
> Oops, I did not intend to remove this from the list. Re-added.
>
> I?ll take a look at how many RUNNABLE threads are actually blocked in
> native code. I?ll also look at VSphere to see if I can see anything unusual
> around resource contention.
>
> I?ve grepped the safepoint logs for GenCollectForAllocation, which, as I
> mentioned before, happen about every second, but only show this long sync
> times during the mysterious 20-minute period. I?ve taken an excerpt from
> one of these 20-minute pauses. You can see that for most GC the only time
> is in vmop and sync time is 0, but during these pauses the ?sync? time
> takes up the majority of the time.
>
> [threads: total initially_running wait_to_block]    [time: spin block sync
> cleanup vmop] page_trap_count
>
> 116546.586: GenCollectForAllocation          [     146          5
>      7    ]      [     0     0     0     1    50    ]  0
> 116546.891: GenCollectForAllocation          [     146          2
>      7    ]      [     0     0     0     2    50    ]  0
> 116547.969: GenCollectForAllocation          [     145          0
>      2    ]      [     0     0     0     2   290    ]  0
> 116549.500: GenCollectForAllocation          [     145          0
>      0    ]      [     0     0     0     2    67    ]  0
> 116550.836: GenCollectForAllocation          [     142          0
>      1    ]      [     0     0     0     1    82    ]  0
> 116553.398: GenCollectForAllocation          [     142          0
>      2    ]      [     0     0     0     2    76    ]  0
> 116555.109: GenCollectForAllocation          [     142          0
>      0    ]      [     0     0     0     2    84    ]  0
> 116557.328: GenCollectForAllocation          [     142          0
>      0    ]      [     0     0     0     2    64    ]  0
> 116561.992: GenCollectForAllocation          [     143          2
>      1    ]      [     0     0   523     2    76    ]  1
> 116567.367: GenCollectForAllocation          [     143          1
>      0    ]      [     1     0    39     2   104    ]  0
> 116572.438: GenCollectForAllocation          [     143          4
>      3    ]      [     0     0     0     2    85    ]  0
> 116575.977: GenCollectForAllocation          [     144         76
>      1    ]      [    24     0154039     9   181    ]  0
> 116731.336: GenCollectForAllocation          [     353         41
>      5    ]      [     5     0     5     1   101    ]  0
> 116732.328: GenCollectForAllocation          [     354          5
>     16    ]      [  2080     0  2115     1     0    ]  1
> 116736.430: GenCollectForAllocation          [     354          5
>      9    ]      [     0     0     0     1    81    ]  2
> 116736.891: GenCollectForAllocation          [     354          0
>      4    ]      [     0     0     0     4    88    ]  0
> 116737.305: GenCollectForAllocation          [     354          2
>      9    ]      [     0     0     0     1    80    ]  0
> 116737.664: GenCollectForAllocation          [     354          1
>      8    ]      [     0     0     0     2    65    ]  0
> 116738.055: GenCollectForAllocation          [     355          1
>      8    ]      [     0     0     0     1   106    ]  0
> 116738.797: GenCollectForAllocation          [     354          0
>      5    ]      [     0     0  2116     2   125    ]  0
> 116741.523: GenCollectForAllocation          [     353          1
>      0    ]      [     5     0   502     1   195    ]  0
> 116743.219: GenCollectForAllocation          [     352          1
>      5    ]      [     0     0     0     1     0    ]  0
> 116743.719: GenCollectForAllocation          [     352          1
>      7    ]      [     0     0     0     1    67    ]  0
> 116744.266: GenCollectForAllocation          [     352        271
>      0    ]      [    28     0764563     4     0    ]  0
> 117509.914: GenCollectForAllocation          [     347          1
>      2    ]      [     0     0     0     2   166    ]  0
> 117510.609: GenCollectForAllocation          [     456         84
>      9    ]      [     8     0     8     2   103    ]  1
> 117511.305: GenCollectForAllocation          [     479          0
>      6    ]      [     0     0     0     7   199    ]  0
> 117512.086: GenCollectForAllocation          [     480          0
>      2    ]      [     0     0     0     2   192    ]  0
> 117829.000: GenCollectForAllocation          [     569          0
>      3    ]      [     0     0     0     2     0    ]  0
> 117829.000: GenCollectForAllocation          [     569          2
>      5    ]      [     0     0     0     0   128    ]  0
> 117829.523: GenCollectForAllocation          [     569          0
>      6    ]      [     0     0     0     2    84    ]  0
> 117830.039: GenCollectForAllocation          [     571          0
>      5    ]      [     0     0     0     2     0    ]  0
> 117830.781: GenCollectForAllocation          [     571          0
>      6    ]      [     0     0     0     6    72    ]  0
> 117831.461: GenCollectForAllocation          [     571          0
>      4    ]      [     0     0     0     1     0    ]  0
> 117831.469: GenCollectForAllocation          [     571          0
>      3    ]      [     0     0     0     0   113    ]  0
>
> From: Vitaly Davidovich <vitalyd at gmail.com>
> Date: Thursday, May 28, 2015 at 4:20 PM
> To: Jason Goetz <jason.goetz at gmail.com>
> Subject: Re: JVM taking a few seconds to reach a safepoint for routine
> young gen GC
>
> Jason,
>
> Not sure if you meant to reply just to me, but you did :)
>
> So I suspect the RUNNABLE you list is what jstack gives you, which is
> slightly a lie since it'll show some threads blocked in native code as
> RUNNABLE.
>
> The fact that you're on a VM is biasing me towards looking at that angle.
> If there's a spike in runnable (from kernel scheduler standpoint) threads
> and/or contention for resources, and it's driven by hypervisor, I wonder if
> there're any artifacts in that.  I don't have much experience running
> servers on VMs (only bare metal), so hard to say.  You may want to reply to
> the list again and see if anyone else has more insight into this type of
> setup.
>
> Also, Poonam asked for safepoint statistics for the vm ops that were
> requested -- do you have those?
>
> On Thu, May 28, 2015 at 4:20 PM, Jason Goetz <jason.goetz at gmail.com>
> wrote:
>
>> I?m happy to answer whatever I can. Thanks for taking the time to help.
>> It?s running on a VM, not bare metal. The exact OS is Windows Server 2008.
>> The database is running on another machine. There is a very large Lucene
>> index on the same machine as the application and commits to this index are
>> frequent and often contended.
>>
>> From the thread dumps I took during these pauses (there are several that
>> happen around minor GCs during these 20-minute periods) I can see the
>> following stats:
>>
>> Dump 1:
>> Threads: 147
>> RUNNABLE: 42
>> WAITING: 30
>> TIMED_WAITING: 75
>> BLOCKED: 0
>>
>> Dump 2:
>>  Threads: 259
>> RUNNABLE: 143
>> WAITING: 47
>> TIMED_WAITING: 62
>> BLOCKED: 7
>>
>> The only reason I believe the thread count is higher than usual on the
>> second dump is that the dump follows a very long pause (69 seconds, all
>> spent in sync time stopping threads for a safepoint) so I think there were
>> several web requests that gathered up during this pause and needed to be
>> served.
>>
>> As far as Unsafe operations, the only thing I see in thread dumps when I
>> grep for Unsafe is Unsafe.park operations in threads that are
>> TIMED_WAITING.
>>
>> As far as memory allocation, I do have some good profiling of that from
>> the flight recordings that are taken and have a listing of allocations by
>> thread. I haven?t been able to see any abnormal allocations happening
>> during the time of the pauses, and the total amount of memory being
>> allocated is no different during these pauses. In fact, the amount of
>> memory getting allocated (inside and outside TLABs) is less during these
>> pauses as I imagine the time that threads are waiting for a safepoint are
>> taking time away from running code that allocates memory.
>>
>> From: Vitaly Davidovich <vitalyd at gmail.com>
>> Date: Thursday, May 28, 2015 at 12:06 PM
>> To: Jason Goetz <jason.goetz at gmail.com>
>>
>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine
>> young gen GC
>>
>> Thanks Jason.  Is this bare metal Windows or virtualized?  Of the 140-200
>> active, how many are runnable at the time of the stalls?
>>
>> Do you (or any used libs that you know of) use Unsafe for big memcpy
>> style operations?
>>
>> When these spikes occur, how many runnable procs are there on the
>> machine? Is there scheduling contention perhaps (with Tomcat?)?
>>
>> As for JNI, typically, java threads in JNI won't stall threads from
>> sync'ing on a safepoint.
>>
>> Sorry for the spanish inquisition, but may help us figure this out or at
>> least get a lead.
>>
>> On Thu, May 28, 2015 at 2:45 PM, Jason Goetz <jason.goetz at gmail.com>
>> wrote:
>>
>>> Vitaly,
>>>
>>> We?ve seen 140-200 active threads during the time of the stalls but
>>> that?s no different than any other time period. There are 12 CPUs available
>>> on the JVM and there is 24G in the heap, 64G on the machine.  This is the
>>> only JVM running on the machine, which runs on a Windows server, and Tomcat
>>> is the only application of note other than a few monitoring tools (Zabbix,
>>> HP Open View, VMWare Tools), which I haven?t had the option of turning off).
>>>
>>> I?m not sure that JNI is running. We don?t explicitly have any JNI calls
>>> running, but I?m not sure about whether any of the 3rd-party libraries we
>>> use have JNI code that I?m unaware of. I haven?t been able to figure out
>>> how to identify if JNI calls are even running. We have taken several Java
>>> Flight Recordings around these every-20-minute pauses, but haven?t seen any
>>> patterns or unusual spikes in disk I/O, thread contention, or any thread
>>> activity. There is no swapping at all either.
>>>
>>> Any other information that I could provide in order to give a clearer
>>> picture of the system?
>>>
>>> Thanks,
>>> Jason
>>>
>>> From: Vitaly Davidovich <vitalyd at gmail.com>
>>> Date: Thursday, May 28, 2015 at 11:17 AM
>>> To: Jason Goetz <jason.goetz at gmail.com>
>>> Cc: hotspot-gc-use <hotspot-gc-use at openjdk.java.net>
>>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine
>>> young gen GC
>>>
>>> Jason,
>>>
>>> How many java threads are active when these stalls happen? How many CPUs
>>> are available to the jvm? How much physical memory on the machine? Is your
>>> jvm sole occupant of the machine or do you have noisy neighbors? You
>>> mentioned JNI - do you have a lot of JNI calls around these times? Do you
>>> allocate and/or write to large arrays/memory regions? Is there something
>>> different/interesting about these 20 min periods (e.g. workload increases,
>>> same time of day, more disk activity, any paging/swap activity, etc).
>>>
>>> sent from my phone
>>> On May 28, 2015 1:58 PM, "Jason Goetz" <jason.goetz at gmail.com> wrote:
>>>
>>>> We're consistently seeing a situation where threads take a few seconds
>>>> to stop for a routine GC. For 20 straight minutes the GC will run right
>>>> away (it runs about every second). But then, during a 20-minute period, the
>>>> threads will take longer to stop for GC. See the GC output below.
>>>>
>>>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which
>>>> application threads were stopped: 0.1121233 seconds, Stopping threads took:
>>>> 0.0000908 seconds
>>>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which
>>>> application threads were stopped: 0.0019384 seconds, Stopping threads took:
>>>> 0.0001106 seconds
>>>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end,
>>>> 28.4067370 secs]
>>>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark
>>>> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs],
>>>> 0.0709271 secs]
>>>>  [Times: user=0.73 sys=0.00, real=0.08 secs]
>>>> *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which
>>>> application threads were stopped: 3.2916224 seconds, Stopping threads took:
>>>> 3.2188032 seconds*
>>>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G),
>>>> 0.0451098 secs]
>>>>  [Times: user=0.61 sys=0.00, real=0.05 secs]
>>>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which
>>>> application threads were stopped: 0.0459803 seconds, Stopping threads took:
>>>> 0.0001950 seconds
>>>>
>>>> Turning on safepoint logging reveals that these stopping threads times
>>>> are taken up by safepoint ?sync? time. Taking thread dumps every second
>>>> around these pauses fail to show anything of note happening during this
>>>> time, but it?s my understanding that native code won?t necessarily show up
>>>> in thread dumps anyway given that they exit before the JVM reaches a
>>>> safepoint.
>>>>
>>>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second
>>>> pause seen above. I highly suspected JNI but was surprise that I didn?t see
>>>> any logging about JNI Weak References after turning that option on. Any
>>>> ideas for what I can try next? We?re using JDK 7u80. Here are the rest of
>>>> my JVM settings:
>>>>
>>>> DisableExplicitGC true
>>>> FlightRecorder true
>>>> GCLogFileSize 52428800
>>>> ManagementServer true
>>>> MinHeapSize 25769803776
>>>> MaxHeapSize 25769803776
>>>> MaxPermSize 536870912
>>>> NumberOfGCLogFiles 10
>>>> PrintAdaptiveSizePolicy true
>>>> PrintGC true
>>>> PrintGCApplicationStoppedTime true
>>>> PrintGCCause true
>>>> PrintGCDateStamps true
>>>> PrintGCDetails true
>>>> PrintGCTimeStamps true
>>>> PrintSafepointStatistics true
>>>> PrintSafepointStatisticsCount 1
>>>> PrintTenuringDistribution true
>>>> ReservedCodeCacheSize 268435456
>>>> SafepointTimeout true
>>>> SafepointTimeoutDelay 4000
>>>> ThreadStackSize 4096
>>>> UnlockCommercialFeatures true
>>>> UseBiasedLocking false
>>>> UseGCLogFileRotation false
>>>> UseG1GC true
>>>> PrintJNIGCStalls true
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>
>>>>
>>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20150530/42ce1712/attachment.html>