A Bug Of Metaspace After Full GC

Krystal Mok rednaxelafx at gmail.com
Tue Sep 25 20:08:02 UTC 2018


Hi JC,

Just a quick comment on:

So I looked at the other ones as well and as a summary:
  - The bug seems to show up for Serial, CMS
  - The bug does not seem to show up for Parallel, G1

That's because Serial (DefNew) / Serial Old (Tenured / GenMarkSweep) /
ParNew / CMS use the "generational GC framework" that's been there for
years. That's the "GenCollectedHeap" touched by Nijiaben's propsed patch.
It was there with the idea that different GC implementations can plug into
the same framework and supposedly they could mix-and-match. But later on
only a few possible configurations were really tested well and supported.

Parallel (ParallelScavenge) / Parallel Old and G1 use their own
implementation of the CollectedHeap interface and doesn't participate in
that "generational GC framework". So they don't have the same bug at the
same place.

- Kris

On Tue, Sep 25, 2018 at 12:57 PM, JC Beyler <jcbeyler at google.com> wrote:

> Hi Nijiaben,
>
> I'm not a GC engineer but thought I would look at this by curiosity (for
> note, I also created https://bugs.openjdk.java.net/browse/JDK-8211123 for
> tracking purposes). It seems you are using an older JDK because the code
> has changed somewhat there.
>
> I looked at this and used your reproducer on JDK12 tip:
>
> Using CMS (which is being deprecated from what I know and see in the log
> message):
> $ ~/mercurial/jdk12-gc-print/build/linux-x86_64-server-release/images/jdk/bin/java
> -XX:+UseConcMarkSweepGC -Xmx1000M -Xms1000M -Xlog:gc* ClassLoaderTest
> OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated
> in version 9.0 and will likely be removed in a future release.
> ...
> [0.559s][info][gc,metaspace   ] GC(0) Metaspace: 15267K->15267K(1067008K)
>
> (Note: I am using -Xlog:gc* since -XX:+PrintGCDetails is being deprecated
> as well from what the log message is saying)
>
> So the bug is there for CMS, but for G1, it seems to work:
>
> $ ~/mercurial/jdk12-gc-print/build/linux-x86_64-server-release/images/jdk/bin/java
> -Xmx1000M -Xms1000M -Xlog:gc* ClassLoaderTest
> ...
> [0.653s][info][gc,metaspace   ] GC(0) Metaspace: 15266K->5266K(1062912K)
> ...
>
> So I looked at the other ones as well and as a summary:
>   - The bug seems to show up for Serial, CMS
>   - The bug does not seem to show up for Parallel, G1
>
>
> I then updated the code like what you were showing though things have
> changed in tip so my change became a simple one-line move:
>
> diff -r 8f66a57054b7 src/hotspot/share/gc/shared/genCollectedHeap.cpp
> --- a/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25
> 10:30:32 2018 -0700
> +++ b/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25
> 12:50:54 2018 -0700
> @@ -649,7 +649,6 @@
>      complete = complete || collected_old;
>
>      print_heap_change(young_prev_used, old_prev_used);
> -    MetaspaceUtils::print_metaspace_change(metadata_prev_used);
>
>      // Adjust generation sizes.
>      if (collected_old) {
> @@ -665,6 +664,7 @@
>        MetaspaceGC::compute_new_size();
>        update_full_collections_completed();
>      }
> +    MetaspaceUtils::print_metaspace_change(metadata_prev_used);
>
>      // Track memory usage and detect low memory after GC finishes
>      MemoryService::track_memory_usage();
>
> With this, I looked at the various GC results and this fixes:
>   - Serial, CMS; while keeping Parallel and G1 working (seemingly)
>
> Finally, note that all GC logging now are logging the summary at the exit
> and this line shows up:
> [0.560s][info][gc,heap,exit   ]  Metaspace       used 5277K, capacity
> 5310K, committed 15232K, reserved 1062912K
>
> For all GCs, so though the line "Metaspace before -> after" is false, you
> can get the the after with the final line.
>
> If the GC experts on this list want me to make a webrev out of this and
> get reviews for it, I can do that.
>
> Hoping that helps,
> Jc
>
>
>
> On Fri, Sep 21, 2018 at 11:56 PM nijiaben <nijiaben at perfma.com> wrote:
>
>> Hi, all
>>
>> We use metaspace instead of perm since JDK8, I found that after the Full
>> GC, the size of the metaspace has always been the same from the GC log. I
>> have looked at the source code of hotspot roughly. It has not bean fixed in
>> the latest version. The problem which may have some impact on us to
>> troubleshoot some metaspace problems.
>>
>> I wrote a test case, and simply analyzed the hotspot code, and showed the
>> output before and after the modification. I hope this problem can be fixed
>> as soon as possible.
>>
>> Demo:
>>
>> import java.io.File;
>> import java.net.URL;
>> import java.net.URLClassLoader;
>>
>> /**
>>  *
>>  * @author nijiaben
>>  * @version v1.0
>>  * @createTime 2018年09月22日 12:32:32 PM
>>  *
>>  */
>> public class ClassLoaderTest {
>>     private static URL[] urls = new URL[1];
>>
>>     static {
>>         try {
>>             File jarFile = new File("/home/admin/.m2/repository/org/slf4j/slf4j-api/1.7.21/slf4j-api-1.7.21.jar");
>>             urls[0] = jarFile.toURI().toURL();
>>         } catch (Exception e) {
>>             e.printStackTrace();
>>         }
>>     }
>>
>>     public static void main(String args[]) {
>>         for(int i=0;i<1000;i++) {
>>             loadClass();
>>         }
>>         System.gc();
>>     }
>>
>>     public static void loadClass() {
>>         try {
>>             URLClassLoader ucl = new URLClassLoader(urls);
>>             Class.forName("org.slf4j.Logger", false, ucl);
>>         } catch (Exception e) {
>>             e.printStackTrace();
>>         }
>>     }
>> }
>>
>> My demo is very simple, just constantly create a new class loader to load
>> a specific class, after loading, execute a System GC, so that we can see
>> the relevant GC log output, the JVM parameters we run are
>>
>> -Xmx1000M -Xms1000M -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails ClassLoaderTest
>>
>> The GC log is
>>
>> [Full GC (System.gc()) [CMS: 0K->9921K(683264K), 0.1003988 secs] 49083K->9921K(989952K), [Metaspace: 12916K->12916K(1064960K)], 0.1060065 secs] [Times: user=0.06 sys=0.05, real=0.11 secs]
>>
>> Let's look at the source code of hotspot,
>>
>> GenCollectedHeap::do_collection:
>>
>> if (PrintGCDetails) {
>>       print_heap_change(gch_prev_used);
>>
>>       // Print metaspace info for full GC with PrintGCDetails flag.
>>       if (complete) {
>>         MetaspaceAux::print_metaspace_change(metadata_prev_used);
>>       }
>>     }
>>
>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>       // Adjust generation sizes.
>>       _gens[j]->compute_new_size();
>>     }
>>
>>     if (complete) {
>>       // Delete metaspaces for unloaded class loaders and clean up loader_data graph
>>       ClassLoaderDataGraph::purge();
>>       MetaspaceAux::verify_metrics();
>>       // Resize the metaspace capacity after full collections
>>       MetaspaceGC::compute_new_size();
>>       update_full_collections_completed();
>>     }
>>
>> We see that metaspace's gc log output is executed before
>> ClassLoaderDataGraph::purge(); , that means the size of the metaspace is
>> printed without reclaiming the memory of the metaspace, so the GC log seems
>> no change before and after the GC metaspace.
>>
>> After knowing the specific problem, we can make an adjustment to the code
>> to move the logic of the output of metaspace size after to
>> ClassLoaderDataGraph::purge(),just like this:
>>
>>     if (PrintGCDetails) {
>>       print_heap_change(gch_prev_used);
>>     }
>>
>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>       // Adjust generation sizes.
>>       _gens[j]->compute_new_size();
>>     }
>>
>>     if (complete) {
>>       // Delete metaspaces for unloaded class loaders and clean up loader_data graph
>>       ClassLoaderDataGraph::purge();
>>       MetaspaceAux::verify_metrics();
>>       // Resize the metaspace capacity after full collections
>>       MetaspaceGC::compute_new_size();
>>       update_full_collections_completed();
>>         if (PrintGCDetails) {
>>             // Print metaspace info for full GC with PrintGCDetails flag.
>>             MetaspaceAux::print_metaspace_change(metadata_prev_used);
>>         }
>>     }
>>
>> At this point we recompile hotspot, execute the above demo again, you can
>> see the correct output as follows
>>
>> [Full GC (System.gc()) [CMS: 0K->9921K(683264K), 0.0852627 secs] 49083K->9921K(989952K), [Metaspace: 12913K->3280K(1058816K)], 0.0891207 secs] [Times: user=0.05 sys=0.04, real=0.09 secs]
>>
>> I also probably thought about why this problem is caused. This is the
>> version before JDK8, such as JDK7, you can see the same code location:
>>
>> if (PrintGCDetails) {
>>       print_heap_change(gch_prev_used);
>>
>>       // Print perm gen info for full GC with PrintGCDetails flag.
>>       if (complete) {
>>         print_perm_heap_change(perm_prev_used);
>>       }
>>     }
>>
>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>       // Adjust generation sizes.
>>       _gens[j]->compute_new_size();
>>     }
>>
>>     if (complete) {
>>       // Ask the permanent generation to adjust size for full collections
>>       perm()->compute_new_size();
>>       update_full_collections_completed();
>>     }
>>
>> It can be seen that the GC engineer at that time simply replaced
>> print_perm_heap_change(perm_prev_used);with Me
>> taspaceAux::print_metaspace_change(metadata_prev_used);but forgot the
>> difference between Perm and Metaspace.
>>
>> Thanks,
>>
>> nijiaben
>>
>
>
> --
>
> Thanks,
> Jc
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20180925/3c6b5ea9/attachment.htm>


More information about the hotspot-gc-dev mailing list