RFR: 8176026: SA: Huge heap sizes cause a negative value to be displayed in the jhisto heap total

Chris Plummer cjplummer at openjdk.java.net
Fri Mar 19 19:42:40 UTC 2021


On Fri, 19 Mar 2021 11:49:39 GMT, Koichi Sakata <ksakata at openjdk.org> wrote:

> When a heap is used more than about 2.1GB, clhsdb jhisto shows a negative number in the total field.
> 
> $ java -Xmx20g Sample
> 
> $ jhsdb clhsdb --pid 5773
> Attaching to process 5773, please wait...
> hsdb> jhisto
> ...
> 299:            1       16      jdk.internal.misc.Unsafe
> 300:            3402    10737610256     byte[]
> Total :         15823   -2146661280
> Heap traversal took 1.793 seconds.
> (Incidentally, the Sample is a program that only allocates many objects.)
> 
> #### Details
> This is because in ObjectHistogram class the totalSize variable is int type.
> 
> The total size is the total of ObjectHistogramElement#getSize() and getSize() returns long. So I changed int to long in the ObjectHistogram class.
> 
> Additionally, I changed the type of the totalCount. This doesn't cause a bug, but ObjectHistogramElement#getCount() also returns long. So it doesn't need to treat it as int, I think.
> 
> #### Tests
> The jtreg test was successful.
> $ sudo make run-test TEST=serviceability/sa/ClhsdbJhisto.java
> 
> $ cat build/linux-x86_64-server-fastdebug/test-results/jtreg_test_hotspot_jtreg_serviceability_sa_ClhsdbJhisto_java/text/summary.txt
> serviceability/sa/ClhsdbJhisto.java  Passed. Execution successful
> 
> I confirmed the output with the same program.
> 
> $ ./jdk/build/linux-x86_64-server-fastdebug/jdk/bin/java -Xmx20g Sample
> $ ./jdk/build/linux-x86_64-server-fastdebug/jdk/bin/jhsdb clhsdb --pid 22463
> Attaching to process 22463, please wait...
> hsdb> jhisto
> 
> 300:            3410    19327547600     byte[]
> Total :         16145   19328155080
> Heap traversal took 1.981 seconds.

I see another cast-to-int related bug, and it's visible in your jhisto output (both before and after):

  304: 1 16 jdk.internal.perf.Perf
  305: 1 16 java.util.jar.JavaUtilJarAccessImpl
  306: 254 2863336944 int[]
  Total : 22803 -1430650672

The large `int[]` array should be at the top of the list, not the bottom. The issue is in ObjectHistogramElement.java:

  public int compare(ObjectHistogramElement other) {
    return (int) (other.size - size);
  }

Because the `long` difference is cast to `int`, the returned value can have the wrong sign whenever the difference between `other.size` and `size` does not fit in an `int`. In ObjectHistogram.java, just above the code you fixed, we have:

  public List<ObjectHistogramElement> getElements() {
    List<ObjectHistogramElement> list = new ArrayList<>();
    list.addAll(map.values());
    Collections.sort(list, new Comparator<>() {
      public int compare(ObjectHistogramElement o1, ObjectHistogramElement o2) {
        return o1.compare(o2);
      }
    });
    return list;
  }

So it looks like this is calling the buggy `compare()` method. I think the fix is to have `compare()` return -1, 0, or 1 depending on the `long` value of `(other.size - size)`, rather than just trying to return `(other.size - size)`.
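
To make the suggestion concrete, here is a minimal, self-contained sketch of the sign problem and one possible fix. The `Element` class is a hypothetical stand-in for `ObjectHistogramElement` (only the `size` field is modeled), and the method names `compareTruncating` and `compareSafe` are made up for illustration. Using `Long.compare` is just one way to get a correctly signed result; it is not necessarily what the final patch should look like.

```java
import java.util.ArrayList;
import java.util.List;

class CompareSketch {
    // Hypothetical stand-in for ObjectHistogramElement; only size is modeled.
    static final class Element {
        final String name;
        final long size;

        Element(String name, long size) {
            this.name = name;
            this.size = size;
        }

        // Mirrors the existing code: the long difference is narrowed to int,
        // which keeps only the low 32 bits and can flip the sign.
        int compareTruncating(Element other) {
            return (int) (other.size - size);
        }

        // Possible fix: Long.compare returns a negative, zero, or positive
        // int without any narrowing. Larger sizes sort first.
        int compareSafe(Element other) {
            return Long.compare(other.size, size);
        }
    }

    // Sorts a copy of the input with the safe comparison, largest first.
    static List<Element> sortedBySize(List<Element> in) {
        List<Element> list = new ArrayList<>(in);
        list.sort((o1, o2) -> o1.compareSafe(o2));
        return list;
    }

    public static void main(String[] args) {
        // Sizes taken from the jhisto output quoted above.
        Element small = new Element("jdk.internal.perf.Perf", 16L);
        Element huge = new Element("int[]", 2863336944L);

        // Truncating version: negative, so "small" is ordered before "huge".
        System.out.println(small.compareTruncating(huge));
        // Safe version: positive, so "huge" is ordered before "small".
        System.out.println(small.compareSafe(huge));

        System.out.println(sortedBySize(List.of(small, huge)).get(0).name);
    }
}
```

With the sizes from the output above, the truncating version reports that the 16-byte entry sorts ahead of the 2863336944-byte `int[]` entry, which matches the misplaced line in the histogram.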

-------------

PR: https://git.openjdk.java.net/jdk/pull/3087


