RFR 8234270: [REDO] JDK-8204128 NMT might report incorrect numbers for Compiler area

Mon Nov 25 20:54:48 UTC 2019

Hi Zhengyu,

Hi Zhengyu,

http://cr.openjdk.java.net/~zgu/JDK-8234270/webrev.00/src/hotspot/share/memory/arena.cpp.udiff.html

not sure I understand this change - why is it needed?:

@@ -360,11 +360,12 @@
   }
   if (k) k->set_next(_chunk);   // Append new chunk to end of linked list
   else _first = _chunk;
   _hwm  = _chunk->bottom();     // Save the cached hwm, max
   _max =  _chunk->top();
-  set_size_in_bytes(size_in_bytes() + len);
+  size_t new_size = size_in_bytes() + _chunk->length();
+  set_size_in_bytes(new_size);
   void* result = _hwm;
   _hwm += x;
   return result;
 }

http://cr.openjdk.java.net/~zgu/JDK-8234270/webrev.00/src/hotspot/share/services/mallocTracker.hpp.udiff.html

-  inline void resize(long sz) {
+  inline void resize(ssize_t sz) {
     if (sz != 0) {
+      assert(sz >= 0 || _size >= size_t(-sz), "Must be");
       Atomic::add(size_t(sz), &_size);
       DEBUG_ONLY(_peak_size = MAX2(_size, _peak_size);)
     }
   }

assert looks fine but the Atomic::add() took me by surprise: when size is
reduced, we feed it knowingly a negative number we then to unsigned and
rely on the overflow? Just a nit, but would Atomic::sub() not be clearer
here?

http://cr.openjdk.java.net/~zgu/JDK-8234270/webrev.00/test/hotspot/jtreg/runtime/NMT/HugeArenaTracking.java.html

The test is fine as it is. Some thoughts (feel free to ignore them):

- would be more interesting to shake the boat a little by varying increase
rate, e.g.:

  60     // Allocate 2GB+ from arena
  61     long total = 0;
  62     while (total < 2 * GB) {
           long increase = <random_number_between_some_bytes_and_I dont
know, 100M? Capped at 2 GB>
  63       wb.NMTArenaMalloc(arena1, increase);
  64       total += increase;
  65     }

and maybe test jcmd VM.native_memory before this point? I know its annoying
since the numbers are probably not exact, and parsing would be necessary.
Just an idea.

  66     wb.NMTFreeArena(arena1);

On Mon, Nov 25, 2019 at 2:30 PM Zhengyu Gu <zgu at redhat.com> wrote:

> Ping ... May I get a second review?
>
> Thanks,
>
> -Zhengyu
>
> On 11/21/19 12:12 PM, yumin qi wrote:
> > Hi, Zhengyu
> >
> >    The fix looks good to me.
> >
> > Thanks
> > Yumin
> >
> >
> >
> > On Wed, Nov 20, 2019 at 5:49 AM Zhengyu Gu <zgu at redhat.com
> > <mailto:zgu at redhat.com>> wrote:
> >
> >     JDK-8204128 did not fix the original bug. But new assertion helped to
> >     catch the problem, as it consistently failed in Oracle internal
> tests.
> >
> >     The root cause is that, when NMT biases a resource area to compiler,
> it
> >     did not adjust tracking data to reflect that. When the biased
> resource
> >     area is released, there is a possibility that its size is greater
> than
> >     total size recorded, and underflow a size_t counter.
> >
> >     JDK-8204128 patch also missed a long to ssize_t parameter type
> change,
> >     that resulted new test failure on Windows, because long is 4-bytes on
> >     Windows.
> >
> >     Many thanks to Leonid Mesnik, who helped to run this patch through
> >     Oracle's internal stress tests.
> >
> >     Bug: https://bugs.openjdk.java.net/browse/JDK-8234270
> >     Webrev:
> http://cr.openjdk.java.net/~zgu/JDK-8234270/webrev.00/index.html
> >
> >
> >     Test:
> >         hotspot_nmt
> >         Submit test
> >         Oracle internal stress tests.
> >
> >
> >     Thanks,
> >
> >     -Zhengyu
> >
>
>