Crash in the MallocTracker::record_free

Thomas Stüfe thomas.stuefe at gmail.com
Tue Aug 16 08:03:23 UTC 2022


On Tue, Aug 16, 2022 at 12:52 AM David Holmes <david.holmes at oracle.com>
wrote:

> On 16/08/2022 2:04 am, Thomas Stüfe wrote:
> > From the hex dump, I think it is possible that you are trying to delete a
> > HandleMark, and that crashes.
> >
> > This corresponds with the code location in the stack, which is
> >
> > ```
> > V  [libjvm.so+0x1a6d77a]  Thread::~Thread()+0x8a
> > ```
> >
> > if gdb does not lie to me, this is near:
> >
> > ```
> > 0x7ffff74f91a9 <Thread::~Thread()+137>:      callq  0x7ffff6843210
> > <HandleMark::operator delete(void*)>
> > ```
> >
> > NMT complains about a broken header at 0x00007fba3f56ca90. So this is
> > the assumed NMT header:
> >
> > ```
> > 0x00007fba3f56ca90:   00 00 00 00 00 00 00 00 98 d0 37 3f ba 7f 00 00
> > ```
> >
> > But since the NMT header precedes the allocation, this is what the
> > caller actually freed:
> >
> > ```
> > 0x00007fba3f56caa0:   90 98 02 38 ba 7f 00 00   80 a3 02 38 ba 7f 00 00
> > 0x00007fba3f56cab0:   e0 a3 02 38 ba 7f 00 00   f0 a3 02 38 ba 7f 00 00
> > 0x00007fba3f56cac0:   c8 a4 02 38 ba 7f 00 00   d8 00 00 00 00 00 00 00
> > 0x00007fba3f56cad0:   60 a6 02 38 ba 7f 00 00   a0 e4 89 3e ba 7f 00 00
> > 0x00007fba3f56cae0:   90 98 02 38 ba 7f 00 00   90 ac 02 38 ba 7f 00 00
> > 0x00007fba3f56caf0:   78 05 40 f0 b9 7f 00 00   00 00 00 00 00 00 00 00
> > 0x00007fba3f56cb00:   00 00 00 00 00 00 00 00   00 00 00 00 00 00 00 00
> > ```
> >
> > HandleMark has seven pointer-sized members:
> >
> > ```
> >    Thread *_thread;              // thread that owns this mark
> >    HandleArea *_area;            // saved handle area
> >    Chunk *_chunk;                // saved arena chunk
> >    char *_hwm, *_max;            // saved arena info
> >    size_t _size_in_bytes;        // size of handle area
> >    // Link to previous active HandleMark in thread
> >    HandleMark* _previous_handle_mark;
> > ```
> >
> > If I interpret 0x00007fba3f56caa0 as HandleMark*:
> >
> > ```
> > 0x00007fba3f56caa0:   90 98 02 38 ba 7f 00 00   80 a3 02 38 ba 7f 00 00
> >                        _thread                   _area
> > 0x00007fba3f56cab0:   e0 a3 02 38 ba 7f 00 00   f0 a3 02 38 ba 7f 00 00
> >                        _chunk                    _hwm
> > 0x00007fba3f56cac0:   c8 a4 02 38 ba 7f 00 00   d8 00 00 00 00 00 00 00
> >                        _max                      _size_in_bytes
> > 0x00007fba3f56cad0:   60 a6 02 38 ba 7f 00 00
> >                        _previous_handle_mark
> > ```
> >
> > Note how this fits:
> >
> > - The only slot in this range that does not look like a pointer is
> > _size_in_bytes, which holds a size-like value (d8 = 216).
> >
> > - _chunk (0x7fba3802a3e0), _hwm (0x7fba3802a3f0) and _max
> > (0x7fba3802a4c8) look like they fit an Arena of size 216 perfectly:
> > _max - _chunk = 232. A Chunk looks like this:
> >
> > ```
> >    //
> >    // +-----------+--+--------------------------------------------+
> >    // |           |g |                                            |
> >    // | Chunk     |a |               Payload                      |
> >    // |           |p |                                            |
> >    // +-----------+--+--------------------------------------------+
> >    // A           B  C                                            D
> > ```
> >
> > and it has two pointer-sized members, so sizeof(Chunk) = 16. (232 - 16)
> > gives you the arena size of 216.
> >
> > Also, _hwm points to the beginning of the Arena (_chunk + 16). So this
> > looks like a mark for a HandleArea that has been cleared or never used.
> >
> > -------------
> >
> > I see two possibilities. Either someone overwrote the NMT header of the
> > block. Or, this HandleMark has never been allocated on the C-heap but
> > lives on the stack. In that case, something with
> > `Thread::thread->last_handle_mark()` could have gone wrong. Most
> > HandleMark instances live on the stack, the only ones that don't are the
> > initial marks created in `Thread::Thread()`.
> >
> > -----
> >
> > This is all pure hypothesis, since I don't have an hs-err file or any
> > information other than the stack and the hex dump.
>
> Thanks Thomas! That was way more than I expected! I suspect there is a
> HandleMark on the stack. I'm doing something unusual, but not prohibited.
>
> Cheers,
> David
>
>
>
Happy to help :)

I wish the NMT error texts were a bit more informative, but it's difficult
to come up with better heuristics (the same is true for
https://bugs.openjdk.org/browse/JDK-8292318, where the freed pointer is
really no pointer at all).

Cheers, Thomas

>
>
More information about the hotspot-runtime-dev mailing list