stringStream in UL and nested ResourceMarks

Fri Jun 9 20:00:01 UTC 2017

This bug seems relevant to this discussion:

JDK-8181807 Graal internal error "StringStream is re-allocated with a 
different ResourceMark"
https://bugs.openjdk.java.net/browse/JDK-8181807

Dan

On 6/9/17 1:48 AM, Thomas Stüfe wrote:
> Hi Stefan,
>
> just a small question to verify that I understood everything correctly.
>
> The LogStream classes (LogStreamBase and children) are basically the
> write-to-UL frontend classes, right? Their purpose is to collect input via
> various print.. methods until \n is encountered, then pipe the assembled
> line to the UL backend. To do that it needs a backing store for the
> to-be-assembled-line, and this is the whole reason stringStream is used
> (via the "streamClass" template parameter for LogStreamBase)?
>
> So, basically the whole rather involved class tree rooted at LogStreamBase
> only deals with the various ways that one line backing store is allocated?
> Including LogStream itself, which contains - I was very surprised to see -
> an embedded ResourceMark (stringStreamWithResourceMark). There are no other
> reasons for this ResourceMark?
>
> I am currently experimenting with changing LogStream to use a simple
> malloc'd backing store, in combination with a small fixed size member
> buffer for small lines; I'd like to see if that had any measurable negative
> performance impact. The coding is halfway done already, but the callers
> need fixing up, because due to my change LogStreamBase children cannot be
> allocated with new anymore, because of the ResourceObj-destructor problem.
>
> What do you think, is this worthwhile and am I overlooking something
> obvious? The UL coding is quite large, after all.
>
> Kind Regards, Thomas
>
>
>
>
>
>
>
>
> On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Hi Stefan,
>>
>> On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
>> stefan.karlsson at oracle.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> On 2017-06-07 11:15, Thomas Stüfe wrote:
>>>
>>>> Hi Stefan,
>>>>
>>>> I saw this, but I also see LogStreamNoResourceMark being used as a
>>>> default for the (trace|debug|info|warning|error)_stream() methods of
>>>> Log. In this form it is used quite a lot.
>>>>
>>>> Looking further, I see that one cannot just exchange
>>>> LogStreamNoResourceMark with LogStreamCHeap, because there are hidden usage
>>>> conventions I was not aware of:
>>>>
>>> Just to be clear, I didn't propose that you did a wholesale replacement
>>> of LogStreamNoResourceMark with LogStreamCHeap. I merely pointed out the
>>> existence of this class in case you had missed it.
>>>
>>>
>> Sure! I implied this myself with my original post which proposed to
>> replace the resource area allocation inside stringStream with malloc'd
>> memory.
>>
>>
>>>> LogStreamNoResourceMark  is allocated with new() in create_log_stream().
>>>> LogStreamNoResourceMark is an outputStream, which is a ResourceObj. In its
>>>> current form ResourceObj cannot be deleted, so destructors for ResourceObj
>>>> child cannot be called.
>>>>
>>> By default ResourceObj classes are allocated in the resource area, but
>>> the class also supports CHeap allocations. For example, see some of the
>>> allocations of GrowableArray instances:
>>>
>>> _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>>> GrowableArray<Metadata*>(100, true);
>>>
>>> These can still be deleted:
>>>
>>> delete _deallocate_list;
>>>
>>>
>>>> So, we could not use malloc in the stringStream - or exchange
>>>> stringStream for bufferedStream - because we would need a non-empty
>>>> destructor to free the malloc'd memory, and that destructor cannot exist.
>>>>
>>>> Looking further, I see that this imposes subtle usage restrictions for
>>>> UL:
>>>>
>>>> LogStreamNoResourceMark objects are used via "log.debug_stream()" or
>>>> similar. For example:
>>>>
>>>>       codecache_print(log.debug_stream(), /* detailed= */ false);
>>>>
>>>> debug_stream() will allocate a LogStreamNoResourceMark object which
>>>> lives in the resourcearea. This is a bit surprising, because
>>>> "debug_stream()" feels like it returns a singleton or a member variable of
>>>> log.
>>>>
>>> IIRC, this was done to:
>>>
>>> 1) break up a cyclic dependencies between logStream.hpp and log.hpp
>>>
>>> 2) Not have log.hpp depend on the stream.hpp. This used to be important,
>>> but the includes in stream.hpp has been fixed so this might be a non-issue.
>>>
>>>
>>>> If one wants to use LogStreamCHeap instead, it must not be created with
>>>> new() - which would be a subtle memory leak because the destructor would
>>>> never be called - but instead on the stack as automatic variable:
>>>>
>>>>       LogStreamCHeap log_stream(log);
>>>>       log_stream.print("hallo");
>>>>
>>>> I may understand this wrong, but if this is true, this is quite a
>>>> difficult API.
>>>>
>>> Feel free to rework this and propose a simpler model. Anything that would
>>> simplify this would be helpful.
>>>
>>>
>> I will mull over this a bit (and I would be thankful for other viewpoints
>> as well). A bottomline question which is difficult to answer is whether
>> folks value the slight performance increase of resource area backed memory
>> allocation in stringStream more than simplicity and robustness which would
>> come with switching to malloced memory. And then, there is the second
>> question of why outputStream objects should be ResourceObj at all; for me,
>> they feel much more at home as stack objects. They themselves are small and
>> do not allocate a lot of memory (if they do, they do it dynamically). And
>> they are not allocated in vast amounts...
>>
>> Lets see what others think.
>>
>>
>>>   I have two classes which look like siblings but
>>>
>>>> LogStreamCHeap can only be allocated on the local stack - otherwise I'll
>>>> get a memory leak - while LogStreamNoResourceMark gets created in the
>>>> resource area, which prevents its destructor from running and may fill the
>>>> resource area up with temporary stream objects if used in a certain way.
>>>>
>>>> Have I understood this right so far? If yes, would it be possible to
>>>> simplify this?
>>>>
>>> I think you understand the code correctly, and yes, there are probably
>>> ways to make this simpler.
>>>
>>>
>> Thanks for your input!
>>
>> Kind regards, Thomas
>>
>>
>>> Thanks,
>>> StefanK
>>>
>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>>>> stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>> wrote:
>>>>
>>>>      Hi Thomas,
>>>>
>>>>
>>>>      On 2017-06-06 11:40, Thomas Stüfe wrote:
>>>>
>>>>          Hi all,
>>>>
>>>>          In our VM we recently hit something similar to
>>>>          https://bugs.openjdk.java.net/browse/JDK-8167995
>>>>          <https://bugs.openjdk.java.net/browse/JDK-8167995> or
>>>>          https://bugs.openjdk.java.net/browse/JDK-8149557
>>>>          <https://bugs.openjdk.java.net/browse/JDK-8149557>:
>>>>
>>>>          A stringStream* was handed down to nested print functions which
>>>>          create
>>>>          their own ResourceMarks and, while being down the stack under
>>>>          the scope of
>>>>          that new ResourceMark, the stringStream needed to enlarge its
>>>>          internal
>>>>          buffer. This is the situation the assert inside
>>>>          stringStream::write()
>>>>          attempts to catch
>>>>          (assert(Thread::current()->current_resource_mark() ==
>>>>          rm); in our case this was a release build, so we just crashed.
>>>>
>>>>          The solution for both JDK-816795 and JDK-8149557 seemed to be to
>>>>          just
>>>>          remove the offending ResourceMarks, or shuffle them around, but
>>>>          generally
>>>>          this is not an optimal solution, or?
>>>>
>>>>          We actually question whether using resource area memory is a
>>>>          good idea for
>>>>          outputStream chuild objects at all:
>>>>
>>>>          outputStream instances typically travel down the stack a lot by
>>>>          getting
>>>>          handed sub-print-functions, so they run danger of crossing
>>>>          resource mark
>>>>          boundaries like above. The sub functions are usually oblivious
>>>>          to the type
>>>>          of outputStream* handed down, and as well they should be. And if
>>>> the
>>>>          allocate resource area memory themselves, it makes sense to
>>>>          guard them with
>>>>          ResourceMark in case they are called in a loop.
>>>>
>>>>          The assert inside stringStream::write() is not a real help
>>>>          either, because
>>>>          whether or not it hits depends on pure luck - whether the
>>>>          realloc code path
>>>>          is hit just in the right moment while printing. Which depends on
>>>>          the buffer
>>>>          size and the print history, which is variable, especially with
>>>>          logging.
>>>>
>>>>          The only advantage to using bufferedStream (C-Heap) is a small
>>>>          performance
>>>>          improvement when allocating. The question is whether this is
>>>>          really worth
>>>>          the risk of using resource area memory in this fashion.
>>>>          Especially in the
>>>>          context of UL where we are about to do expensive IO operations
>>>>          (writing to
>>>>          log file) or may lock (os::flockfile).
>>>>
>>>>          Also, the difference between bufferedStream and stringStream
>>>>          might be
>>>>          reduced by improving bufferedStream (e.g. by using a member char
>>>>          array for
>>>>          small allocations and delay using malloc() for larger arrays.)
>>>>
>>>>          What you think? Should we get rid of stringStream and only use
>>>>          an (possibly
>>>>          improved) bufferedStream? I also imagine this could make UL
>>>>          coding a bit
>>>>          simpler.
>>>>
>>>>
>>>>      Not answering your questions, but I want to point out that we
>>>>      already have a UL stream that uses C-Heap:
>>>>
>>>>      logging/logStream.hpp:
>>>>
>>>>      // The backing buffer is allocated in CHeap memory.
>>>>      typedef LogStreamBase<bufferedStream> LogStreamCHeap;
>>>>
>>>>      StefanK
>>>>
>>>>
>>>>
>>>>          Thank you,
>>>>
>>>>          Kind Regards, Thomas
>>>>
>>>>
>>>>