stringStream in UL and nested ResourceMarks

Thomas Stüfe thomas.stuefe at gmail.com
Fri Jun 9 07:48:03 UTC 2017


Hi Stefan,

just a small question to verify that I understood everything correctly.

The LogStream classes (LogStreamBase and children) are basically the
write-to-UL frontend classes, right? Their purpose is to collect input via
various print.. methods until \n is encountered, then pipe the assembled
line to the UL backend. To do that it needs a backing store for the
to-be-assembled-line, and this is the whole reason stringStream is used
(via the "streamClass" template parameter for LogStreamBase)?

So, basically the whole rather involved class tree rooted at LogStreamBase
only deals with the various ways that one line backing store is allocated?
Including LogStream itself, which contains - I was very surprised to see -
an embedded ResourceMark (stringStreamWithResourceMark). There are no other
reasons for this ResourceMark?

I am currently experimenting with changing LogStream to use a simple
malloc'd backing store, in combination with a small fixed size member
buffer for small lines; I'd like to see if that had any measurable negative
performance impact. The coding is halfway done already, but the callers
need fixing up, because due to my change LogStreamBase children cannot be
allocated with new anymore, because of the ResourceObj-destructor problem.

What do you think, is this worthwhile and am I overlooking something
obvious? The UL coding is quite large, after all.

Kind Regards, Thomas








On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Stefan,
>
> On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
> stefan.karlsson at oracle.com> wrote:
>
>> Hi Thomas,
>>
>> On 2017-06-07 11:15, Thomas Stüfe wrote:
>>
>>> Hi Stefan,
>>>
>>> I saw this, but I also see LogStreamNoResourceMark being used as a
>>> default for the (trace|debug|info|warning|error)_stream() methods of
>>> Log. In this form it is used quite a lot.
>>>
>>> Looking further, I see that one cannot just exchange
>>> LogStreamNoResourceMark with LogStreamCHeap, because there are hidden usage
>>> conventions I was not aware of:
>>>
>>
>> Just to be clear, I didn't propose that you did a wholesale replacement
>> of LogStreamNoResourceMark with LogStreamCHeap. I merely pointed out the
>> existence of this class in case you had missed it.
>>
>>
> Sure! I implied this myself with my original post which proposed to
> replace the resource area allocation inside stringStream with malloc'd
> memory.
>
>
>>
>>> LogStreamNoResourceMark  is allocated with new() in create_log_stream().
>>> LogStreamNoResourceMark is an outputStream, which is a ResourceObj. In its
>>> current form ResourceObj cannot be deleted, so destructors for ResourceObj
>>> child cannot be called.
>>>
>>
>> By default ResourceObj classes are allocated in the resource area, but
>> the class also supports CHeap allocations. For example, see some of the
>> allocations of GrowableArray instances:
>>
>> _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>> GrowableArray<Metadata*>(100, true);
>>
>> These can still be deleted:
>>
>> delete _deallocate_list;
>>
>>
>>> So, we could not use malloc in the stringStream - or exchange
>>> stringStream for bufferedStream - because we would need a non-empty
>>> destructor to free the malloc'd memory, and that destructor cannot exist.
>>>
>>> Looking further, I see that this imposes subtle usage restrictions for
>>> UL:
>>>
>>> LogStreamNoResourceMark objects are used via "log.debug_stream()" or
>>> similar. For example:
>>>
>>>      codecache_print(log.debug_stream(), /* detailed= */ false);
>>>
>>> debug_stream() will allocate a LogStreamNoResourceMark object which
>>> lives in the resourcearea. This is a bit surprising, because
>>> "debug_stream()" feels like it returns a singleton or a member variable of
>>> log.
>>>
>>
>> IIRC, this was done to:
>>
>> 1) break up a cyclic dependencies between logStream.hpp and log.hpp
>>
>> 2) Not have log.hpp depend on the stream.hpp. This used to be important,
>> but the includes in stream.hpp has been fixed so this might be a non-issue.
>>
>>
>>> If one wants to use LogStreamCHeap instead, it must not be created with
>>> new() - which would be a subtle memory leak because the destructor would
>>> never be called - but instead on the stack as automatic variable:
>>>
>>>      LogStreamCHeap log_stream(log);
>>>      log_stream.print("hallo");
>>>
>>> I may understand this wrong, but if this is true, this is quite a
>>> difficult API.
>>>
>>
>> Feel free to rework this and propose a simpler model. Anything that would
>> simplify this would be helpful.
>>
>>
> I will mull over this a bit (and I would be thankful for other viewpoints
> as well). A bottomline question which is difficult to answer is whether
> folks value the slight performance increase of resource area backed memory
> allocation in stringStream more than simplicity and robustness which would
> come with switching to malloced memory. And then, there is the second
> question of why outputStream objects should be ResourceObj at all; for me,
> they feel much more at home as stack objects. They themselves are small and
> do not allocate a lot of memory (if they do, they do it dynamically). And
> they are not allocated in vast amounts...
>
> Lets see what others think.
>
>
>>  I have two classes which look like siblings but
>>
>>> LogStreamCHeap can only be allocated on the local stack - otherwise I'll
>>> get a memory leak - while LogStreamNoResourceMark gets created in the
>>> resource area, which prevents its destructor from running and may fill the
>>> resource area up with temporary stream objects if used in a certain way.
>>>
>>> Have I understood this right so far? If yes, would it be possible to
>>> simplify this?
>>>
>>
>> I think you understand the code correctly, and yes, there are probably
>> ways to make this simpler.
>>
>>
> Thanks for your input!
>
> Kind regards, Thomas
>
>
>> Thanks,
>> StefanK
>>
>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>>> stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>> wrote:
>>>
>>>     Hi Thomas,
>>>
>>>
>>>     On 2017-06-06 11:40, Thomas Stüfe wrote:
>>>
>>>         Hi all,
>>>
>>>         In our VM we recently hit something similar to
>>>         https://bugs.openjdk.java.net/browse/JDK-8167995
>>>         <https://bugs.openjdk.java.net/browse/JDK-8167995> or
>>>         https://bugs.openjdk.java.net/browse/JDK-8149557
>>>         <https://bugs.openjdk.java.net/browse/JDK-8149557>:
>>>
>>>         A stringStream* was handed down to nested print functions which
>>>         create
>>>         their own ResourceMarks and, while being down the stack under
>>>         the scope of
>>>         that new ResourceMark, the stringStream needed to enlarge its
>>>         internal
>>>         buffer. This is the situation the assert inside
>>>         stringStream::write()
>>>         attempts to catch
>>>         (assert(Thread::current()->current_resource_mark() ==
>>>         rm); in our case this was a release build, so we just crashed.
>>>
>>>         The solution for both JDK-816795 and JDK-8149557 seemed to be to
>>>         just
>>>         remove the offending ResourceMarks, or shuffle them around, but
>>>         generally
>>>         this is not an optimal solution, or?
>>>
>>>         We actually question whether using resource area memory is a
>>>         good idea for
>>>         outputStream chuild objects at all:
>>>
>>>         outputStream instances typically travel down the stack a lot by
>>>         getting
>>>         handed sub-print-functions, so they run danger of crossing
>>>         resource mark
>>>         boundaries like above. The sub functions are usually oblivious
>>>         to the type
>>>         of outputStream* handed down, and as well they should be. And if
>>> the
>>>         allocate resource area memory themselves, it makes sense to
>>>         guard them with
>>>         ResourceMark in case they are called in a loop.
>>>
>>>         The assert inside stringStream::write() is not a real help
>>>         either, because
>>>         whether or not it hits depends on pure luck - whether the
>>>         realloc code path
>>>         is hit just in the right moment while printing. Which depends on
>>>         the buffer
>>>         size and the print history, which is variable, especially with
>>>         logging.
>>>
>>>         The only advantage to using bufferedStream (C-Heap) is a small
>>>         performance
>>>         improvement when allocating. The question is whether this is
>>>         really worth
>>>         the risk of using resource area memory in this fashion.
>>>         Especially in the
>>>         context of UL where we are about to do expensive IO operations
>>>         (writing to
>>>         log file) or may lock (os::flockfile).
>>>
>>>         Also, the difference between bufferedStream and stringStream
>>>         might be
>>>         reduced by improving bufferedStream (e.g. by using a member char
>>>         array for
>>>         small allocations and delay using malloc() for larger arrays.)
>>>
>>>         What you think? Should we get rid of stringStream and only use
>>>         an (possibly
>>>         improved) bufferedStream? I also imagine this could make UL
>>>         coding a bit
>>>         simpler.
>>>
>>>
>>>     Not answering your questions, but I want to point out that we
>>>     already have a UL stream that uses C-Heap:
>>>
>>>     logging/logStream.hpp:
>>>
>>>     // The backing buffer is allocated in CHeap memory.
>>>     typedef LogStreamBase<bufferedStream> LogStreamCHeap;
>>>
>>>     StefanK
>>>
>>>
>>>
>>>         Thank you,
>>>
>>>         Kind Regards, Thomas
>>>
>>>
>>>
>


More information about the hotspot-dev mailing list