stringStream in UL and nested ResourceMarks
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jun 9 20:00:01 UTC 2017
This bug seems relevant to this discussion:
JDK-8181807 Graal internal error "StringStream is re-allocated with a
different ResourceMark"
https://bugs.openjdk.java.net/browse/JDK-8181807
Dan
On 6/9/17 1:48 AM, Thomas Stüfe wrote:
> Hi Stefan,
>
> just a small question to verify that I understood everything correctly.
>
> The LogStream classes (LogStreamBase and children) are basically the
> write-to-UL frontend classes, right? Their purpose is to collect input via
> various print.. methods until \n is encountered, then pipe the assembled
> line to the UL backend. To do that it needs a backing store for the
> to-be-assembled-line, and this is the whole reason stringStream is used
> (via the "streamClass" template parameter for LogStreamBase)?
>
> So, basically the whole rather involved class tree rooted at LogStreamBase
> only deals with the various ways that one line backing store is allocated?
> Including LogStream itself, which contains - I was very surprised to see -
> an embedded ResourceMark (stringStreamWithResourceMark). There are no other
> reasons for this ResourceMark?
>
> I am currently experimenting with changing LogStream to use a simple
> malloc'd backing store, in combination with a small fixed size member
> buffer for small lines; I'd like to see if that had any measurable negative
> performance impact. The coding is halfway done already, but the callers
> need fixing up, because due to my change LogStreamBase children cannot be
> allocated with new anymore, because of the ResourceObj-destructor problem.
>
> What do you think, is this worthwhile and am I overlooking something
> obvious? The UL coding is quite large, after all.
>
> Kind Regards, Thomas
>
>
>
>
>
>
>
>
> On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Hi Stefan,
>>
>> On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
>> stefan.karlsson at oracle.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> On 2017-06-07 11:15, Thomas Stüfe wrote:
>>>
>>>> Hi Stefan,
>>>>
>>>> I saw this, but I also see LogStreamNoResourceMark being used as a
>>>> default for the (trace|debug|info|warning|error)_stream() methods of
>>>> Log. In this form it is used quite a lot.
>>>>
>>>> Looking further, I see that one cannot just exchange
>>>> LogStreamNoResourceMark with LogStreamCHeap, because there are hidden usage
>>>> conventions I was not aware of:
>>>>
>>> Just to be clear, I didn't propose that you did a wholesale replacement
>>> of LogStreamNoResourceMark with LogStreamCHeap. I merely pointed out the
>>> existence of this class in case you had missed it.
>>>
>>>
>> Sure! I implied this myself with my original post which proposed to
>> replace the resource area allocation inside stringStream with malloc'd
>> memory.
>>
>>
>>>> LogStreamNoResourceMark is allocated with new() in create_log_stream().
>>>> LogStreamNoResourceMark is an outputStream, which is a ResourceObj. In its
>>>> current form ResourceObj cannot be deleted, so destructors for ResourceObj
>>>> child cannot be called.
>>>>
>>> By default ResourceObj classes are allocated in the resource area, but
>>> the class also supports CHeap allocations. For example, see some of the
>>> allocations of GrowableArray instances:
>>>
>>> _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>>> GrowableArray<Metadata*>(100, true);
>>>
>>> These can still be deleted:
>>>
>>> delete _deallocate_list;
>>>
>>>
>>>> So, we could not use malloc in the stringStream - or exchange
>>>> stringStream for bufferedStream - because we would need a non-empty
>>>> destructor to free the malloc'd memory, and that destructor cannot exist.
>>>>
>>>> Looking further, I see that this imposes subtle usage restrictions for
>>>> UL:
>>>>
>>>> LogStreamNoResourceMark objects are used via "log.debug_stream()" or
>>>> similar. For example:
>>>>
>>>> codecache_print(log.debug_stream(), /* detailed= */ false);
>>>>
>>>> debug_stream() will allocate a LogStreamNoResourceMark object which
>>>> lives in the resourcearea. This is a bit surprising, because
>>>> "debug_stream()" feels like it returns a singleton or a member variable of
>>>> log.
>>>>
>>> IIRC, this was done to:
>>>
>>> 1) break up a cyclic dependencies between logStream.hpp and log.hpp
>>>
>>> 2) Not have log.hpp depend on the stream.hpp. This used to be important,
>>> but the includes in stream.hpp has been fixed so this might be a non-issue.
>>>
>>>
>>>> If one wants to use LogStreamCHeap instead, it must not be created with
>>>> new() - which would be a subtle memory leak because the destructor would
>>>> never be called - but instead on the stack as automatic variable:
>>>>
>>>> LogStreamCHeap log_stream(log);
>>>> log_stream.print("hallo");
>>>>
>>>> I may understand this wrong, but if this is true, this is quite a
>>>> difficult API.
>>>>
>>> Feel free to rework this and propose a simpler model. Anything that would
>>> simplify this would be helpful.
>>>
>>>
>> I will mull over this a bit (and I would be thankful for other viewpoints
>> as well). A bottomline question which is difficult to answer is whether
>> folks value the slight performance increase of resource area backed memory
>> allocation in stringStream more than simplicity and robustness which would
>> come with switching to malloced memory. And then, there is the second
>> question of why outputStream objects should be ResourceObj at all; for me,
>> they feel much more at home as stack objects. They themselves are small and
>> do not allocate a lot of memory (if they do, they do it dynamically). And
>> they are not allocated in vast amounts...
>>
>> Lets see what others think.
>>
>>
>>> I have two classes which look like siblings but
>>>
>>>> LogStreamCHeap can only be allocated on the local stack - otherwise I'll
>>>> get a memory leak - while LogStreamNoResourceMark gets created in the
>>>> resource area, which prevents its destructor from running and may fill the
>>>> resource area up with temporary stream objects if used in a certain way.
>>>>
>>>> Have I understood this right so far? If yes, would it be possible to
>>>> simplify this?
>>>>
>>> I think you understand the code correctly, and yes, there are probably
>>> ways to make this simpler.
>>>
>>>
>> Thanks for your input!
>>
>> Kind regards, Thomas
>>
>>
>>> Thanks,
>>> StefanK
>>>
>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>>>> stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>> wrote:
>>>>
>>>> Hi Thomas,
>>>>
>>>>
>>>> On 2017-06-06 11:40, Thomas Stüfe wrote:
>>>>
>>>> Hi all,
>>>>
>>>> In our VM we recently hit something similar to
>>>> https://bugs.openjdk.java.net/browse/JDK-8167995
>>>> <https://bugs.openjdk.java.net/browse/JDK-8167995> or
>>>> https://bugs.openjdk.java.net/browse/JDK-8149557
>>>> <https://bugs.openjdk.java.net/browse/JDK-8149557>:
>>>>
>>>> A stringStream* was handed down to nested print functions which
>>>> create
>>>> their own ResourceMarks and, while being down the stack under
>>>> the scope of
>>>> that new ResourceMark, the stringStream needed to enlarge its
>>>> internal
>>>> buffer. This is the situation the assert inside
>>>> stringStream::write()
>>>> attempts to catch
>>>> (assert(Thread::current()->current_resource_mark() ==
>>>> rm); in our case this was a release build, so we just crashed.
>>>>
>>>> The solution for both JDK-816795 and JDK-8149557 seemed to be to
>>>> just
>>>> remove the offending ResourceMarks, or shuffle them around, but
>>>> generally
>>>> this is not an optimal solution, or?
>>>>
>>>> We actually question whether using resource area memory is a
>>>> good idea for
>>>> outputStream chuild objects at all:
>>>>
>>>> outputStream instances typically travel down the stack a lot by
>>>> getting
>>>> handed sub-print-functions, so they run danger of crossing
>>>> resource mark
>>>> boundaries like above. The sub functions are usually oblivious
>>>> to the type
>>>> of outputStream* handed down, and as well they should be. And if
>>>> the
>>>> allocate resource area memory themselves, it makes sense to
>>>> guard them with
>>>> ResourceMark in case they are called in a loop.
>>>>
>>>> The assert inside stringStream::write() is not a real help
>>>> either, because
>>>> whether or not it hits depends on pure luck - whether the
>>>> realloc code path
>>>> is hit just in the right moment while printing. Which depends on
>>>> the buffer
>>>> size and the print history, which is variable, especially with
>>>> logging.
>>>>
>>>> The only advantage to using bufferedStream (C-Heap) is a small
>>>> performance
>>>> improvement when allocating. The question is whether this is
>>>> really worth
>>>> the risk of using resource area memory in this fashion.
>>>> Especially in the
>>>> context of UL where we are about to do expensive IO operations
>>>> (writing to
>>>> log file) or may lock (os::flockfile).
>>>>
>>>> Also, the difference between bufferedStream and stringStream
>>>> might be
>>>> reduced by improving bufferedStream (e.g. by using a member char
>>>> array for
>>>> small allocations and delay using malloc() for larger arrays.)
>>>>
>>>> What you think? Should we get rid of stringStream and only use
>>>> an (possibly
>>>> improved) bufferedStream? I also imagine this could make UL
>>>> coding a bit
>>>> simpler.
>>>>
>>>>
>>>> Not answering your questions, but I want to point out that we
>>>> already have a UL stream that uses C-Heap:
>>>>
>>>> logging/logStream.hpp:
>>>>
>>>> // The backing buffer is allocated in CHeap memory.
>>>> typedef LogStreamBase<bufferedStream> LogStreamCHeap;
>>>>
>>>> StefanK
>>>>
>>>>
>>>>
>>>> Thank you,
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
More information about the hotspot-dev
mailing list