stringStream in UL and nested ResourceMarks
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Jun 12 10:16:02 UTC 2017
Hi Marcus,
On Mon, Jun 12, 2017 at 10:28 AM, Marcus Larsson <marcus.larsson at oracle.com>
wrote:
> Hi,
>
> Thanks for looking into this.
>
> On 2017-06-12 08:52, Thomas Stüfe wrote:
>
> Hi guys,
>
> After our friday discussion I try to collect my thoughts about how to
> solve this. Could you please check if what I think makes sense:
>
> -> LogStream, an outputStream child, is used to give us outputStream
> compatibility to write to UL. It assembles a line and, once complete, sends
> it off to UL backend. It needs memory to do this, and it often uses
> resource memory for this. This is a problem if we cross resource marks.
>
> -> We do not want to use resource memory. Alternatives would be
> a) using a fixed sized char array as member in LogStream. This would be a
> very simple solution. But that imposes a max. line length, which we do not
> want(?) and, where LogStream is used as a Stack object, may consume too
> much stack space.
> b) using malloc-ed memory (perhaps in combination with having a small
> fixed sized char array for small lines to reduce the number of malloc
> invocations). But that requires us to be able to free the malloced memory,
> so the destructor of LogStream must be called.
>
> -> If possible, we want the UL API to stay the same, so we want to avoid
> having to fix up many UL call sites.
>
> Lets look at alternative (b) closer, because I think (a) may be out of
> question (?).
>
>
> Yes, I believe (a) is not an option.
>
>
>
> ------
>
> LogStreams are used in three ways which I can see:
>
> A) As a stack object:
> LogTarget(Error, logging) target;
> LogStreamCHeap errstream(target);
> return LogConfiguration::parse_log_arguments(_gc_log_filename,
> gc_conf, NULL, NULL, &errstream);
>
> No problem here: LogStream is an automatic variable and its destructor
> will be called when it goes out of scope. So we can free malloced memory.
>
> B) More often, by calling the static method LogImpl<>::xxx_stream():
>
> Either like this:
>
> 1) Log(jit, compilation) log;
> if (log.is_debug()) {
> print_impl(log.debug_stream(), ... );
> }
>
> Or like this:
>
> 2) outputStream* st = Log(stackwalk)::debug_stream();
>
> (So, either via an instance of LogImpl<>, or directly via the class::).
>
> Here, LogImpl<>::debug_stream() creates a LogStream with
> ResourceObj::operator new(). For this LogStream the destructor cannot be
> called (not without rewriting the call site).
>
> ----
>
> If we want to keep the syntax of the call sites as much as possible, we
> need to keep the xxxx_stream() methods as they are.
>
> But instead of allocating the LogStream objects each time anew, we could
> change LogImpl<> to manage pointers to its LogStream objects. This would
> help in all those cases where an instance of LogImpl<> is placed on the
> stack, which seems to be the majority of call sites. So, if there is a
> LogImpl<> object on the stack, like in (B1), its destructor would run when
> it goes out of scope and it would clean up its LogStream objects.
>
> Care must be taken that the construction of the LogImpl<> object stays
> cheap, because we do not want to pay for logging if we do not log. Which
> means that construction of the LogStream objects must be deferred to when
> they are actually needed.
>
> for instance:
>
> LogImpl<> {
> outputStream* _debugStream;
> outputStream* _errorStream;
> ...
> public:
> LogImpl<> :: _debugStream(NULL), _errorStream(NULL), ... {}
> ~LogImpl<> () {
> if (_debugStream) { cleanup debugStream; }
> if (_errorStream) { cleanup errorStream; }
> ...
> }
> debugStream* debug_stream() {
> if (!_debugStream) _debugStream = new LogStream();
> return _debugStream;
> }
> }
> };
>
> Unfortunately, this would not help for cases where xxx_stream() is called
> directly as static method (Log(..)::xx_stream()) - I do not see a
> possibility to keep this syntax. If we go with the "LogImpl<> manages its
> associated LogStream objects" solution, these sites would have to be
> rewritten:
>
> From:
> outputStream* st = Log(stackwalk)::debug_stream();
> to:
> Log(stackwalk) log;
> outputStream* st = log.debug_stream();
>
> ---
>
> What do you think? Does any of this make sense?
>
>
> I think it makes sense.
>
> IMO, we could as well get rid of the xxx_info() functions as a whole, and
> instead always require stack objects (LogStream instance). We would need to
> change some of the call sites with the proposed solution, so why not change
> all of them while we're at it. It saves complexity and makes it more
> obvious to the user where and how the memory is allocated.
>
>
I'd be all for it but if we really change the UL Api that drastically and
fix up all call sites, will this not make it harder to backport it to jdk9
if we want that?
Other than this worry, I like it, and it is also simpler than my proposal.
I will attempt a patch reworking UL in that matter: remove the xx_stream()
functions and make LogStream malloc-backed and stack-allocatable-only.
...Thomas
Thanks,
> Marcus
>
>
>
>
> Also, I opened https://bugs.openjdk.java.net/browse/JDK-8181917 to track
> this issue. Note that even though I assigned this to myself for now, if any
> of you rather wants to take it this over it is fine by me.
>
>
> Kind regards, Thomas
>
>
>
>
>
> On Sat, Jun 10, 2017 at 8:33 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Yes, this seems to be the same issue.
>>
>> ..Thomas
>>
>> On Fri, Jun 9, 2017 at 10:00 PM, Daniel D. Daugherty <
>> daniel.daugherty at oracle.com> wrote:
>>
>>> This bug seems relevant to this discussion:
>>>
>>> JDK-8181807 Graal internal error "StringStream is re-allocated with a
>>> different ResourceMark"
>>> https://bugs.openjdk.java.net/browse/JDK-8181807
>>>
>>> Dan
>>>
>>>
>>>
>>> On 6/9/17 1:48 AM, Thomas Stüfe wrote:
>>>
>>>> Hi Stefan,
>>>>
>>>> just a small question to verify that I understood everything correctly.
>>>>
>>>> The LogStream classes (LogStreamBase and children) are basically the
>>>> write-to-UL frontend classes, right? Their purpose is to collect input
>>>> via
>>>> various print.. methods until \n is encountered, then pipe the assembled
>>>> line to the UL backend. To do that it needs a backing store for the
>>>> to-be-assembled-line, and this is the whole reason stringStream is used
>>>> (via the "streamClass" template parameter for LogStreamBase)?
>>>>
>>>> So, basically the whole rather involved class tree rooted at
>>>> LogStreamBase
>>>> only deals with the various ways that one line backing store is
>>>> allocated?
>>>> Including LogStream itself, which contains - I was very surprised to
>>>> see -
>>>> an embedded ResourceMark (stringStreamWithResourceMark). There are no
>>>> other
>>>> reasons for this ResourceMark?
>>>>
>>>> I am currently experimenting with changing LogStream to use a simple
>>>> malloc'd backing store, in combination with a small fixed size member
>>>> buffer for small lines; I'd like to see if that had any measurable
>>>> negative
>>>> performance impact. The coding is halfway done already, but the callers
>>>> need fixing up, because due to my change LogStreamBase children cannot
>>>> be
>>>> allocated with new anymore, because of the ResourceObj-destructor
>>>> problem.
>>>>
>>>> What do you think, is this worthwhile and am I overlooking something
>>>> obvious? The UL coding is quite large, after all.
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
>>>> wrote:
>>>>
>>>> Hi Stefan,
>>>>>
>>>>> On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
>>>>> stefan.karlsson at oracle.com> wrote:
>>>>>
>>>>> Hi Thomas,
>>>>>>
>>>>>> On 2017-06-07 11:15, Thomas Stüfe wrote:
>>>>>>
>>>>>> Hi Stefan,
>>>>>>>
>>>>>>> I saw this, but I also see LogStreamNoResourceMark being used as a
>>>>>>> default for the (trace|debug|info|warning|error)_stream() methods of
>>>>>>> Log. In this form it is used quite a lot.
>>>>>>>
>>>>>>> Looking further, I see that one cannot just exchange
>>>>>>> LogStreamNoResourceMark with LogStreamCHeap, because there are
>>>>>>> hidden usage
>>>>>>> conventions I was not aware of:
>>>>>>>
>>>>>>> Just to be clear, I didn't propose that you did a wholesale
>>>>>> replacement
>>>>>> of LogStreamNoResourceMark with LogStreamCHeap. I merely pointed out
>>>>>> the
>>>>>> existence of this class in case you had missed it.
>>>>>>
>>>>>>
>>>>>> Sure! I implied this myself with my original post which proposed to
>>>>> replace the resource area allocation inside stringStream with malloc'd
>>>>> memory.
>>>>>
>>>>>
>>>>> LogStreamNoResourceMark is allocated with new() in
>>>>>>> create_log_stream().
>>>>>>> LogStreamNoResourceMark is an outputStream, which is a ResourceObj.
>>>>>>> In its
>>>>>>> current form ResourceObj cannot be deleted, so destructors for
>>>>>>> ResourceObj
>>>>>>> child cannot be called.
>>>>>>>
>>>>>>> By default ResourceObj classes are allocated in the resource area,
>>>>>> but
>>>>>> the class also supports CHeap allocations. For example, see some of
>>>>>> the
>>>>>> allocations of GrowableArray instances:
>>>>>>
>>>>>> _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>>>>>> GrowableArray<Metadata*>(100, true);
>>>>>>
>>>>>> These can still be deleted:
>>>>>>
>>>>>> delete _deallocate_list;
>>>>>>
>>>>>>
>>>>>> So, we could not use malloc in the stringStream - or exchange
>>>>>>> stringStream for bufferedStream - because we would need a non-empty
>>>>>>> destructor to free the malloc'd memory, and that destructor cannot
>>>>>>> exist.
>>>>>>>
>>>>>>> Looking further, I see that this imposes subtle usage restrictions
>>>>>>> for
>>>>>>> UL:
>>>>>>>
>>>>>>> LogStreamNoResourceMark objects are used via "log.debug_stream()" or
>>>>>>> similar. For example:
>>>>>>>
>>>>>>> codecache_print(log.debug_stream(), /* detailed= */ false);
>>>>>>>
>>>>>>> debug_stream() will allocate a LogStreamNoResourceMark object which
>>>>>>> lives in the resourcearea. This is a bit surprising, because
>>>>>>> "debug_stream()" feels like it returns a singleton or a member
>>>>>>> variable of
>>>>>>> log.
>>>>>>>
>>>>>>> IIRC, this was done to:
>>>>>>
>>>>>> 1) break up a cyclic dependencies between logStream.hpp and log.hpp
>>>>>>
>>>>>> 2) Not have log.hpp depend on the stream.hpp. This used to be
>>>>>> important,
>>>>>> but the includes in stream.hpp has been fixed so this might be a
>>>>>> non-issue.
>>>>>>
>>>>>>
>>>>>> If one wants to use LogStreamCHeap instead, it must not be created
>>>>>>> with
>>>>>>> new() - which would be a subtle memory leak because the destructor
>>>>>>> would
>>>>>>> never be called - but instead on the stack as automatic variable:
>>>>>>>
>>>>>>> LogStreamCHeap log_stream(log);
>>>>>>> log_stream.print("hallo");
>>>>>>>
>>>>>>> I may understand this wrong, but if this is true, this is quite a
>>>>>>> difficult API.
>>>>>>>
>>>>>>> Feel free to rework this and propose a simpler model. Anything that
>>>>>> would
>>>>>> simplify this would be helpful.
>>>>>>
>>>>>>
>>>>>> I will mull over this a bit (and I would be thankful for other
>>>>> viewpoints
>>>>> as well). A bottomline question which is difficult to answer is whether
>>>>> folks value the slight performance increase of resource area backed
>>>>> memory
>>>>> allocation in stringStream more than simplicity and robustness which
>>>>> would
>>>>> come with switching to malloced memory. And then, there is the second
>>>>> question of why outputStream objects should be ResourceObj at all; for
>>>>> me,
>>>>> they feel much more at home as stack objects. They themselves are
>>>>> small and
>>>>> do not allocate a lot of memory (if they do, they do it dynamically).
>>>>> And
>>>>> they are not allocated in vast amounts...
>>>>>
>>>>> Lets see what others think.
>>>>>
>>>>>
>>>>> I have two classes which look like siblings but
>>>>>>
>>>>>> LogStreamCHeap can only be allocated on the local stack - otherwise
>>>>>>> I'll
>>>>>>> get a memory leak - while LogStreamNoResourceMark gets created in the
>>>>>>> resource area, which prevents its destructor from running and may
>>>>>>> fill the
>>>>>>> resource area up with temporary stream objects if used in a certain
>>>>>>> way.
>>>>>>>
>>>>>>> Have I understood this right so far? If yes, would it be possible to
>>>>>>> simplify this?
>>>>>>>
>>>>>>> I think you understand the code correctly, and yes, there are
>>>>>> probably
>>>>>> ways to make this simpler.
>>>>>>
>>>>>>
>>>>>> Thanks for your input!
>>>>>
>>>>> Kind regards, Thomas
>>>>>
>>>>>
>>>>> Thanks,
>>>>>> StefanK
>>>>>>
>>>>>>
>>>>>> Kind Regards, Thomas
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>>>>>>> stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Thomas,
>>>>>>>
>>>>>>>
>>>>>>> On 2017-06-06 11:40, Thomas Stüfe wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> In our VM we recently hit something similar to
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167995
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8167995> or
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8149557
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8149557>:
>>>>>>>
>>>>>>> A stringStream* was handed down to nested print functions
>>>>>>> which
>>>>>>> create
>>>>>>> their own ResourceMarks and, while being down the stack
>>>>>>> under
>>>>>>> the scope of
>>>>>>> that new ResourceMark, the stringStream needed to enlarge
>>>>>>> its
>>>>>>> internal
>>>>>>> buffer. This is the situation the assert inside
>>>>>>> stringStream::write()
>>>>>>> attempts to catch
>>>>>>> (assert(Thread::current()->current_resource_mark() ==
>>>>>>> rm); in our case this was a release build, so we just
>>>>>>> crashed.
>>>>>>>
>>>>>>> The solution for both JDK-816795 and JDK-8149557 seemed to
>>>>>>> be to
>>>>>>> just
>>>>>>> remove the offending ResourceMarks, or shuffle them around,
>>>>>>> but
>>>>>>> generally
>>>>>>> this is not an optimal solution, or?
>>>>>>>
>>>>>>> We actually question whether using resource area memory is a
>>>>>>> good idea for
>>>>>>> outputStream chuild objects at all:
>>>>>>>
>>>>>>> outputStream instances typically travel down the stack a
>>>>>>> lot by
>>>>>>> getting
>>>>>>> handed sub-print-functions, so they run danger of crossing
>>>>>>> resource mark
>>>>>>> boundaries like above. The sub functions are usually
>>>>>>> oblivious
>>>>>>> to the type
>>>>>>> of outputStream* handed down, and as well they should be.
>>>>>>> And if
>>>>>>> the
>>>>>>> allocate resource area memory themselves, it makes sense to
>>>>>>> guard them with
>>>>>>> ResourceMark in case they are called in a loop.
>>>>>>>
>>>>>>> The assert inside stringStream::write() is not a real help
>>>>>>> either, because
>>>>>>> whether or not it hits depends on pure luck - whether the
>>>>>>> realloc code path
>>>>>>> is hit just in the right moment while printing. Which
>>>>>>> depends on
>>>>>>> the buffer
>>>>>>> size and the print history, which is variable, especially
>>>>>>> with
>>>>>>> logging.
>>>>>>>
>>>>>>> The only advantage to using bufferedStream (C-Heap) is a
>>>>>>> small
>>>>>>> performance
>>>>>>> improvement when allocating. The question is whether this is
>>>>>>> really worth
>>>>>>> the risk of using resource area memory in this fashion.
>>>>>>> Especially in the
>>>>>>> context of UL where we are about to do expensive IO
>>>>>>> operations
>>>>>>> (writing to
>>>>>>> log file) or may lock (os::flockfile).
>>>>>>>
>>>>>>> Also, the difference between bufferedStream and stringStream
>>>>>>> might be
>>>>>>> reduced by improving bufferedStream (e.g. by using a member
>>>>>>> char
>>>>>>> array for
>>>>>>> small allocations and delay using malloc() for larger
>>>>>>> arrays.)
>>>>>>>
>>>>>>> What you think? Should we get rid of stringStream and only
>>>>>>> use
>>>>>>> an (possibly
>>>>>>> improved) bufferedStream? I also imagine this could make UL
>>>>>>> coding a bit
>>>>>>> simpler.
>>>>>>>
>>>>>>>
>>>>>>> Not answering your questions, but I want to point out that we
>>>>>>> already have a UL stream that uses C-Heap:
>>>>>>>
>>>>>>> logging/logStream.hpp:
>>>>>>>
>>>>>>> // The backing buffer is allocated in CHeap memory.
>>>>>>> typedef LogStreamBase<bufferedStream> LogStreamCHeap;
>>>>>>>
>>>>>>> StefanK
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Kind Regards, Thomas
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>>
>
>
More information about the hotspot-dev
mailing list