stringStream in UL and nested ResourceMarks

Mon Jun 12 15:47:22 UTC 2017

On Mon, Jun 12, 2017 at 12:16 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Marcus,
>
> On Mon, Jun 12, 2017 at 10:28 AM, Marcus Larsson <
> marcus.larsson at oracle.com> wrote:
>
>> Hi,
>>
>> Thanks for looking into this.
>>
>> On 2017-06-12 08:52, Thomas Stüfe wrote:
>>
>> Hi guys,
>>
>> After our friday discussion I try to collect my thoughts about how to
>> solve this. Could you please check if what I think makes sense:
>>
>> -> LogStream, an outputStream child, is used to give us outputStream
>> compatibility to write to UL. It assembles a line and, once complete, sends
>> it off to UL backend. It needs memory to do this, and it often uses
>> resource memory for this. This is a problem if we cross resource marks.
>>
>> -> We do not want to use resource memory. Alternatives would be
>> a) using a fixed sized char array as member in LogStream. This would be a
>> very simple solution. But that imposes a max. line length, which we do not
>> want(?) and, where LogStream is used as a Stack object, may consume too
>> much stack space.
>> b) using malloc-ed memory (perhaps in combination with having a small
>> fixed sized char array for small lines to reduce the number of malloc
>> invocations). But that requires us to be able to free the malloced memory,
>> so the destructor of LogStream must be called.
>>
>> -> If possible, we want the UL API to stay the same, so we want to avoid
>> having to fix up many UL call sites.
>>
>> Lets look at alternative (b) closer, because I think (a) may be out of
>> question (?).
>>
>>
>> Yes, I believe (a) is not an option.
>>
>>
>>
>> ------
>>
>> LogStreams are used in three ways which I can see:
>>
>> A) As a stack object:
>>     LogTarget(Error, logging) target;
>>     LogStreamCHeap errstream(target);
>>     return LogConfiguration::parse_log_arguments(_gc_log_filename,
>> gc_conf, NULL, NULL, &errstream);
>>
>> No problem here: LogStream is an automatic variable and its destructor
>> will be called when it goes out of scope. So we can free malloced memory.
>>
>> B) More often, by calling the static method LogImpl<>::xxx_stream():
>>
>>    Either like this:
>>
>>     1) Log(jit, compilation) log;
>>        if (log.is_debug()) {
>>           print_impl(log.debug_stream(), ... );
>>        }
>>
>>    Or like this:
>>
>>    2)  outputStream* st = Log(stackwalk)::debug_stream();
>>
>> (So, either via an instance of LogImpl<>, or directly via the class::).
>>
>> Here, LogImpl<>::debug_stream() creates a LogStream with
>> ResourceObj::operator new(). For this LogStream the destructor cannot be
>> called (not without rewriting the call site).
>>
>> ----
>>
>> If we want to keep the syntax of the call sites as much as possible, we
>> need to keep the xxxx_stream() methods as they are.
>>
>> But instead of allocating the LogStream objects each time anew, we could
>> change LogImpl<> to manage pointers to its LogStream objects. This would
>> help in all those cases where an instance of LogImpl<> is placed on the
>> stack, which seems to be the majority of call sites. So, if there is a
>> LogImpl<> object on the stack, like in (B1), its destructor would run when
>> it goes out of scope and it would clean up its LogStream objects.
>>
>> Care must be taken that the construction of the LogImpl<> object stays
>> cheap, because we do not want to pay for logging if we do not log. Which
>> means that construction of the LogStream objects must be deferred to when
>> they are actually needed.
>>
>> for instance:
>>
>> LogImpl<> {
>>   outputStream* _debugStream;
>>   outputStream* _errorStream;
>>   ...
>> public:
>>   LogImpl<> :: _debugStream(NULL), _errorStream(NULL), ... {}
>>   ~LogImpl<> () {
>>     if (_debugStream) { cleanup debugStream; }
>>     if (_errorStream) { cleanup errorStream; }
>>     ...
>>   }
>>   debugStream* debug_stream() {
>>     if (!_debugStream) _debugStream = new LogStream();
>>       return _debugStream;
>>     }
>>   }
>> };
>>
>> Unfortunately, this would not help for cases where xxx_stream() is called
>> directly as static method (Log(..)::xx_stream()) - I do not see a
>> possibility to keep this syntax. If we go with the "LogImpl<> manages its
>> associated LogStream objects" solution, these sites would have to be
>> rewritten:
>>
>> From:
>>  outputStream* st = Log(stackwalk)::debug_stream();
>> to:
>>  Log(stackwalk) log;
>>  outputStream* st = log.debug_stream();
>>
>> ---
>>
>> What do you think? Does any of this make sense?
>>
>>
>> I think it makes sense.
>>
>> IMO, we could as well get rid of the xxx_info() functions as a whole, and
>> instead always require stack objects (LogStream instance). We would need to
>> change some of the call sites with the proposed solution, so why not change
>> all of them while we're at it. It saves complexity and makes it more
>> obvious to the user where and how the memory is allocated.
>>
>>
> I'd be all for it but if we really change the UL Api that drastically and
> fix up all call sites, will this not make it harder to backport it to jdk9
> if we want that?
>
> Other than this worry, I like it, and it is also simpler than my proposal.
> I will attempt a patch reworking UL in that matter: remove the xx_stream()
> functions and make LogStream malloc-backed and stack-allocatable-only.
>
>

Okay, I put a bit of work into it (stack-only LogStream) and encountered
nothing difficult, just a lot of boilerplate work.

This is how I am changing the callsites:

http://cr.openjdk.java.net/~stuefe/webrevs/8181917-UL-should-not-use-resource-memory-for-LogStream/current-work/webrev/

Please tell me if you had something different in mind; I want to make sure
we have the same idea before changing such a lot of code.

Notes:
- I only remove ResourceMark objects where I can be sure that they are
intended to scope the LogStream itself. If I cannot be sure, I leave them
in for now.
- using LogStream as a stack object means we need to include logStream.hpp
(or decide to move LogStream to log.hpp).
- If there is only one debug level involved (true in a lot of places), I
prefer LogTarget to Log, because it minimizes error possibilities
- In some places the coding gets more complicated, usually if the code
wants to decide the log level dynamically, but the log level is a template
parameter to LogTarget. See e.g. linkresolver.cpp.
- I see the macro "log_develop_is_enabled" and wonder why it exists and why
the code is not just surrounded by #ifndef PRODUCT?

Kind Regards, Thomas

> ...Thomas
>
> Thanks,
>> Marcus
>>
>>
>>
>>
>> Also, I opened https://bugs.openjdk.java.net/browse/JDK-8181917 to track
>> this issue. Note that even though I assigned this to myself for now, if any
>> of you rather wants to take it this over it is fine by me.
>>
>>
>> Kind regards, Thomas
>>
>>
>>
>>
>>
>> On Sat, Jun 10, 2017 at 8:33 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
>> wrote:
>>
>>> Yes, this seems to be the same issue.
>>>
>>> ..Thomas
>>>
>>> On Fri, Jun 9, 2017 at 10:00 PM, Daniel D. Daugherty <
>>> daniel.daugherty at oracle.com> wrote:
>>>
>>>> This bug seems relevant to this discussion:
>>>>
>>>> JDK-8181807 Graal internal error "StringStream is re-allocated with a
>>>> different ResourceMark"
>>>> https://bugs.openjdk.java.net/browse/JDK-8181807
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 6/9/17 1:48 AM, Thomas Stüfe wrote:
>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> just a small question to verify that I understood everything correctly.
>>>>>
>>>>> The LogStream classes (LogStreamBase and children) are basically the
>>>>> write-to-UL frontend classes, right? Their purpose is to collect input
>>>>> via
>>>>> various print.. methods until \n is encountered, then pipe the
>>>>> assembled
>>>>> line to the UL backend. To do that it needs a backing store for the
>>>>> to-be-assembled-line, and this is the whole reason stringStream is used
>>>>> (via the "streamClass" template parameter for LogStreamBase)?
>>>>>
>>>>> So, basically the whole rather involved class tree rooted at
>>>>> LogStreamBase
>>>>> only deals with the various ways that one line backing store is
>>>>> allocated?
>>>>> Including LogStream itself, which contains - I was very surprised to
>>>>> see -
>>>>> an embedded ResourceMark (stringStreamWithResourceMark). There are no
>>>>> other
>>>>> reasons for this ResourceMark?
>>>>>
>>>>> I am currently experimenting with changing LogStream to use a simple
>>>>> malloc'd backing store, in combination with a small fixed size member
>>>>> buffer for small lines; I'd like to see if that had any measurable
>>>>> negative
>>>>> performance impact. The coding is halfway done already, but the callers
>>>>> need fixing up, because due to my change LogStreamBase children cannot
>>>>> be
>>>>> allocated with new anymore, because of the ResourceObj-destructor
>>>>> problem.
>>>>>
>>>>> What do you think, is this worthwhile and am I overlooking something
>>>>> obvious? The UL coding is quite large, after all.
>>>>>
>>>>> Kind Regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe <thomas.stuefe at gmail.com
>>>>> >
>>>>> wrote:
>>>>>
>>>>> Hi Stefan,
>>>>>>
>>>>>> On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
>>>>>> stefan.karlsson at oracle.com> wrote:
>>>>>>
>>>>>> Hi Thomas,
>>>>>>>
>>>>>>> On 2017-06-07 11:15, Thomas Stüfe wrote:
>>>>>>>
>>>>>>> Hi Stefan,
>>>>>>>>
>>>>>>>> I saw this, but I also see LogStreamNoResourceMark being used as a
>>>>>>>> default for the (trace|debug|info|warning|error)_stream() methods
>>>>>>>> of
>>>>>>>> Log. In this form it is used quite a lot.
>>>>>>>>
>>>>>>>> Looking further, I see that one cannot just exchange
>>>>>>>> LogStreamNoResourceMark with LogStreamCHeap, because there are
>>>>>>>> hidden usage
>>>>>>>> conventions I was not aware of:
>>>>>>>>
>>>>>>>> Just to be clear, I didn't propose that you did a wholesale
>>>>>>> replacement
>>>>>>> of LogStreamNoResourceMark with LogStreamCHeap. I merely pointed out
>>>>>>> the
>>>>>>> existence of this class in case you had missed it.
>>>>>>>
>>>>>>>
>>>>>>> Sure! I implied this myself with my original post which proposed to
>>>>>> replace the resource area allocation inside stringStream with malloc'd
>>>>>> memory.
>>>>>>
>>>>>>
>>>>>> LogStreamNoResourceMark  is allocated with new() in
>>>>>>>> create_log_stream().
>>>>>>>> LogStreamNoResourceMark is an outputStream, which is a ResourceObj.
>>>>>>>> In its
>>>>>>>> current form ResourceObj cannot be deleted, so destructors for
>>>>>>>> ResourceObj
>>>>>>>> child cannot be called.
>>>>>>>>
>>>>>>>> By default ResourceObj classes are allocated in the resource area,
>>>>>>> but
>>>>>>> the class also supports CHeap allocations. For example, see some of
>>>>>>> the
>>>>>>> allocations of GrowableArray instances:
>>>>>>>
>>>>>>> _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>>>>>>> GrowableArray<Metadata*>(100, true);
>>>>>>>
>>>>>>> These can still be deleted:
>>>>>>>
>>>>>>> delete _deallocate_list;
>>>>>>>
>>>>>>>
>>>>>>> So, we could not use malloc in the stringStream - or exchange
>>>>>>>> stringStream for bufferedStream - because we would need a non-empty
>>>>>>>> destructor to free the malloc'd memory, and that destructor cannot
>>>>>>>> exist.
>>>>>>>>
>>>>>>>> Looking further, I see that this imposes subtle usage restrictions
>>>>>>>> for
>>>>>>>> UL:
>>>>>>>>
>>>>>>>> LogStreamNoResourceMark objects are used via "log.debug_stream()" or
>>>>>>>> similar. For example:
>>>>>>>>
>>>>>>>>       codecache_print(log.debug_stream(), /* detailed= */ false);
>>>>>>>>
>>>>>>>> debug_stream() will allocate a LogStreamNoResourceMark object which
>>>>>>>> lives in the resourcearea. This is a bit surprising, because
>>>>>>>> "debug_stream()" feels like it returns a singleton or a member
>>>>>>>> variable of
>>>>>>>> log.
>>>>>>>>
>>>>>>>> IIRC, this was done to:
>>>>>>>
>>>>>>> 1) break up a cyclic dependencies between logStream.hpp and log.hpp
>>>>>>>
>>>>>>> 2) Not have log.hpp depend on the stream.hpp. This used to be
>>>>>>> important,
>>>>>>> but the includes in stream.hpp has been fixed so this might be a
>>>>>>> non-issue.
>>>>>>>
>>>>>>>
>>>>>>> If one wants to use LogStreamCHeap instead, it must not be created
>>>>>>>> with
>>>>>>>> new() - which would be a subtle memory leak because the destructor
>>>>>>>> would
>>>>>>>> never be called - but instead on the stack as automatic variable:
>>>>>>>>
>>>>>>>>       LogStreamCHeap log_stream(log);
>>>>>>>>       log_stream.print("hallo");
>>>>>>>>
>>>>>>>> I may understand this wrong, but if this is true, this is quite a
>>>>>>>> difficult API.
>>>>>>>>
>>>>>>>> Feel free to rework this and propose a simpler model. Anything that
>>>>>>> would
>>>>>>> simplify this would be helpful.
>>>>>>>
>>>>>>>
>>>>>>> I will mull over this a bit (and I would be thankful for other
>>>>>> viewpoints
>>>>>> as well). A bottomline question which is difficult to answer is
>>>>>> whether
>>>>>> folks value the slight performance increase of resource area backed
>>>>>> memory
>>>>>> allocation in stringStream more than simplicity and robustness which
>>>>>> would
>>>>>> come with switching to malloced memory. And then, there is the second
>>>>>> question of why outputStream objects should be ResourceObj at all;
>>>>>> for me,
>>>>>> they feel much more at home as stack objects. They themselves are
>>>>>> small and
>>>>>> do not allocate a lot of memory (if they do, they do it dynamically).
>>>>>> And
>>>>>> they are not allocated in vast amounts...
>>>>>>
>>>>>> Lets see what others think.
>>>>>>
>>>>>>
>>>>>>   I have two classes which look like siblings but
>>>>>>>
>>>>>>> LogStreamCHeap can only be allocated on the local stack - otherwise
>>>>>>>> I'll
>>>>>>>> get a memory leak - while LogStreamNoResourceMark gets created in
>>>>>>>> the
>>>>>>>> resource area, which prevents its destructor from running and may
>>>>>>>> fill the
>>>>>>>> resource area up with temporary stream objects if used in a certain
>>>>>>>> way.
>>>>>>>>
>>>>>>>> Have I understood this right so far? If yes, would it be possible to
>>>>>>>> simplify this?
>>>>>>>>
>>>>>>>> I think you understand the code correctly, and yes, there are
>>>>>>> probably
>>>>>>> ways to make this simpler.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for your input!
>>>>>>
>>>>>> Kind regards, Thomas
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>>
>>>>>>> Kind Regards, Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>>>>>>>> stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>      Hi Thomas,
>>>>>>>>
>>>>>>>>
>>>>>>>>      On 2017-06-06 11:40, Thomas Stüfe wrote:
>>>>>>>>
>>>>>>>>          Hi all,
>>>>>>>>
>>>>>>>>          In our VM we recently hit something similar to
>>>>>>>>          https://bugs.openjdk.java.net/browse/JDK-8167995
>>>>>>>>          <https://bugs.openjdk.java.net/browse/JDK-8167995> or
>>>>>>>>          https://bugs.openjdk.java.net/browse/JDK-8149557
>>>>>>>>          <https://bugs.openjdk.java.net/browse/JDK-8149557>:
>>>>>>>>
>>>>>>>>          A stringStream* was handed down to nested print functions
>>>>>>>> which
>>>>>>>>          create
>>>>>>>>          their own ResourceMarks and, while being down the stack
>>>>>>>> under
>>>>>>>>          the scope of
>>>>>>>>          that new ResourceMark, the stringStream needed to enlarge
>>>>>>>> its
>>>>>>>>          internal
>>>>>>>>          buffer. This is the situation the assert inside
>>>>>>>>          stringStream::write()
>>>>>>>>          attempts to catch
>>>>>>>>          (assert(Thread::current()->current_resource_mark() ==
>>>>>>>>          rm); in our case this was a release build, so we just
>>>>>>>> crashed.
>>>>>>>>
>>>>>>>>          The solution for both JDK-816795 and JDK-8149557 seemed to
>>>>>>>> be to
>>>>>>>>          just
>>>>>>>>          remove the offending ResourceMarks, or shuffle them
>>>>>>>> around, but
>>>>>>>>          generally
>>>>>>>>          this is not an optimal solution, or?
>>>>>>>>
>>>>>>>>          We actually question whether using resource area memory is
>>>>>>>> a
>>>>>>>>          good idea for
>>>>>>>>          outputStream chuild objects at all:
>>>>>>>>
>>>>>>>>          outputStream instances typically travel down the stack a
>>>>>>>> lot by
>>>>>>>>          getting
>>>>>>>>          handed sub-print-functions, so they run danger of crossing
>>>>>>>>          resource mark
>>>>>>>>          boundaries like above. The sub functions are usually
>>>>>>>> oblivious
>>>>>>>>          to the type
>>>>>>>>          of outputStream* handed down, and as well they should be.
>>>>>>>> And if
>>>>>>>> the
>>>>>>>>          allocate resource area memory themselves, it makes sense to
>>>>>>>>          guard them with
>>>>>>>>          ResourceMark in case they are called in a loop.
>>>>>>>>
>>>>>>>>          The assert inside stringStream::write() is not a real help
>>>>>>>>          either, because
>>>>>>>>          whether or not it hits depends on pure luck - whether the
>>>>>>>>          realloc code path
>>>>>>>>          is hit just in the right moment while printing. Which
>>>>>>>> depends on
>>>>>>>>          the buffer
>>>>>>>>          size and the print history, which is variable, especially
>>>>>>>> with
>>>>>>>>          logging.
>>>>>>>>
>>>>>>>>          The only advantage to using bufferedStream (C-Heap) is a
>>>>>>>> small
>>>>>>>>          performance
>>>>>>>>          improvement when allocating. The question is whether this
>>>>>>>> is
>>>>>>>>          really worth
>>>>>>>>          the risk of using resource area memory in this fashion.
>>>>>>>>          Especially in the
>>>>>>>>          context of UL where we are about to do expensive IO
>>>>>>>> operations
>>>>>>>>          (writing to
>>>>>>>>          log file) or may lock (os::flockfile).
>>>>>>>>
>>>>>>>>          Also, the difference between bufferedStream and
>>>>>>>> stringStream
>>>>>>>>          might be
>>>>>>>>          reduced by improving bufferedStream (e.g. by using a
>>>>>>>> member char
>>>>>>>>          array for
>>>>>>>>          small allocations and delay using malloc() for larger
>>>>>>>> arrays.)
>>>>>>>>
>>>>>>>>          What you think? Should we get rid of stringStream and only
>>>>>>>> use
>>>>>>>>          an (possibly
>>>>>>>>          improved) bufferedStream? I also imagine this could make UL
>>>>>>>>          coding a bit
>>>>>>>>          simpler.
>>>>>>>>
>>>>>>>>
>>>>>>>>      Not answering your questions, but I want to point out that we
>>>>>>>>      already have a UL stream that uses C-Heap:
>>>>>>>>
>>>>>>>>      logging/logStream.hpp:
>>>>>>>>
>>>>>>>>      // The backing buffer is allocated in CHeap memory.
>>>>>>>>      typedef LogStreamBase<bufferedStream> LogStreamCHeap;
>>>>>>>>
>>>>>>>>      StefanK
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>          Thank you,
>>>>>>>>
>>>>>>>>          Kind Regards, Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>
>>
>>
>