stringStream in UL and nested ResourceMarks

Marcus Larsson marcus.larsson at oracle.com
Mon Jun 12 08:28:54 UTC 2017


Hi,

Thanks for looking into this.


On 2017-06-12 08:52, Thomas Stüfe wrote:
> Hi guys,
>
> After our friday discussion I try to collect my thoughts about how to 
> solve this. Could you please check if what I think makes sense:
>
> -> LogStream, an outputStream child, is used to give us outputStream 
> compatibility to write to UL. It assembles a line and, once complete, 
> sends it off to UL backend. It needs memory to do this, and it often 
> uses resource memory for this. This is a problem if we cross resource 
> marks.
>
> -> We do not want to use resource memory. Alternatives would be
> a) using a fixed sized char array as member in LogStream. This would 
> be a very simple solution. But that imposes a max. line length, which 
> we do not want(?) and, where LogStream is used as a Stack object, may 
> consume too much stack space.
> b) using malloc-ed memory (perhaps in combination with having a small 
> fixed sized char array for small lines to reduce the number of malloc 
> invocations). But that requires us to be able to free the malloced 
> memory, so the destructor of LogStream must be called.
>
> -> If possible, we want the UL API to stay the same, so we want to 
> avoid having to fix up many UL call sites.
>
> Lets look at alternative (b) closer, because I think (a) may be out of 
> question (?).

Yes, I believe (a) is not an option.

>
> ------
>
> LogStreams are used in three ways which I can see:
>
> A) As a stack object:
>     LogTarget(Error, logging) target;
>     LogStreamCHeap errstream(target);
>     return LogConfiguration::parse_log_arguments(_gc_log_filename, 
> gc_conf, NULL, NULL, &errstream);
>
> No problem here: LogStream is an automatic variable and its destructor 
> will be called when it goes out of scope. So we can free malloced memory.
>
> B) More often, by calling the static method LogImpl<>::xxx_stream():
>
>    Either like this:
>
>     1) Log(jit, compilation) log;
>        if (log.is_debug()) {
>           print_impl(log.debug_stream(), ... );
>        }
>
>    Or like this:
>
>    2)  outputStream* st = Log(stackwalk)::debug_stream();
>
> (So, either via an instance of LogImpl<>, or directly via the class::).
>
> Here, LogImpl<>::debug_stream() creates a LogStream with 
> ResourceObj::operator new(). For this LogStream the destructor cannot 
> be called (not without rewriting the call site).
>
> ----
>
> If we want to keep the syntax of the call sites as much as possible, 
> we need to keep the xxxx_stream() methods as they are.
>
> But instead of allocating the LogStream objects each time anew, we 
> could change LogImpl<> to manage pointers to its LogStream objects. 
> This would help in all those cases where an instance of LogImpl<> is 
> placed on the stack, which seems to be the majority of call sites. So, 
> if there is a LogImpl<> object on the stack, like in (B1), its 
> destructor would run when it goes out of scope and it would clean up 
> its LogStream objects.
>
> Care must be taken that the construction of the LogImpl<> object stays 
> cheap, because we do not want to pay for logging if we do not log. 
> Which means that construction of the LogStream objects must be 
> deferred to when they are actually needed.
>
> for instance:
>
> LogImpl<> {
>   outputStream* _debugStream;
>   outputStream* _errorStream;
>   ...
> public:
>   LogImpl<> :: _debugStream(NULL), _errorStream(NULL), ... {}
>   ~LogImpl<> () {
>     if (_debugStream) { cleanup debugStream; }
>     if (_errorStream) { cleanup errorStream; }
>     ...
>   }
>   debugStream* debug_stream() {
>     if (!_debugStream) _debugStream = new LogStream();
>       return _debugStream;
>     }
>   }
> };
>
> Unfortunately, this would not help for cases where xxx_stream() is 
> called directly as static method (Log(..)::xx_stream()) - I do not see 
> a possibility to keep this syntax. If we go with the "LogImpl<> 
> manages its associated LogStream objects" solution, these sites would 
> have to be rewritten:
>
> From:
>  outputStream* st = Log(stackwalk)::debug_stream();
> to:
>  Log(stackwalk) log;
>  outputStream* st = log.debug_stream();
>
> ---
>
> What do you think? Does any of this make sense?

I think it makes sense.

IMO, we could as well get rid of the xxx_info() functions as a whole, 
and instead always require stack objects (LogStream instance). We would 
need to change some of the call sites with the proposed solution, so why 
not change all of them while we're at it. It saves complexity and makes 
it more obvious to the user where and how the memory is allocated.

Thanks,
Marcus

>
>
> Also, I opened https://bugs.openjdk.java.net/browse/JDK-8181917 to 
> track this issue. Note that even though I assigned this to myself for 
> now, if any of you rather wants to take it this over it is fine by me.
>
>
> Kind regards, Thomas
>
>
>
>
>
> On Sat, Jun 10, 2017 at 8:33 AM, Thomas Stüfe <thomas.stuefe at gmail.com 
> <mailto:thomas.stuefe at gmail.com>> wrote:
>
>     Yes, this seems to be the same issue.
>
>     ..Thomas
>
>     On Fri, Jun 9, 2017 at 10:00 PM, Daniel D. Daugherty
>     <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>>
>     wrote:
>
>         This bug seems relevant to this discussion:
>
>         JDK-8181807 Graal internal error "StringStream is re-allocated
>         with a different ResourceMark"
>         https://bugs.openjdk.java.net/browse/JDK-8181807
>         <https://bugs.openjdk.java.net/browse/JDK-8181807>
>
>         Dan
>
>
>
>         On 6/9/17 1:48 AM, Thomas Stüfe wrote:
>
>             Hi Stefan,
>
>             just a small question to verify that I understood
>             everything correctly.
>
>             The LogStream classes (LogStreamBase and children) are
>             basically the
>             write-to-UL frontend classes, right? Their purpose is to
>             collect input via
>             various print.. methods until \n is encountered, then pipe
>             the assembled
>             line to the UL backend. To do that it needs a backing
>             store for the
>             to-be-assembled-line, and this is the whole reason
>             stringStream is used
>             (via the "streamClass" template parameter for LogStreamBase)?
>
>             So, basically the whole rather involved class tree rooted
>             at LogStreamBase
>             only deals with the various ways that one line backing
>             store is allocated?
>             Including LogStream itself, which contains - I was very
>             surprised to see -
>             an embedded ResourceMark (stringStreamWithResourceMark).
>             There are no other
>             reasons for this ResourceMark?
>
>             I am currently experimenting with changing LogStream to
>             use a simple
>             malloc'd backing store, in combination with a small fixed
>             size member
>             buffer for small lines; I'd like to see if that had any
>             measurable negative
>             performance impact. The coding is halfway done already,
>             but the callers
>             need fixing up, because due to my change LogStreamBase
>             children cannot be
>             allocated with new anymore, because of the
>             ResourceObj-destructor problem.
>
>             What do you think, is this worthwhile and am I overlooking
>             something
>             obvious? The UL coding is quite large, after all.
>
>             Kind Regards, Thomas
>
>
>
>
>
>
>
>
>             On Wed, Jun 7, 2017 at 12:25 PM, Thomas Stüfe
>             <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>>
>             wrote:
>
>                 Hi Stefan,
>
>                 On Wed, Jun 7, 2017 at 12:17 PM, Stefan Karlsson <
>                 stefan.karlsson at oracle.com
>                 <mailto:stefan.karlsson at oracle.com>> wrote:
>
>                     Hi Thomas,
>
>                     On 2017-06-07 11:15, Thomas Stüfe wrote:
>
>                         Hi Stefan,
>
>                         I saw this, but I also see
>                         LogStreamNoResourceMark being used as a
>                         default for the
>                         (trace|debug|info|warning|error)_stream()
>                         methods of
>                         Log. In this form it is used quite a lot.
>
>                         Looking further, I see that one cannot just
>                         exchange
>                         LogStreamNoResourceMark with LogStreamCHeap,
>                         because there are hidden usage
>                         conventions I was not aware of:
>
>                     Just to be clear, I didn't propose that you did a
>                     wholesale replacement
>                     of LogStreamNoResourceMark with LogStreamCHeap. I
>                     merely pointed out the
>                     existence of this class in case you had missed it.
>
>
>                 Sure! I implied this myself with my original post
>                 which proposed to
>                 replace the resource area allocation inside
>                 stringStream with malloc'd
>                 memory.
>
>
>                         LogStreamNoResourceMark  is allocated with
>                         new() in create_log_stream().
>                         LogStreamNoResourceMark is an outputStream,
>                         which is a ResourceObj. In its
>                         current form ResourceObj cannot be deleted, so
>                         destructors for ResourceObj
>                         child cannot be called.
>
>                     By default ResourceObj classes are allocated in
>                     the resource area, but
>                     the class also supports CHeap allocations. For
>                     example, see some of the
>                     allocations of GrowableArray instances:
>
>                     _deallocate_list = new (ResourceObj::C_HEAP, mtClass)
>                     GrowableArray<Metadata*>(100, true);
>
>                     These can still be deleted:
>
>                     delete _deallocate_list;
>
>
>                         So, we could not use malloc in the
>                         stringStream - or exchange
>                         stringStream for bufferedStream - because we
>                         would need a non-empty
>                         destructor to free the malloc'd memory, and
>                         that destructor cannot exist.
>
>                         Looking further, I see that this imposes
>                         subtle usage restrictions for
>                         UL:
>
>                         LogStreamNoResourceMark objects are used via
>                         "log.debug_stream()" or
>                         similar. For example:
>
>                               codecache_print(log.debug_stream(), /*
>                         detailed= */ false);
>
>                         debug_stream() will allocate a
>                         LogStreamNoResourceMark object which
>                         lives in the resourcearea. This is a bit
>                         surprising, because
>                         "debug_stream()" feels like it returns a
>                         singleton or a member variable of
>                         log.
>
>                     IIRC, this was done to:
>
>                     1) break up a cyclic dependencies between
>                     logStream.hpp and log.hpp
>
>                     2) Not have log.hpp depend on the stream.hpp. This
>                     used to be important,
>                     but the includes in stream.hpp has been fixed so
>                     this might be a non-issue.
>
>
>                         If one wants to use LogStreamCHeap instead, it
>                         must not be created with
>                         new() - which would be a subtle memory leak
>                         because the destructor would
>                         never be called - but instead on the stack as
>                         automatic variable:
>
>                               LogStreamCHeap log_stream(log);
>                               log_stream.print("hallo");
>
>                         I may understand this wrong, but if this is
>                         true, this is quite a
>                         difficult API.
>
>                     Feel free to rework this and propose a simpler
>                     model. Anything that would
>                     simplify this would be helpful.
>
>
>                 I will mull over this a bit (and I would be thankful
>                 for other viewpoints
>                 as well). A bottomline question which is difficult to
>                 answer is whether
>                 folks value the slight performance increase of
>                 resource area backed memory
>                 allocation in stringStream more than simplicity and
>                 robustness which would
>                 come with switching to malloced memory. And then,
>                 there is the second
>                 question of why outputStream objects should be
>                 ResourceObj at all; for me,
>                 they feel much more at home as stack objects. They
>                 themselves are small and
>                 do not allocate a lot of memory (if they do, they do
>                 it dynamically). And
>                 they are not allocated in vast amounts...
>
>                 Lets see what others think.
>
>
>                       I have two classes which look like siblings but
>
>                         LogStreamCHeap can only be allocated on the
>                         local stack - otherwise I'll
>                         get a memory leak - while
>                         LogStreamNoResourceMark gets created in the
>                         resource area, which prevents its destructor
>                         from running and may fill the
>                         resource area up with temporary stream objects
>                         if used in a certain way.
>
>                         Have I understood this right so far? If yes,
>                         would it be possible to
>                         simplify this?
>
>                     I think you understand the code correctly, and
>                     yes, there are probably
>                     ways to make this simpler.
>
>
>                 Thanks for your input!
>
>                 Kind regards, Thomas
>
>
>                     Thanks,
>                     StefanK
>
>
>                         Kind Regards, Thomas
>
>
>
>
>
>                         On Wed, Jun 7, 2017 at 9:20 AM, Stefan Karlsson <
>                         stefan.karlsson at oracle.com
>                         <mailto:stefan.karlsson at oracle.com>
>                         <mailto:stefan.karlsson at oracle.com
>                         <mailto:stefan.karlsson at oracle.com>>> wrote:
>
>                              Hi Thomas,
>
>
>                              On 2017-06-06 11:40, Thomas Stüfe wrote:
>
>                                  Hi all,
>
>                                  In our VM we recently hit something
>                         similar to
>                         https://bugs.openjdk.java.net/browse/JDK-8167995
>                         <https://bugs.openjdk.java.net/browse/JDK-8167995>
>                                
>                          <https://bugs.openjdk.java.net/browse/JDK-8167995
>                         <https://bugs.openjdk.java.net/browse/JDK-8167995>>
>                         or
>                         https://bugs.openjdk.java.net/browse/JDK-8149557
>                         <https://bugs.openjdk.java.net/browse/JDK-8149557>
>                                
>                          <https://bugs.openjdk.java.net/browse/JDK-8149557
>                         <https://bugs.openjdk.java.net/browse/JDK-8149557>>:
>
>                                  A stringStream* was handed down to
>                         nested print functions which
>                                  create
>                                  their own ResourceMarks and, while
>                         being down the stack under
>                                  the scope of
>                                  that new ResourceMark, the
>                         stringStream needed to enlarge its
>                                  internal
>                                  buffer. This is the situation the
>                         assert inside
>                                  stringStream::write()
>                                  attempts to catch
>                          (assert(Thread::current()->current_resource_mark()
>                         ==
>                                  rm); in our case this was a release
>                         build, so we just crashed.
>
>                                  The solution for both JDK-816795 and
>                         JDK-8149557 seemed to be to
>                                  just
>                                  remove the offending ResourceMarks,
>                         or shuffle them around, but
>                                  generally
>                                  this is not an optimal solution, or?
>
>                                  We actually question whether using
>                         resource area memory is a
>                                  good idea for
>                                  outputStream chuild objects at all:
>
>                                  outputStream instances typically
>                         travel down the stack a lot by
>                                  getting
>                                  handed sub-print-functions, so they
>                         run danger of crossing
>                                  resource mark
>                                  boundaries like above. The sub
>                         functions are usually oblivious
>                                  to the type
>                                  of outputStream* handed down, and as
>                         well they should be. And if
>                         the
>                                  allocate resource area memory
>                         themselves, it makes sense to
>                                  guard them with
>                                  ResourceMark in case they are called
>                         in a loop.
>
>                                  The assert inside
>                         stringStream::write() is not a real help
>                                  either, because
>                                  whether or not it hits depends on
>                         pure luck - whether the
>                                  realloc code path
>                                  is hit just in the right moment while
>                         printing. Which depends on
>                                  the buffer
>                                  size and the print history, which is
>                         variable, especially with
>                                  logging.
>
>                                  The only advantage to using
>                         bufferedStream (C-Heap) is a small
>                                  performance
>                                  improvement when allocating. The
>                         question is whether this is
>                                  really worth
>                                  the risk of using resource area
>                         memory in this fashion.
>                                  Especially in the
>                                  context of UL where we are about to
>                         do expensive IO operations
>                                  (writing to
>                                  log file) or may lock (os::flockfile).
>
>                                  Also, the difference between
>                         bufferedStream and stringStream
>                                  might be
>                                  reduced by improving bufferedStream
>                         (e.g. by using a member char
>                                  array for
>                                  small allocations and delay using
>                         malloc() for larger arrays.)
>
>                                  What you think? Should we get rid of
>                         stringStream and only use
>                                  an (possibly
>                                  improved) bufferedStream? I also
>                         imagine this could make UL
>                                  coding a bit
>                                  simpler.
>
>
>                              Not answering your questions, but I want
>                         to point out that we
>                              already have a UL stream that uses C-Heap:
>
>                              logging/logStream.hpp:
>
>                              // The backing buffer is allocated in
>                         CHeap memory.
>                              typedef LogStreamBase<bufferedStream>
>                         LogStreamCHeap;
>
>                              StefanK
>
>
>
>                                  Thank you,
>
>                                  Kind Regards, Thomas
>
>
>
>
>
>



More information about the hotspot-dev mailing list