RFR: 8141211: Convert TraceExceptions to Unified Logging
John Rose
john.r.rose at oracle.com
Fri Dec 18 09:59:12 UTC 2015
On Dec 17, 2015, at 9:41 PM, David Holmes <david.holmes at oracle.com> wrote:
>
>> As for decoration in multi-line messages, I don't think it matters.
>
> I think it may need to be configurable.
Yes, probably. A small issue is escaping newlines and
text which looks like log decorators, which otherwise serve
as message delimiters. A larger one is dealing with large log
messages, if they are are so large that they defeat the natural
atomicity of the underlying FILE object. (Thomas alluded to
this.)
In the old logging system, the role of decorations is played by XML
elements and their attributes. The user controls everything,
and is responsible for not emitting broken XML. The API
attempts to make this simple, in most cases. The characters
"<&>" are special-cased by the normal text-writing paths,
so readers won't confuse them with XML decoration syntax.
But there's also a "raw write" API point for burping up
pre-fermented XML.
In the new UL system, it would be reasonable to supply not only
"write this one line, decorated" (as a favored and simple API point)
but also "write decorations now, but wait for more chunks" and
"write another chunk now" and finally "done with this message".
Maybe also "here are some more decorations for everybody".
And maybe the API allows the user to say "trust me, I don't
want more decorations at this point, but here's some raw output
which will look like one or more log messages".
logmessage := allinone | byparts | trustme
allinone := start chunk end
trustme := ( chunk | '\n' )*
byparts := start moredecs ( chunk | linebreak )* end
start := ( '[' decoration ']' )*
moredecs := ( decoration )*
decoration := ( ANY )+
end := '\n'
chunk := ( ANY - '\n' )*
linebreak := continuation '\n'
continuation := ' ' /* or something else */
Each chunk corresponds to a printf or puts call, but with
no preceding decoration or following newline.
Embedded newlines must be processed (at least configurably)
so as to make them unambiguously parseable. At a minimum,
a character (or series of spaces) should be added after every
embedded newline so that it looks unambiguously different from
a decoration.
(For that matter, it is important that the beginning of the first
chunk is unambiguously distinct from additional decoration.
This may be a bug in the current design. Consider the call
log_info(logging)("[123]"). A continuation marker is useful
here also, and also if the line already appears to begin
with a continuation marker. I care about this stuff in part
because automated log parsers don't handle ambiguity
very well.)
(Very long log messages might have to be broken and
continued later, in principle, if they are longer than the
internal buffering of FILE objects. This in turn would
seem to require some sort of sequence numbering
so the chunks can be reassembled. The old system
never had to do this because all the huge stuff
was handled by thread-local logs, with no contention.)
In a log with both XML structure and a UL foundation,
it will be useful to emit mostly long runs of lines, without
much concern for log message boundaries. Occasionally
it will be helpful to ask the UL foundation to spit out
a packet of decorations, but this would be organized
so as not to interrupt the XML. Note that the XML
we generate is extremely line-oriented. Almost all
XML elements are alone on their own lines.
(An exception: Operands in PrintAssembly
output are marked up in place.)
This keeps line-oriented channels and tools
(like Emacs) from being overwhelmed by angle
brackets… thus requiring multi-line log messages
to carry the line-oriented XML.
Although I don't like XML, I don't know a better
way to dump the structures we need to dump.
Running blocks of XML through the logging channels
would seem to require (at a minimum) multi-line log
messages. It also requires either a way to make
regular messages not parse as XML, or else
a reliable way to mark the XML-bearing messages
as such. The old system takes the former path,
while the new system will likely use the latter.
HTH
— John
P.S. For examples of the stuff I'm talking about, turn
on +LogVMOutput or +LogCompilation, and
look inside the resulting hotspot*.log file.
P.P.S. One final bit of design fluff: The old logging system
does not include "tid" in (most) decorations. Instead,
the stream which is concurrently shared (the tty)
will insert a decoration (on a separate line) between
every pair of lines written by different threads.
The decoration says who is writing the following
line. This is easy on both eyes and parsers.
But it requires polymorphic decorations, which
is the only thing XML can do, but which the UL
system does not do. The UL decorations, in order to
be terse, are extremely coy about what they mean.
More information about the hotspot-runtime-dev
mailing list