Unified Logging for network

Wed Nov 4 08:12:22 UTC 2020

Hi, Yasumasa,

On 2020-11-04 02:38, Yasumasa Suenaga wrote:
> As I said in reply to Thomas [1], I don't want to dump UL to files
> because it should be one log file (should not rotate), so the file
> size might become large.
> That might cause disk usage problems in a container.

Sorry, I don't follow, so I'm guessing at what you mean.

The direct output from UL is only used by syslog, so something like:
"-Xlog:all=error:file=my_vm.%p.log::filecount=2,filesize=1m" should be
fine.
- Use syslog to collect from my_vm.*.log.*
- Configure syslog to output both to that one large file you mentioned
and over network to your e.g. elastic search.
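
A rough sketch of the JVM side of that setup, written as a small launcher
script. LOG_DIR and APP_JAR are placeholder names made up for illustration;
the -Xlog option is the one quoted above. As far as I know, rotated files
get a numeric suffix (my_vm.<pid>.log.0, my_vm.<pid>.log.1), which is what
the my_vm.*.log.* pattern is meant to pick up.

```shell
#!/bin/sh
# Sketch only: LOG_DIR and APP_JAR are hypothetical placeholders.
LOG_DIR="${LOG_DIR:-/tmp}"

# Errors from all tags, written to a per-process file (%p expands to the pid),
# default decorators, at most 2 rotated files of roughly 1 MB each.
XLOG_OPT="-Xlog:all=error:file=${LOG_DIR}/my_vm.%p.log::filecount=2,filesize=1m"

echo "$XLOG_OPT"

# Only launch the JVM when an application jar was actually supplied.
if [ -n "${APP_JAR:-}" ]; then
    java "$XLOG_OPT" -jar "$APP_JAR"
fi
```

The disk usage in the container is then bounded by the rotation settings,
while the syslog side decides where the collected lines end up.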

If this is not sufficient, then implementing syslog/Windows event log
output is a much better and simpler alternative, as also mentioned in the JEP:
https://openjdk.java.net/jeps/158

E.g.

"-Xlog:all=error:syslog=my_vm"

Thanks, Robbin

> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> [1] 
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043342.html 
> 
> 
> 
>> /Robbin
>>
>> On 2020-11-03 09:09, Thomas Stüfe wrote:
>>> Hi Yasumasa,
>>>
>>> I don't argue that such a feature would not be useful. Of course it 
>>> would!
>>>
>>> But as with any other added feature, it will come at the cost of
>>> complexity. It will have to be maintained, tests will have to be written
>>> and run. That increases technical debt for us all.
>>>
>>> That is not a reason not to do it, but to think before doing it and
>>> exploring alternatives.
>>>
>>> -- 
>>>
>>> To me, the fact that a logging call now could possibly do Network IO
>>> fills me with deep unease. It violates the principle of least surprise.
>>> Logging should be as basic as possible, in order to be usable anywhere
>>> in code.
>>>
>>> - as had been said before, it would introduce unpredictable timing
>>> behavior. The fact that we have this already today is not a big
>>> consolation :(
>>>
>>> - similar to "the User should know what he does" argument -
>>> unfortunately many don't, so a balance has to be found to limit support
>>> from these cases
>>>
>>> - AFAICS we do not do network IO anywhere in hotspot today. That coding
>>> would have to be written and tested. Reusing some other code - e.g. from
>>> the corelibs - is out of the question for such a low level API, since
>>> you don't want to risk circularities.
>>>
>>> - But now we have a complete network stack below the innocuous logging
>>> call. This imposes further restrictions on where we can log - e.g. even
>>> if it were possible before, logging from signal handling is impossible
>>> now, without these restrictions being documented and tested anywhere.
>>> To me this makes UL more and more questionable, and I already tend to
>>> shun it when possible in favour of plain tty printing.
>>>
>>> I argued yesterday against Ioi's concurrent-log-draining, but that is
>>> actually more attractive the more I think about it.
>>>
>>> Only, could the same not be achieved with piping stdout/err to a
>>> separate tool like netcat, as Leo suggested?
>>>
>>> That solution exists today. If netcat does not do it for you, this could
>>> also be a separate utility - could be even part of the jdk. Conceptually
>>> this would be much the same as a separate thread printing out UL, with
>>> the pipe size being the buffer size. Or, communication could happen via
>>> shared memory...
>>>
>>> This would have two distinct advantages over doing network IO in UL:
>>> - we see the whole stderr output (e.g. output from the libc, or any
>>> third party tools)
>>> - we see output also from a VM which crashed and burned. E.g. any last
>>> words from hs-err reporting.
>>>
>>> Cheers, Thomas
>>>
>>>
>>> On Tue, Nov 3, 2020 at 3:12 AM Yasumasa Suenaga 
>>> <suenaga at oss.nttdata.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I agree this proposal might cause performance issues. However, I think
>>>> that is the responsibility of the user.
>>>> If this proposal is implemented, I think the log would in most cases
>>>> be sent to a local log shipper process (fluentd, logstash on
>>>> 127.0.0.1), because HotSpot does not emit logs as JSON. The log shipper
>>>> may also support message buffering and message queue persistence.
>>>> We can avoid (part of) the performance/reliability issues with a log
>>>> shipper.
>>>>
>>>> Even with the current implementation, performance issues occur when
>>>> the disk is very slow (e.g. storage is broken).
>>>>
>>>>
>>>> Cheers,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/11/03 6:31, Thomas Stüfe wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> I dimly remember proposals like this from the past. Main problem I
>>>>> see is how large would you dimension the buffer, and what do you do
>>>>> if the buffer cannot be drained rapidly enough. Discard log output?
>>>>> Hold? The former sounds bad, the latter negates the advantages of
>>>>> such a buffer.
>>>>>
>>>>> Then, access to such a buffer would probably have to be synchronized,
>>>>> whereas today AFAIK the log calls do not have to be.
>>>>>
>>>>> Cheers, Thomas
>>>>>
>>>>> On Mon 2. Nov 2020 at 22:18, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>
>>>>>> For performance, maybe the implementation can log into a memory
>>>>>> buffer, and use a worker thread to send the output over the
>>>>>> network? That way we can minimize the overhead per log_xxx() call.
>>>>>>
>>>>>> I agree that using "-Xlog:foo=debug:network=xyz.com:1234" would be
>>>>>> quite handy when you have lots of containers. You don't need to
>>>>>> enable remote access to the container's file system just to get to
>>>>>> the log file.
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>> On 11/2/20 11:10 AM, Kirk Pepperdine wrote:
>>>>>>> Hi Thomas,
>>>>>>>
>>>>>>> I appreciate Yasumasa’s desire to be able to redirect UL output to
>>>>>>> somewhere other than… I also appreciate that the highly granular
>>>>>>> nature of how UL messages are currently structured can be, and
>>>>>>> indeed is, an issue. That said, I’d also like the ability to push
>>>>>>> the data to somewhere other than a file on disk.
>>>>>>>
>>>>>>> To the point of granularity, UL might benefit from some message
>>>>>>> coarsening. This might also help with other logging related
>>>>>>> performance issues that I’ve noted here and there. Quite frankly,
>>>>>>> dealing with logs in containers isn’t a wonderful experience. And
>>>>>>> while I firmly believe that there is more that containers can do
>>>>>>> to ease this, being able to redirect output to something other
>>>>>>> than a log file does feel like it would be helpful. That said, I’m
>>>>>>> also concerned about the potential performance impacts, but I
>>>>>>> think for the things that one would generally log, this should be
>>>>>>> minimal.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Kirk Pepperdine
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 2, 2020, at 4:26 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> one problem I see is that this could introduce a surprising
>>>>>>>> amount of lag into log() calls which do look inconspicuous,
>>>>>>>> thereby distorting timing behavior or even create timeout
>>>>>>>> effects. We already have that problem now to some degree when
>>>>>>>> logging to network shares.
>>>>>>>>
>>>>>>>> Another thing, log output can be very fine granular, which would
>>>>>>>> create a lot of network traffic.
>>>>>>>>
>>>>>>>> Such an addition may also open some security questions.
>>>>>>>>
>>>>>>>> From a more philosophical standpoint, I like the "do one thing
>>>>>>>> and do it right" Unix way and this seems more like something an
>>>>>>>> outside tool should be doing. Which could also aggregate log
>>>>>>>> output better. But I admit that argument is weak.
>>>>>>>>
>>>>>>>> Cheers, Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 2, 2020 at 12:21 PM Yasumasa Suenaga <
>>>>>> suenaga at oss.nttdata.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Currently we have to output UL to stdout and/or a file. If we
>>>>>>>>> could output it to a TCP socket, I think it would be useful.
>>>>>>>>>
>>>>>>>>> For example, some systems gather all logs into document
>>>>>>>>> oriented databases (e.g. Elasticsearch) and/or cloud monitoring
>>>>>>>>> platforms (e.g. CloudWatch). If HotSpot can output UL to a TCP
>>>>>>>>> socket, we can send all logs to them via a TCP input plugin
>>>>>>>>> (Fluentd, Logstash).
>>>>>>>>>
>>>>>>>>> I think it is useful for container platforms. What do you think?
>>>>>>>>> If it is worth working on, I will file a CSR and a JBS ticket,
>>>>>>>>> and will also create a patch.
>>>>>>>>>
>>>>>>
>>>>>>
>>>>