Unified Logging for network

Wed Nov 4 10:06:24 UTC 2020

Hi Robbin,

On 2020/11/04 17:12, Robbin Ehn wrote:
> Hi, Yasumasa,
> 
> On 2020-11-04 02:38, Yasumasa Suenaga wrote:
>> As I said in my reply to Thomas [1], I don't want to dump UL to files, because it would have to be a single log file (without rotation), so the file could grow large.
>> That might cause disk usage problems in containers.
> 
> Sorry I don't follow, so I'm guessing here what you mean.
> 
> The direct output from UL is only used by syslog, so something like:
> "-Xlog:all=error:file=my_vm.%p.log::filecount=2,filesize=1m" should be
> fine.
> - Use syslog to collect from my_vm.*.log.*

IIUC, with this solution we need to configure "my_vm.log.current" as a text source in syslog-ng.
However, my_vm.log.current would be reopened when log rotation happens, so syslog-ng might not follow the current log.
To follow all logs, we would have to disable log rotation, and the log file would become large.

> - Configure syslog to output both to that one large file you mentioned
> and over network to your e.g. elastic search.
> 
> If this is not sufficient, then implementing syslog/Windows event log output is a much better and simpler alternative, as also mentioned in the JEP:
> https://openjdk.java.net/jeps/158
> 
> E.g.
> 
> "-Xlog:all=error:syslog=my_vm"

"Future possible extensions" JEP 158 mentions socket output as a backend.
I think this proposal is the same as that.
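
For reference, the buffered variant discussed further down the thread (log calls append to a bounded in-memory buffer, and a worker thread drains it to a TCP socket) could look roughly like the Python sketch below. The class name BufferedSocketLog, the default capacity, and the drop-when-full policy are all assumptions made for illustration; this is not HotSpot code and not a proposed design.

```python
import queue
import socket
import threading


class BufferedSocketLog:
    """Bounded in-memory log buffer drained to a TCP socket by a worker."""

    def __init__(self, host, port, capacity=1024):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # approximate counter; not synchronized
        self._sock = socket.create_connection((host, port))
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, message):
        """Enqueue without blocking; drop the message if the buffer is full."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _drain(self):
        # All network I/O happens here, never on the thread calling log().
        while True:
            message = self._queue.get()
            if message is None:  # sentinel from close()
                break
            try:
                self._sock.sendall(message.encode("utf-8") + b"\n")
            except OSError:
                self.dropped += 1  # receiver is gone; discard the message

    def close(self):
        """Flush queued messages, then stop the worker and close the socket."""
        self._queue.put(None)
        self._worker.join()
        self._sock.close()
```

The sketch picks "discard" when the buffer cannot be drained fast enough; choosing "hold" instead would mean replacing put_nowait() with a blocking put(), which brings back the latency in the log call.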

Thanks,

Yasumasa

> Thanks, Robbin
> 
> 
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043342.html
>>
>>
>>> /Robbin
>>>
>>> On 2020-11-03 09:09, Thomas Stüfe wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I don't argue that such a feature would not be useful. Of course it would!
>>>>
>>>> But as with any other added feature, it will come at the cost of
>>>> complexity. It will have to be maintained, tests will have to be written
>>>> and run. That increases technical debt for us all.
>>>>
>>>> That is not a reason not to do it, but to think before doing it and
>>>> exploring alternatives.
>>>>
>>>> -- 
>>>>
>>>> To me, the fact that a logging call now could possibly do Network IO fills
>>>> me with deep unease. It violates the principle of least surprise. Logging
>>>> should be as basic as possible, in order to be usable anywhere in code.
>>>>
>>>> - as had been said before, it would introduce unpredictable timing
>>>> behavior. The fact that we have this already today is not a big consolation
>>>> :(
>>>>
>>>> - similar to "the User should know what he does" argument - unfortunately
>>>> many don't, so a balance has to be found to limit support from these cases
>>>>
>>>> - AFAICS we do not do network IO anywhere in hotspot today. That coding
>>>> would have to be written and tested. Reusing some other code - e.g. from
>>>> the corelibs - is out of the question for such a low level API, since you
>>>> don't want to risk circularities.
>>>>
>>>> - But now we have a complete network stack below the innocuous logging
>>>> call. This imposes further restrictions on where we can log - eg even if it
>>>> were possible before, logging from signal handling is impossible now.
>>>> Without these restrictions documented and tested anywhere. To me this makes
>>>> UL more and more questionable, and I already tend to shun it when possible
>>>> in favour of plain tty printing.
>>>>
>>>> I argued yesterday against Ioi's concurrent-log-draining, but that is
>>>> actually more attractive the more I think about it.
>>>>
>>>> Only, could the same not be achieved with piping stdout/err to a separate
>>>> tool like netcat, as Leo suggested?
>>>>
>>>> That solution exists today. If netcat does not do it for you, this could
>>>> also be a separate utility - could be even part of the jdk. Conceptually
>>>> this would be much the same as a separate thread printing out UL, with the
>>>> pipe size being the buffer size. Or, communication could happen via shared
>>>> memory...
>>>>
>>>> This would have two distinct advantages over doing network IO in UL:
>>>> - we see the whole stderr output (e.g. output from the libc, or any third
>>>> party tools)
>>>> - we see output also from a VM which crashed and burned. E.g. any last
>>>> words from hs-err reporting.
>>>>
>>>> Cheers, Thomas
>>>>
>>>>
>>>> On Tue, Nov 3, 2020 at 3:12 AM Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I agree that this proposal might cause performance issues. However, I think
>>>>> that is the responsibility of the user.
>>>>> If this proposal is implemented, I think logs would in most cases be sent to
>>>>> a local log shipper process (fluentd, logstash on 127.0.0.1), because
>>>>> HotSpot does not emit logs as JSON. A log shipper may also support
>>>>> message buffering and message queue persistence.
>>>>> With a log shipper we can avoid (part of) the performance/reliability issues.
>>>>>
>>>>> Even with the current implementation, performance issues occur when the disk
>>>>> is very slow (e.g. the storage is broken).
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/11/03 6:31, Thomas Stüfe wrote:
>>>>>> Hi Ioi,
>>>>>>
>>>>>> I dimly remember proposals like this from the past. Main problem I see is
>>>>>> how large would you dimension the buffer, and what do you do if the buffer
>>>>>> cannot be drained rapidly enough. Discard log output? Hold? The former
>>>>>> sounds bad, the latter negates the advantages of such a buffer.
>>>>>>
>>>>>> Then, access to such a buffer would probably have to be synchronized,
>>>>>> whereas today AFAIK the log calls do not have to be.
>>>>>>
>>>>>> Cheers, Thomas
>>>>>>
>>>>>> On Mon 2. Nov 2020 at 22:18, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>
>>>>>>> For performance, maybe the implementation can log into a memory buffer,
>>>>>>> and use a worker thread to send the output over the network? That way we
>>>>>>> can minimize the overhead per log_xxx() call.
>>>>>>>
>>>>>>> I agree that using "-Xlog:foo=debug:network=xyz.com:1234" would be quite
>>>>>>> handy when you have lots of containers. You don't need to enable remote
>>>>>>> access to the container's file system just to get to the log file.
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>> On 11/2/20 11:10 AM, Kirk Pepperdine wrote:
>>>>>>>> Hi Thomas,
>>>>>>>>
>>>>>>>> I appreciate Yasumasa’s desire to be able to redirect UL output to
>>>>>>>> somewhere other than… I also appreciate that the highly granular nature of
>>>>>>>> how UL messages are currently structured can be and indeed is an issue.
>>>>>>>> That said, I’d also like the ability to push the data to somewhere other
>>>>>>>> than a file on disk.
>>>>>>>>
>>>>>>>> To the point of granularity, UL might benefit from some message
>>>>>>>> coarsening. This might also help with other logging-related performance
>>>>>>>> issues that I’ve noted here and there. Quite frankly, dealing with logs in
>>>>>>>> containers isn’t a wonderful experience. And while I firmly believe that
>>>>>>>> there is more that containers can do to ease this, being able to redirect
>>>>>>>> output to something other than a log file does feel like it would be
>>>>>>>> helpful. That said, I’m also concerned about the potential performance
>>>>>>>> impacts, but I think for the things that one would generally log, this
>>>>>>>> should be minimal.
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Kirk Pepperdine
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Nov 2, 2020, at 4:26 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> one problem I see is that this could introduce a surprising amount of lag
>>>>>>>>> into log() calls which do look inconspicuous, thereby distorting timing
>>>>>>>>> behavior or even creating timeout effects. We already have that problem now
>>>>>>>>> to some degree when logging to network shares.
>>>>>>>>>
>>>>>>>>> Another thing, log output can be very fine granular, which would create a
>>>>>>>>> lot of network traffic.
>>>>>>>>>
>>>>>>>>> Such an addition may also open some security questions.
>>>>>>>>>
>>>>>>>>> From a more philosophical standpoint, I like the "do one thing and do it
>>>>>>>>> right" Unix way and this seems more like something an outside tool should
>>>>>>>>> be doing. Which could also aggregate log output better. But I admit that
>>>>>>>>> argument is weak.
>>>>>>>>>
>>>>>>>>> Cheers, Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 2, 2020 at 12:21 PM Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Currently UL can only be output to stdout and/or a file. If we could
>>>>>>>>>> output it to a TCP socket, I think it would be useful.
>>>>>>>>>>
>>>>>>>>>> For example, some systems gather all logs into document-oriented databases
>>>>>>>>>> (e.g. Elasticsearch) and/or cloud monitoring platforms (e.g. CloudWatch).
>>>>>>>>>> If HotSpot could output UL to a TCP socket, we could send all logs to them
>>>>>>>>>> via a TCP input plugin (Fluentd, Logstash).
>>>>>>>>>>
>>>>>>>>>> I think it would be useful on container platforms. What do you think?
>>>>>>>>>> If it is worth working on, I will file a CSR and a JBS ticket, and will
>>>>>>>>>> also create a patch.
>>>>>>>>>>