Unified Logging for network

Wed Nov 4 11:12:18 UTC 2020

Hi Yasumasa,

On 2020-11-04 11:06, Yasumasa Suenaga wrote:
> Hi Robbin,
> 
> On 2020/11/04 17:12, Robbin Ehn wrote:
>> Hi, Yasumasa,
>>
>> On 2020-11-04 02:38, Yasumasa Suenaga wrote:
>>> As I said in my reply to Thomas [1], I don't want to dump UL to files
>>> because it would have to be a single log file (no rotation), so the
>>> file could become large.
>>> That might cause disk usage problems in a container.
>>
>> Sorry I don't follow, so I'm guessing here what you mean.
>>
>> The direct output from UL is only used by syslog, so something like:
>> "-Xlog:all=error:file=my_vm.%p.log::filecount=2,filesize=1m" should be
>> fine.
>> - Use syslog to collect from my_vm.*.log.*
> 
> IIUC we would need to set "my_vm.log.current" as the text source in
> syslog-ng in this solution.
> However, my_vm.log.current would be reopened when log rotation happens,
> so syslog-ng might not follow the current log.
> To follow all logs, we would have to disable log rotation, and the log
> file would become large.

https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.17/administration-guide/18

"The syslog-ng application notices if a file is renamed or replaced with 
a new file, so it can correctly follow the file even if logrotation is 
used."

If we do something weird so that this does not work, we should fix the
log rotation so that syslog can be used properly.
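
To illustrate what "following a file across rotation" means, here is a
minimal Python sketch. It is not syslog-ng's implementation; it only
assumes a POSIX filesystem and rotation by rename (the old file is moved
aside and a new file is created under the original name). The class name
RotationAwareTail and the file names are made up for the example.

```python
import os


class RotationAwareTail:
    """Hypothetical helper: polls a log file and reopens it after rotation."""

    def __init__(self, path):
        self.path = path
        self.fh = None
        self.inode = None

    def _open(self):
        # Binary mode keeps tell()/seek() simple and exact.
        self.fh = open(self.path, "rb")
        self.inode = os.fstat(self.fh.fileno()).st_ino

    def _drain(self):
        """Read all complete lines currently available on the open handle."""
        out = []
        while True:
            pos = self.fh.tell()
            line = self.fh.readline()
            if not line:
                break
            if not line.endswith(b"\n"):
                # Partial line: rewind and wait for the writer to finish it.
                self.fh.seek(pos)
                break
            out.append(line[:-1].decode("utf-8", errors="replace"))
        return out

    def poll(self):
        """Return the complete lines written since the previous poll."""
        if self.fh is None:
            if not os.path.exists(self.path):
                return []
            self._open()
        # The open handle still refers to the old file after a rename,
        # so whatever was written before rotation is read first.
        lines = self._drain()
        try:
            current = os.stat(self.path).st_ino
        except FileNotFoundError:
            return lines  # rotated away, new file not created yet
        if current != self.inode:
            # The path now names a different file: switch over to it.
            self.fh.close()
            self._open()
            lines.extend(self._drain())
        return lines

    def close(self):
        if self.fh is not None:
            self.fh.close()
            self.fh = None
```

A caller would invoke poll() periodically. Because the old handle is
drained before the inode check, lines written just before the rotation
are not lost, as long as the rotated file is not rotated a second time
between two polls.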

Thanks, Robbin

> 
> 
>> - Configure syslog to output both to that one large file you mentioned
>> and over the network to, e.g., your Elasticsearch.
>>
>> If this is not sufficient, then implementing syslog/Windows event log
>> output is a much better and simpler alternative, as also mentioned in
>> the JEP:
>> https://openjdk.java.net/jeps/158
>>
>> E.g.
>>
>> "-Xlog:all=error:syslog=my_vm"
> 
> JEP 158 mentions socket output as a backend under "Future possible
> extensions".
> I think this proposal is the same as that.
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Thanks, Robbin
>>
>>
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] 
>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043342.html 
>>>
>>>
>>>
>>>> /Robbin
>>>>
>>>> On 2020-11-03 09:09, Thomas Stüfe wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I don't argue that such a feature would not be useful. Of course it
>>>>> would!
>>>>>
>>>>> But as with any other added feature, it will come at the cost of
>>>>> complexity. It will have to be maintained, tests will have to be
>>>>> written and run. That increases technical debt for us all.
>>>>>
>>>>> That is not a reason not to do it, but a reason to think before doing
>>>>> it and to explore alternatives.
>>>>>
>>>>> -- 
>>>>>
>>>>> To me, the fact that a logging call now could possibly do Network IO
>>>>> fills me with deep unease. It violates the principle of least
>>>>> surprise. Logging should be as basic as possible, in order to be
>>>>> usable anywhere in code.
>>>>>
>>>>> - as had been said before, it would introduce unpredictable timing
>>>>> behavior. The fact that we have this already today is not a big
>>>>> consolation :(
>>>>>
>>>>> - similar to "the User should know what he does" argument -
>>>>> unfortunately many don't, so a balance has to be found to limit
>>>>> support from these cases
>>>>>
>>>>> - AFAICS we do not do network IO anywhere in the hotspot today. That
>>>>> coding would have to be written and tested. Reusing some other code -
>>>>> e.g. from the corelibs - is out of question for such a low level API,
>>>>> since you don't want to risk circularities.
>>>>>
>>>>> - But now we have a complete network stack below the innocuous logging
>>>>> call. This imposes further restrictions on where we can log - eg even
>>>>> if it were possible before, logging from signal handling is impossible
>>>>> now. Without these restrictions documented and tested anywhere. To me
>>>>> this makes UL more and more questionable, and I already tend to shun
>>>>> it when possible in favour of plain tty printing.
>>>>>
>>>>> I argued yesterday against Ioi's concurrent-log-draining, but that is
>>>>> actually more attractive the more I think about it.
>>>>>
>>>>> Only, could the same not be achieved with piping stdout/err to a
>>>>> separate tool like netcat, as Leo suggested?
>>>>>
>>>>> That solution exists today. If netcat does not do it for you, this
>>>>> could also be a separate utility - could be even part of the jdk.
>>>>> Conceptually this would be much the same as a separate thread printing
>>>>> out UL, with the pipe size being the buffer size. Or, communication
>>>>> could happen via shared memory...
>>>>>
>>>>> This would have two distinct advantages over doing network IO in UL:
>>>>> - we see the whole stderr output (e.g. output from the libc, or any
>>>>> third party tools)
>>>>> - we see output also from a VM which crashed and burned. E.g. any last
>>>>> words from hs-err reporting.
>>>>>
>>>>> Cheers, Thomas
>>>>>
>>>>>
>>>>> On Tue, Nov 3, 2020 at 3:12 AM Yasumasa Suenaga
>>>>> <suenaga at oss.nttdata.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I agree this proposal might cause performance issues. However, I
>>>>>> think that is the responsibility of the user.
>>>>>> If this proposal is implemented, I think logs would be sent to a
>>>>>> local log shipper process (fluentd, logstash on 127.0.0.1) in most
>>>>>> cases, because HotSpot does not emit logs as JSON. The log shipper
>>>>>> may also support message buffering and message queue persistence.
>>>>>> We can avoid (some of) the performance/reliability issues with a log
>>>>>> shipper.
>>>>>>
>>>>>> Even with the current implementation, performance issues occur when
>>>>>> the disk is very slow (e.g. storage is broken).
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/11/03 6:31, Thomas Stüfe wrote:
>>>>>>> Hi Ioi,
>>>>>>>
>>>>>>> I dimly remember proposals like this from the past. Main problem I
>>>>>>> see is how large would you dimension the buffer, and what do you do
>>>>>>> if the buffer cannot be drained rapidly enough. Discard log output?
>>>>>>> Hold? The former sounds bad, the latter negates the advantages of
>>>>>>> such a buffer.
>>>>>>>
>>>>>>> Then, access to such a buffer would probably have to be synchronized,
>>>>>>> whereas today AFAIK the log calls do not have to be.
>>>>>>>
>>>>>>> Cheers, Thomas
>>>>>>>
>>>>>>> On Mon 2. Nov 2020 at 22:18, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>
>>>>>>>> For performance, maybe the implementation can log into a memory
>>>>>>>> buffer, and use a worker thread to send the output over the network?
>>>>>>>> That way we can minimize the overhead per log_xxx() call.
>>>>>>>>
>>>>>>>> I agree that using "-Xlog:foo=debug:network=xyz.com:1234" would be
>>>>>>>> quite handy when you have lots of containers. You don't need to
>>>>>>>> enable remote access to the container's file system just to get to
>>>>>>>> the log file.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> - Ioi
>>>>>>>>
>>>>>>>> On 11/2/20 11:10 AM, Kirk Pepperdine wrote:
>>>>>>>>> Hi Thomas,
>>>>>>>>>
>>>>>>>>> I appreciate Yasumasa’s desire to be able to redirect UL output to
>>>>>>>>> somewhere other than… I also appreciate that the highly granular
>>>>>>>>> nature of how UL messages are currently structured can be, and
>>>>>>>>> indeed is, an issue. That said, I’d also like the ability to push
>>>>>>>>> the data somewhere other than a file on disk.
>>>>>>>>>
>>>>>>>>> To the point of granularity, UL might benefit from some message
>>>>>>>>> coarsening. This might also help with other logging-related
>>>>>>>>> performance issues that I’ve noted here and there. Quite frankly,
>>>>>>>>> dealing with logs in containers isn’t a wonderful experience. And
>>>>>>>>> while I firmly believe that there is more that containers can do to
>>>>>>>>> ease this, being able to redirect output to something other than a
>>>>>>>>> log file does feel like it would be helpful. That said, I’m also
>>>>>>>>> concerned about the potential performance impacts, but I think for
>>>>>>>>> the things that one would generally log, this should be minimal.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Kirk Pepperdine
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Nov 2, 2020, at 4:26 AM, Thomas Stüfe
>>>>>>>>>> <thomas.stuefe at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> one problem I see is that this could introduce a surprising amount
>>>>>>>>>> of lag into log() calls which do look inconspicuous, thereby
>>>>>>>>>> distorting timing behavior or even creating timeout effects. We
>>>>>>>>>> already have that problem now to some degree when logging to
>>>>>>>>>> network shares.
>>>>>>>>>>
>>>>>>>>>> Another thing, log output can be very fine-grained, which would
>>>>>>>>>> create a lot of network traffic.
>>>>>>>>>>
>>>>>>>>>> Such an addition may also open some security questions.
>>>>>>>>>>
>>>>>>>>>> From a more philosophical standpoint, I like the "do one thing and
>>>>>>>>>> do it right" Unix way and this seems more like something an
>>>>>>>>>> outside tool should be doing. Which could also aggregate log
>>>>>>>>>> output better. But I admit that argument is weak.
>>>>>>>>>>
>>>>>>>>>> Cheers, Thomas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 2, 2020 at 12:21 PM Yasumasa Suenaga
>>>>>>>>>> <suenaga at oss.nttdata.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Currently we have to output UL to stdout and/or a file. If we could
>>>>>>>>>>> output it to a TCP socket, I think it would be useful.
>>>>>>>>>>>
>>>>>>>>>>> For example, some systems gather all logs into document-oriented
>>>>>>>>>>> databases (e.g. Elasticsearch) and/or cloud monitoring platforms
>>>>>>>>>>> (e.g. CloudWatch). If HotSpot could output UL to a TCP socket, we
>>>>>>>>>>> could send all logs to them via a TCP input plugin (Fluentd,
>>>>>>>>>>> Logstash).
>>>>>>>>>>>
>>>>>>>>>>> I think it would be useful for container platforms. What do you
>>>>>>>>>>> think?
>>>>>>>>>>> If it is worth working on, I will file a CSR and a JBS ticket, and
>>>>>>>>>>> will also create a patch.
>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>