RFR: 8292989: Avoid dynamic memory in AsyncLogWriter [v2]

Tue Sep 6 16:33:07 UTC 2022

On Fri, 2 Sep 2022 22:35:41 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> Current implementation of AsyncLogWriter uses dynamic memory. There are 2 sources. 
>> 
>> 1. Overhead of pointer-based linked-list. 
>> 2. strdup of message contents
>> 
>> This implementation has impact on glibc/malloc. If allocation of logsites interleave with other allocation, it's hard to clean up all glibc arenas. This worsens fragmentation issue.
>> 
>> In this PR, we replace linked-list with `2 pre-allocated raw buffers`. AsyncLogWriter appends payload AsyncLogMessage to the serving buffer and avoids all dynamic allocation. Please note this effort won't eliminate mutex lock. We use ping-pong buffers to guarantee AsyncLogWriter is still non-blocking. A buffer serves as a FIFO queue like before. 
>> 
>> In addition, AsyncLogWriter doesn't enqueue meta messages anymore when it needs to report the number of discarded messages. This is archived using a temporary hashtable called `snapshot`. It copies the working hashtable with the protection of lock and reset it. After writing all regular messages, AsyncLogWriter writes meta messages from snapshot.
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update according to feedbacks.

Hi Xin,

This is good, and goes into the right direction.

General remark:

Since you allocate the buffers up-front, you really don't have to handle allocation failures. Just consider: 

- In the vast majority of cases the user will not modify `AsyncLogBufferSize`, and if we fail to malloc 4 MB at VM start we have a much larger problem.
- And *if* the user specifies `AsyncLogBufferSize`, but with such an outrageously large buffer that it fails to allocate, it is also better to fail. Because the user had a good reason to specify `-Xlog:async`, and you don't want to quietly fall back to synchronous logging. Also, maybe the user made just a stupid argument mistake, then again, you don't want to quielty swallow that error.

If you remove the allocation failure handling, you can simplify quite a bit, e.g. also get rid of `Buffer::is_valid()`, `AsyncLogWriter::Buffer::~Buffer()`, etc.

More remarks inline.

Cheers, Thomas

src/hotspot/share/logging/logAsyncWriter.cpp line 75:

> 73:     p->set_message(dest);
> 74:     _pos += sz;
> 75:     return true;

That's pretty hacky... :-) And `set_message` only exists because you need to repair the message pointer here, right?

Proposal:

Make AsyncLogMessage self-contained and variable-sized by letting the message follow the header. Like this:

class AsyncLogMessage {
  LogFileStreamOutput* _output;
  const LogDecorations _decorations;
  const size_t _len; // optional, or could also be 32- maybe even 16 bit
public:
  size_t len_in_bytes() const { return align_up(sizeof(AsyncLogMessage) + _len); }
  const char* message() { return (const char*) (this + 1); }
};

Then give it a placement new() and place it inside the Buffer right away instead of creating it on the stack and then copying it to the buffer. Just an idea.

src/hotspot/share/logging/logAsyncWriter.hpp line 50:

> 48:     constexpr size_t size = align_up(sizeof(*this), sizeof(void*));
> 49:     return _message != nullptr ? align_up(size + strlen(_message) + 1, sizeof(void*)): size;
> 50:   }

See my proposal. I would cache the strlen of the message in the header as the last member. You can use a smaller type too (Idk if we ever print out more than 64K in one log line). Since the string does not have to be aligned, you can save some bytes that way.

src/hotspot/share/logging/logAsyncWriter.hpp line 115:

> 113:       size_t _curr;
> 114: 
> 115:       void* raw_ptr() const {

Why a raw pointer? Why not `AsyncLogMessage*`? Iterator should never *not* point to a valid `AsyncLogMessage*`, unless we are at the end of the buffer.

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.org/jdk/pull/10092