[jdk17] RFR: 8269865: Async UL needs to handle ERANGE on exceeding SEM_VALUE_MAX

David Holmes david.holmes at oracle.com
Tue Jul 13 04:39:40 UTC 2021


On 13/07/2021 1:03 pm, David Holmes wrote:
> On 13/07/2021 11:41 am, Xin Liu wrote:
>> On Tue, 13 Jul 2021 01:18:28 GMT, David Holmes 
>> <david.holmes at oracle.com> wrote:
>>
>>> But we already circumvent that for async logging. We can't use async
>>> logging until after we have called AsyncLogWriter::initialize(). So the
>>> synchronization objects used by AsyncLogWriter can be plain
>>> Mutex/Monitor, they don't need to be Semaphore.
>>
>> That is not true. I deleted finalize() because it was complex and not 
>> necessary after we switched from Monitor to Semaphore.
>> A thread may emit logs even after it has deleted itself.
> 
> Ah - sorry. I was looking at the VM initialization issue not the thread 
> termination issue. Pity.

As @pchilano has pointed out to me, we can use the lower-level 
PlatformMonitor to avoid the problems with Monitor. It sits at a 
similarly low level to Semaphore.

Instead of having a binary semaphore to control access to the _buffer, 
and a counting semaphore to control the wakeup of the async writer 
thread, we can just use a PlatformMonitor with a _data_available state 
field, e.g. (rough outline):

void AsyncLogWriter::write() {
   AsyncLogBuffer logs;

   { // critical region
     AsyncLogLocker locker;

     _buffer.pop_all(&logs);
     // append meta-messages of dropped counters
     AsyncLogMapIterator dropped_counters_iter(logs);
     _stats.iterate(&dropped_counters_iter);
     _data_available = false;
   }
   ...
}

void AsyncLogWriter::run() {
   while (true) {
     {
       AsyncLogLocker locker;
       while (!_data_available) _lock.wait();
     }
     write();
   }
}

void AsyncLogWriter::enqueue_locked(const AsyncLogMessage& msg) {
...
   _buffer.push_back(msg);
   _data_available = true;
   _lock.notify();
}
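
For readers without a HotSpot tree, the same shape can be tried out as a
standalone sketch. This is not the proposed patch: std::mutex plus
std::condition_variable stand in for PlatformMonitor, and the class name
SketchLogWriter, the _stop flag, the _written vector and the stop() method
are all hypothetical additions so that the example can terminate and be
checked. The point it illustrates is that a boolean _data_available has no
count that could overflow, however many times enqueue() notifies.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for AsyncLogWriter; not HotSpot code.
class SketchLogWriter {
  std::mutex _lock;                    // plays the role of the PlatformMonitor
  std::condition_variable _cv;
  std::deque<std::string> _buffer;
  bool _data_available = false;        // state flag instead of a counting semaphore
  bool _stop = false;                  // assumption: added so the sketch can exit
  std::vector<std::string> _written;   // assumption: stands in for real log output
  std::thread _thread;                 // declared last so it starts after the rest

  void write() {
    std::deque<std::string> logs;
    { // critical region: take everything and reset the flag
      std::lock_guard<std::mutex> locker(_lock);
      logs.swap(_buffer);
      _data_available = false;
    }
    // The "output" happens outside the lock.
    for (auto& msg : logs) _written.push_back(std::move(msg));
  }

  void run() {
    while (true) {
      {
        std::unique_lock<std::mutex> locker(_lock);
        // The predicate form re-checks the flag after spurious wakeups.
        _cv.wait(locker, [this] { return _data_available || _stop; });
        if (!_data_available) return;  // woken by stop with nothing left to write
      }
      write();
    }
  }

 public:
  SketchLogWriter() : _thread(&SketchLogWriter::run, this) {}
  ~SketchLogWriter() { stop(); }

  void enqueue(const std::string& msg) {
    std::lock_guard<std::mutex> locker(_lock);
    _buffer.push_back(msg);
    _data_available = true;            // idempotent: nothing to overflow
    _cv.notify_one();
  }

  // Drains pending messages, joins the writer thread, returns what was written.
  std::vector<std::string> stop() {
    if (_thread.joinable()) {
      {
        std::lock_guard<std::mutex> locker(_lock);
        _stop = true;
      }
      _cv.notify_one();
      _thread.join();
    }
    return _written;
  }
};
```

Setting the flag and notifying while the lock is held means a wakeup
cannot be lost between the writer checking _data_available and going to
sleep, which is what the counting semaphore was providing before.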

David
-----

>> I reviewed this line. 
>> https://github.com/apple/darwin-xnu/blob/main/osfmk/kern/sync_sema.c#L400
>> In my understanding, `semaphore_signal` will try to wake up one in the 
>> wait queue of the semaphore. If kr != KERN_SUCCESS, it will reset 
>> counter to 0 because no one is waiting on that. In our case, we should 
>> take this route.  I think the behavior is defined.
> 
> Sorry but there is no way I'm going to trust any ad-hoc guesses about 
> what exactly that code will do. I don't know any of the invariants of 
> the data structure, exactly what the different fields represent, or how 
> exactly they are maintained. Heck we don't even know that really is the 
> code involved!
> 
> I will continue to think about this problem, but I'm not seeing any 
> solution I like so far.
> 
> Thanks,
> David
> -----
> 
> 
>>
>> thanks,
>> --lx
>>
>> -------------
>>
>> PR: https://git.openjdk.java.net/jdk17/pull/216
>>


More information about the hotspot-runtime-dev mailing list