[Bug 1743] New: Intermittant deadlock in PluginRequestProcessor

Mon Apr 21 23:23:44 UTC 2014

http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=1743

            Bug ID: 1743
           Summary: Intermittant deadlock in PluginRequestProcessor
           Product: IcedTea-Web
           Version: 1.4
          Hardware: x86_64
                OS: OpenBSD
            Status: NEW
          Severity: critical
          Priority: P5
         Component: Plugin
          Assignee: dbhole at redhat.com
          Reporter: kurt at intricatesoftware.com
                CC: unassigned at icedtea.classpath.org

Created attachment 1073
  --> http://icedtea.classpath.org/bugzilla/attachment.cgi?id=1073&action=edit
Deadlock and global variable fix -version 1

icedtea-web deadlocks intermittantly on OpenBSD. This problem has been debugged
and a patch created to resolve the problem. The deadlock stems from an improper
use of a condition variable broadcast and wait managing the message queue.

Message queing looks like this:

            // Update queue synchronously
            pthread_mutex_lock(&message_queue_mutex);
            message_queue->push_back(message_parts);
            pthread_mutex_unlock(&message_queue_mutex);

            // Broadcast that a message is now available
            pthread_cond_broadcast(&cond_message_available);

Message dequeing looks like this:

        pthread_mutex_lock(&message_queue_mutex);
        if (message_queue->size() > 0)
        {
            message_parts = message_queue->front();
            message_queue->erase(message_queue->begin());
        }
        pthread_mutex_unlock(&message_queue_mutex);

        if (message_parts)
        {
<snip>
        } else
        {
            pthread_mutex_lock(&wait_mutex);
            pthread_cond_wait(&cond_message_available, &wait_mutex);
            pthread_mutex_unlock(&wait_mutex);
        }

When the message queue goes to zero, the dequeing threads can enter the
cond_wait state but the queing thread race in and add a message to the queue
which the dequing thread misses. In this case there will be a message on the
queue but the dequing threads are blocked waiting for another message. If
another message doesn't appear, the dequing threads will remain blocked
forever.

Note that condition variable wait uses a completely unrealated mutex for its
wait and it doesn't check that the queue is realy at zero length prior to going
into wait. In order to avoid missed wakeups, or waiting when there is realy
something on the queue, the following changes are needed:

1) the condition variable must use the same mutex that is used to queue
messages to the queue
2) while holding the message_queue_mutex, check that the queue is indeed empty
prior to calling pthread_cond_wait()
3) When the producer queues an item to the queue, while holding the
message_queue_mutex locked, call 
pthread_cond_broadcast(&cond_message_available) or even better 
pthread_cond_signal(&cond_message_available) to avoid the thundering herd
problem.

The minimal fix for the deadlocks should look like this:

Message queueing:

            // Update queue synchronously
            pthread_mutex_lock(&message_queue_mutex);
            message_queue->push_back(message_parts);
            pthread_cond_signal(&cond_message_available);
            pthread_mutex_unlock(&message_queue_mutex);

Message dequeing:

<snip>
        if (message_parts)
        {
<snip>
        } else
        {
            pthread_mutex_lock(&message_queue_mutex);
            if (message_queue->size() == 0)
            {
                pthread_cleanup_push(queue_wait_cleanup,
&processor->message_queue_mutex);
                pthread_cond_wait(&cond_message_available,
&processor->message_queue_mutex);
                pthread_cleanup_pop(0);
            }
            pthread_mutex_unlock(&message_queue_mutex);
        }

Plus an additional static function:

static void
queue_wait_cleanup(void* data)
{
    pthread_mutex_unlock((pthread_mutex_t*) data);
}

There are some other related issues with the PluginRequestProcessor and
queue_processor(); it mixes global locks and condition variables with a single
instance PluginRequestProcessor. That is not a great design choice and would be
more robust if the PluginRequestProcessor class contanined the mutexes and
condition variables that it requires. For example, the current design could
never allow for a second instance of PluginRequestProcessor to exist as it
would reinit and destroy the global mutexes and condition variables the first
instance created. Even though two instances will never be created, trying to
understand this code required more work to confirm this.

I have created a patch which incorprorates both the minimal deadlock fix and
the elminiation of the global variables used by the PluginRequestProcessor.
This patch is attached. It could be improved somewhat by making
message_queue_mutex, cond_message_available, message_queue, and syn_write_mutex
private and adjusting queue_processor() to call a new public function
queueProcessor() that accesses the private variables.

I would be happy to submit another version that performs that cleanup as well.
Please let me know.

Regards,
-Kurt

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20140421/23c1c7d1/attachment.html>