logging fixes for last pre-ZBB JDK7-TL snapshot (6977677, 7016208, 7041595)

David Holmes David.Holmes at oracle.com
Mon May 16 20:52:51 PDT 2011


Daniel D. Daugherty said the following on 05/17/11 13:31:
> First, no apology is needed. I very much appreciate
> the catch on your part.
> 
> Second, it seems that the pendulum that is the Logging API
> continues to swing back and forth: We fix a deadlock... we
> introduce a race... we fix the race... we introduce a deadlock...
> lather, rinse, repeat...

Brings to mind my catch-phrase from the late 90's:

Concurrency! - Just Say NO!

:-)

> I'll file a bug tomorrow and have a test ready sometime
> tomorrow also. Sigh...
> 
> My gut says that this race is not that critical because
> in a properly functioning system one of the threads would
> get an IllegalArgumentException. Please let me know if
> I'm misunderstanding what you're saying here...

Right - the most likely thing to fail because of this is a test that 
wants to see the IllegalArgumentException. Even then the test would have 
to be deliberately attempting this via multiple threads - which seems 
unlikely. A basic test of the form:

     getLogger("A", "foo");
     try {
        getLogger("A", "bar");
        fail();
     }
     catch (IllegalArgumentException iae) {
        // success
     }

will be ok.

The only other thing, I just checked, is whether two threads racing to 
define the same logger will encounter a problem. Other than consuming 
time unnecessarily it appears that findResourceBundle is idempotent if 
called with the same bundle name. So this seems to be safe as well 
(famous last words)

David
-----


> Dan
> 
> 
> 
> On 5/16/2011 8:00 PM, David Holmes wrote:
>> Dan,
>>
>> I'm sorry this didn't come through in time ...
>>
>> Mandy Chung said the following on 05/17/11 05:22:
>>> Logger.java
>>>    Looks good.   The removal of the synchronization required by 
>>> Logger.getLogger fixes the deadlock issue.
>>
>> But this method now has a race condition:
>>
>>  372     // Synchronization is not required here. All synchronization for
>>  373     // adding a new Logger object is handled by 
>> LogManager.addLogger().
>>  374     public static Logger getLogger(String name, String 
>> resourceBundleName) {
>>  375         LogManager manager = LogManager.getLogManager();
>>  376         Logger result = manager.demandLogger(name);
>>  377         if (result.resourceBundleName == null) {
>>  378             // Note: we may get a MissingResourceException here.
>>  379             result.setupResourceInfo(resourceBundleName);
>>  380         } else if 
>> (!result.resourceBundleName.equals(resourceBundleName)) {
>>  381             throw new 
>> IllegalArgumentException(result.resourceBundleName +
>>  382                                 " != " + resourceBundleName);
>>  383         }
>>  384         return result;
>>  385     }
>>
>> Two threads calling this method with the same name but different 
>> resource bundle names can both see null at line 377 and install their 
>> different resource bundles. That section needs to be atomic.
>>
>> David
>> -----
>>
>>>
>>> LogManager.java
>>>     Retrying in a while loop is one way to fix weak ref / gc timing 
>>> issue.
>>>     What about if you refactor the addLogger method to add a new private
>>>     private method equivalent to addLogger (say add(Logger)) that 
>>> will return
>>>     logger if added successfully or null if not.
>>>
>>>     The addLogger method will return add(logger) != logger.  The 
>>> synchronization
>>>     will be done in the new add method.
>>>
>>>     Would this fix 7016208?    It seems to me that addLogger 
>>> returning boolean
>>>     is the cause for this bug.
>>>
>>> Mandy
>>>
>>> On 05/15/11 11:40, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> The final pre-ZBB JDK7-TL snapshot is happening @ 1700 PT on
>>>> Monday, May 14, 2011. This snapshot is targeted for JDK7-B143
>>>> or JDK7-B144 (I'm not sure which). Yes, we're still trying to
>>>> figure out how to merge our JDK & HotSpot Express processes
>>>> with Oracle processes. Please be patient while we iron out the
>>>> wrinkles...
>>>>
>>>> I have three bug fixes for the java.util.logging area:
>>>>
>>>>    6977677 3/2 Deadlock on logging subsystem initialization
>>>>    7016208 4/3 null sometimes returned by java.util.logging.Logger.
>>>>                getLogger(String name) in -server -Xcomp
>>>>    7041595 4/4 add lost test for 6487638
>>>>
>>>> 6977677 is a deadlock between java.util.logging.Logger.getLogger()
>>>> and LogManager.<clinit> via a PlatformLogger. This fix involves:
>>>>
>>>>    src/share/classes/java/util/logging/Logger.java
>>>>    test/java/util/logging/LoggingDeadlock4.java
>>>>
>>>> Mandy, I would like your review of the above bug fix.
>>>>
>>>> 7016208 is an unexpected null return from Logger.getLogger() due to
>>>> the lack of a strong reference. This fix involves:
>>>>
>>>>    src/share/classes/java/util/logging/LogManager.java
>>>>
>>>> Tom R., I would like your review of the above bug fix.
>>>>
>>>> 7041595 is just pushing a Logging deadlock test that got lost a
>>>> long time ago. This fix involves:
>>>>
>>>>    test/java/util/logging/LoggingDeadlock3.java
>>>>    test/java/util/logging/LoggingDeadlock3.props
>>>>
>>>> Because I'm fixing a deadlock in 6977677, I wanted to make sure
>>>> that this Logging deadlock test was back in the mix.
>>>>
>>>> Here is the webrev URL:
>>>>
>>>>    http://cr.openjdk.java.net/~dcubed/logging-batch-20110515-webrev/0/
>>>>
>>>> The comments that I added to the code changes should make the
>>>> reason(s) for the code changes pretty self explanatory.
>>>>
>>>> These changes have been run through JPRT and pass the "jdk_util"
>>>> tests on all platforms. I have also run SDK/JDK logging tests
>>>> and the VM/NSK logging tests on the following configs:
>>>>
>>>>    Solaris X86 * {Client VM, Server VM} product * {-Xmixed, -Xcomp}
>>>>    WinXP * {Client VM, Server VM} product * {-Xmixed, -Xcomp}
>>>>
>>>> Here is the summary from Solaris X86:
>>>>
>>>> Summary of Test Results (8 result dirs)
>>>> =========================================
>>>>    all executed: 2264  all passed: 2264  all ignored: 0  all failed: 0
>>>>    time: 0 hour(s) 17 minute(s)
>>>>
>>>> Here is the partial summary (6 of 8) from WinXP:
>>>>
>>>> Summary of Test Results (6 result dirs)
>>>> =========================================
>>>>    all executed: 1160  all passed: 1160  all ignored: 0  all failed: 0
>>>>    time: 1 hour(s) 0 minute(s)
>>>>
>>>> The last two VM/NSK test runs are still going; yes, WinXP is slower 
>>>> than
>>>> Solaris X86...
>>>>
>>>> Thanks, in advance, for any reviews.
>>>>
>>>> Dan
>>>>
>>>


More information about the hotspot-runtime-dev mailing list