[OpenJDK 2D-Dev] [10] RFR: JDK-8188083- NPE in java.awt.image.FilteredImageSource.startProduction

Prahalad Kumar Narayanan prahalad.kumar.narayanan at oracle.com
Thu Nov 9 10:18:02 UTC 2017


Hello Everyone

First, Thanks to Phil & Sergey for their time in review & suggestions.

This is a follow-up on Phil's suggestion to investigate whether the fix proposed in webrev.00 could lead to deadlocks.
Link: http://cr.openjdk.java.net/~pnarayanan/8188083/webrev.00/

I inspected the code, and wrote code snippets to check the behavior. 
Inferences are as follows-
    1. FilteredImageSource helps in producing the image, applying an image filter on the pixels, and sharing the result to the ImageConsumer.
    2. The object holds- An ImageProducer, An ImageFilter, and instantiates a HashTable<ImageConsumer, ImageFilter> for its operation.

    3. The methods of FilteredImageSource are not supposed to be invoked directly from user-code. So how are these methods invoked?
          . startProduction:
                . This method is invoked when a user creates an Image like- Component.createImage(new FilteredImageSource(imgProducer, imgFilter));
                . Such creation could occur on any thread in user code and this internally invokes FilteredImageSource.startProduction through ToolkitImage.
                . The method updates the hash table as required and invokes ImageProducer.startProduction to produce the image. 
          . removeConsumer:
                . This method is invoked once ImageProducer is done with image preparation.
                . The exact thread in which the removeConsumer gets invoked depends on how ImageProducer prepares the image.

    4. Is there a possibility of a deadlock with the proposed fix ? 
          . In my inspection & observation with code snippets, I found that removeConsumer could be invoked in two ways- Multi-threaded & Single threaded.
          . Multi-threaded- 
                . This was observed with FileImageSource as ImageProducer
                . startProduction on this image producer requests image decode operation on a different thread- ImageFetcher thread.
                . Once decoding is complete, removeConsumer is invoked on the ImageFetcher thread.
                . Though startProduction & removeConsumer are invoked on different threads, there is no possibility of a deadlock as the code flow from startProduction returns immediately after posting the request to decode image onto ImageFetcher 's queue.
          . Single threaded-
                . This was observed with OffscreenImageSource as ImageProducer
                . startProduction and removeConsumer are invoked on the same thread. 
                . Since re-entrant threads acquire the locks that they already hold, this wouldn't result in a deadlock.
          . Thus, in both cases I don't see possibility of a deadlock. Hence, we can consider the change in webrev.00 safe to fix the issue.

    5. I have run JTreg test cases before and after the proposed change on java/awt/image/ and javax/swing (where ImageFilter is used).
         No new regressions were observed during the test.

Kindly review by assessing the above information.
If you believe, the fix could be approved, kindly review the CSR as well.

Thank you once again for your time
Have a good day

Prahalad

----- Original Message -----
From: Philip Race 
Sent: Thursday, October 26, 2017 9:30 PM
To: Prahalad Kumar Narayanan
Cc: Sergey Bylokhov; 2d-dev
Subject: Re: [OpenJDK 2D-Dev] [10] RFR: JDK-8188083- NPE in java.awt.image.FilteredImageSource.startProduction

The screenshot shows you directly calling this method in your test which
the documentation says you are not supposed to do.
So I am not able to be 100% sure that the test you have re-creates what
the submitter saw .. in his stack trace you have below it seems to be valid.

But to have a NPE at line 181 is odd.
 181             proxies.put(ic, imgf);

What is null ? proxies was just initialised in this same method so
I don't see how it can be null unless something elsewhere is
resetting it to null.

The only place I see that is removeConsumer
 138     public synchronized void removeConsumer(ImageConsumer ic) {
 139         if (proxies != null) {
 140             ImageFilter imgf =  proxies.get(ic);
 141             if (imgf != null) {
 142                 src.removeConsumer(imgf);
 143                 proxies.remove(ic);
 144                 if (proxies.isEmpty()) {
 145                     proxies = null;
 146                 }
 147             }
 148         }
 149     }

Now that method is synchronized .. OK so I think it might make sense
to mark the additional methods synchronized too but then I automatically
worry about introducing a deadlock.

I can't say for sure that you have done so however and likely there
are no nested locks here so it is probably OK but please run as many
tests as you can find to try to ensure that.

Adding or removing sychronized is binary compatible but since
these are public API methods you should still do a CSR
before pushing if this ends up being the approved fix.

-phil.



On 10/25/17, 11:58 PM, Prahalad Kumar Narayanan wrote: 
Hello Sergey

Thank you for the suggestion.

I 've updated JBS with below information.
   1 . Stack trace information as provided by submitter- 
        java.lang.NullPointerException 
            at java.awt.image.FilteredImageSource.startProduction(FilteredImageSource.java:181) 
            at java.awt.image.FilteredImageSource.startProduction(FilteredImageSource.java:183) 
            at sun.awt.image.ImageRepresentation.startProduction(ImageRepresentation.java:732) 
            at sun.awt.image.ToolkitImage.addWatcher(ToolkitImage.java:221) 
    2. Screenshot showing the occurrence of the exception at FilteredImageSource.java: 181. 
            . The image also shows the test passing with JDK10 containing the fix on VM running Linux Ubuntu x64.

Thank you
Have a good day

Prahalad N.

-----Original Message-----
From: Sergey Bylokhov 
Sent: Thursday, October 26, 2017 12:05 PM
To: Prahalad Kumar Narayanan; 2d-dev
Subject: Re: [OpenJDK 2D-Dev] [10] RFR: JDK-8188083- NPE in java.awt.image.FilteredImageSource.startProduction

Hi, Prahalad.
Can you please add a stack trace which include a line numbers to the bug report? Currently it is unclear in what line an exception is occurred.

On 25/10/2017 21:07, Prahalad Kumar Narayanan wrote:
Hello Everyone

Good day to you.

Kindly review a fix for the bug
     Bug ID: JDK-8188083
     Description: Null Pointer Exception in 
java.awt.image.FilteredImageSource.startProduction

Root Cause
     . FilteredImageSource implements ImageProducer interface
     . All the methods of FilteredImageSource operate on a common data -HashTable, but only a few are synchronized methods.
     . Thus, when synchronized & un-synchronized methods access / modify the hash table in a multi-threaded scenario, it renders the class vulnerable to this exception.

Exception occurrence
     . The submitter has mentioned that there is no test-case to reproduce to this issue.
     . Luckily I was able to observe this issue with a "crude" test code
           . The test triggered 7k threads in an ExecutorService randomly adding/ removing/ invoking startProduction on FilteredImageSource object.
           . I didn't feel it robust enough to be added as a part of regression tests. I tried optimizing the test but in vain.
           . Hence the test code isn't added to the webrev.

Details on the Fix:
     . The concerned methods have been "synchronized" in the fix.
     . No new regression failures were observed with this change.

Kindly review the change and provide your feedback.
In addition, kindly suggest whether this requires CSR as it adds "synchronized" to method signature.

Review Link: http://cr.openjdk.java.net/~pnarayanan/8188083/webrev.00/

Thank you for your time in review
Have a good day

Prahalad N.







--
Best regards, Sergey.


More information about the 2d-dev mailing list