[OpenJDK 2D-Dev] <AWT Dev> RFR: 8240654 : ZGC can cause severe UI application repaint issues

Philip Race philip.race at oracle.com
Thu Jun 11 17:48:18 UTC 2020


Updated webrev here

http://cr.openjdk.java.net/~prr/8240654.1/index.html

The only changes are to the tests - to add -Dsun.java2d.uiScale=1 to the 
onscreen test
and to add printer to the keys for the printing test.

It was pointed out by Stefan from the ZGC team that the changes in 
awt_TrayIcon.cpp and awt_Cursor.cpp
should not be needed because the GDI code in Create_BMP that ultimately 
consumes the data
has processed and copied it into memory allocated by CreateDIBSection 
before passing it to
CreateBitmap. I considered reverting those two files but decided to keep 
them because I
think I would like this fix anyway. We really don't need to lock down 
the VM in these cases.

-phil.


On 6/11/20, 9:55 AM, Philip Race wrote:
> I have confirmed hit a different code path. It goes through generic 2D 
> s/w loops in this case.
> ie we don't use GDIBlitLoops at all. The code in 
> sun/java2d/pipe/DrawImage.java ends up in
> scaleSurfaceData which uses the loops in ScaledBlit.c.
>
> It is a bit surprising to me since I'd expect us to be able to blit 
> directly at device resolution.
> Could we also be taking a performance hit here ? The D3D case doesn't 
> not go through this loop.
>
> However all that is outside the scope of this fix ... I think setting 
> uiScale=1 in the test is all that needs to be done.
>
> -phil.
>
> On 6/11/2020 7:51 AM, Philip Race wrote:
>> Or, maybe we hit a different code path. I'll check that.
>> uiScale=1 is the way to ensure we hit this code path.
>>
>> -phil.
>>
>> On 6/11/20, 7:44 AM, Philip Race wrote:
>>> Can I get clarification here.
>>>
>>> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see 
>>> the failure with LargeWindowPaintTest.
>>>
>>> So you both mean a JDK 15 promoted build without this fix and 
>>> without this property passes because you have
>>> a hidpi setup. And to see the failure without the fix you needed the 
>>> above property.
>>> If so we could just be looking at a similar anomaly as I saw with 
>>> printing which uses a very large
>>> image - it reported failure but actually worked !
>>>
>>> Also - for both of you - with the fix and without forcing uiScale=1 
>>> does the test pass ?
>>>
>>> -phil.
>>>
>>> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>>>> Yes my machine was at 150% scaling.
>>>>
>>>> If I force uiScale = 1, I see that:
>>>> LargeWindowPaintTest fails without patch and passes with patch.
>>>> AlphaPrintTest shows instructions without patch also.
>>>>
>>>> @Phil : I think its better if we test at uiScale=1(larger memory 
>>>> footprint). Please clarify.
>>>>
>>>> Thanks,
>>>> Jay
>>>>
>>>>> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth 
>>>>> <kevin.rushforth at oracle.com <mailto:kevin.rushforth at oracle.com>> 
>>>>> wrote:
>>>>>
>>>>> Do you have a Hi-DPI machine? I do, and had to run with 
>>>>> "-Dsun.java2d.uiScale=1" in order to see the failure with 
>>>>> LargeWindowPaintTest.
>>>>>
>>>>> For AlphaPrintTest, the test deliberately ensures that you print 
>>>>> before saying whether it passes or not. FWIW, I verified that the 
>>>>> printing test on my system was hitting the fallback code with the 
>>>>> patch, but it seemed to print correctly even without the patch.
>>>>>
>>>>> -- Kevin
>>>>>
>>>>>
>>>>> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>>>>>> Typo : I tried tested -> I tried testing
>>>>>>
>>>>>>> On 11-Jun-2020, at 2:27 PM, Jayathirth D v 
>>>>>>> <JAYATHIRTH.D.V at ORACLE.COM <mailto:JAYATHIRTH.D.V at ORACLE.COM>> 
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Phil,
>>>>>>>
>>>>>>> I tried tested the fix in my Windows 10 machine with Intel 
>>>>>>> integrated UHD Graphics 620.
>>>>>>>
>>>>>>> LargeWindowPaintTest.java passes with/without fix in my machine.
>>>>>>> AlphaPrintTest.java without fix just opens up blank frame 
>>>>>>> without any instructions and with fix it shows instructions for 
>>>>>>> the test.
>>>>>>> Is this expected behaviour?
>>>>>>>
>>>>>>> AlphaPrintTest.java with fix when it shows instructions if I 
>>>>>>> click on Pass(Since I don’t have printer right now) it doesn’t 
>>>>>>> pass/close the window. Only after I click on Print button and 
>>>>>>> then close print dialog it allows me to click on Pass button.
>>>>>>>
>>>>>>> Also how does these tests behave in our internal CI machines?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jay
>>>>>>>
>>>>>>>> On 11-Jun-2020, at 2:18 AM, Philip Race <philip.race at oracle.com 
>>>>>>>> <mailto:philip.race at oracle.com>> wrote:
>>>>>>>>
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
>>>>>>>> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>>>>>>>>
>>>>>>>> This is for JDK 15 so review ASAP please since RDP 1 and the 
>>>>>>>> test cycle are looming.
>>>>>>>>
>>>>>>>> This is not a fix for a JDK bug. It is a bunch of workarounds 
>>>>>>>> for a Microsoft Windows bug affecting
>>>>>>>> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
>>>>>>>> Some extra details about the Windows bug at the end, but first 
>>>>>>>> the technical details of the fix.
>>>>>>>>
>>>>>>>> With ZGC's memory allocation requirement of reserving memory in 
>>>>>>>> 2Mb chunks  some Windows GDI
>>>>>>>> functions, mostly involving some bitmaps APIs may return a 
>>>>>>>> failure code (ie fail!)
>>>>>>>> This typically occurs when Java heap memory is used for a Java 
>>>>>>>> image and then in a JNI
>>>>>>>> call we use GetPrimitiveArrayCritical so that Java heap 
>>>>>>>> allocated memory is passed to a GDI
>>>>>>>> function AND the Java heap memory spans one of the 2Mb boundaries.
>>>>>>>> This is very easy to trigger in almost any Java UI app if the 
>>>>>>>> window is of a large enough (ie typical) size.
>>>>>>>> NB: if you have an Nvidia or ATI card, then you won't see it, 
>>>>>>>> because the D3D pipeline doesn't
>>>>>>>> call the affected method but if you have an Intel chip as do 
>>>>>>>> 90% (?) of laptops you will see it.
>>>>>>>> There are also several other places we found that are affected. 
>>>>>>>> Printing is the other one
>>>>>>>> somewhat easy to trigger. The others : custom cursors and tray 
>>>>>>>> icons are less common.
>>>>>>>> The painful thing here is that there is no definitive list (a 
>>>>>>>> list of the known ones is below) of
>>>>>>>> affected Windows GDI APIs and we are just hunting around our 
>>>>>>>> code trying to see where it
>>>>>>>> might be side-swiped by this bug.
>>>>>>>>
>>>>>>>> The basic approach in these workarounds is that for cases where 
>>>>>>>> performance does not matter we now copy
>>>>>>>> and for cases where performance does matter or larger amounts 
>>>>>>>> of memory is involved we check if
>>>>>>>> the return value of the GDI function indicates failure and then 
>>>>>>>> re-try with a copy of the heap memory.
>>>>>>>> Unless GDI was randomly failing already (unlikely) this should 
>>>>>>>> be a no-risk solution in the high profile cases.
>>>>>>>> We have done performance measurements on the important screen 
>>>>>>>> case and the failures
>>>>>>>> happen fast so the penalty is then in the re-try which is only 
>>>>>>>> if ZGC is enabled.
>>>>>>>> Always copying the memory is slower (and memcpy is the slow 
>>>>>>>> operation) than an alternative approach
>>>>>>>> that "knows" about the memory allocation of ZGC but this 
>>>>>>>> coupling and the complexity seem like they aren't
>>>>>>>> worth it since I haven't seen any visible performance 
>>>>>>>> consequence. That can be revisited
>>>>>>>> some day if need be, but for now we have correctness which is 
>>>>>>>> the key as well as sufficient performance.
>>>>>>>>
>>>>>>>> I've created an automated test for the most important on-screen 
>>>>>>>> case.
>>>>>>>> Also a manual printing test case which invokes ZGC is provided 
>>>>>>>> since there we also only
>>>>>>>> conditionally copy. In the other cases we now always copy so 
>>>>>>>> existing test cases should over those.
>>>>>>>>
>>>>>>>> There is some clean up in this fix - one completely unused  
>>>>>>>> (provably so because it was #if'd out)
>>>>>>>> JNI method in awt_PrintJob.cpp is removed since it had code 
>>>>>>>> that looked like it needed a workaround,
>>>>>>>> which would be somewhat of a waste of effort.
>>>>>>>>
>>>>>>>> the doPrintBand code and its callee bitsToDevice has code I 
>>>>>>>> think we can remove too since
>>>>>>>> I don't see how it ever gets executed (the top down case for 
>>>>>>>> browserPrint == true) but
>>>>>>>> I think I'll save that for a P4 follow-on since it does nothing 
>>>>>>>> that would be affected by this
>>>>>>>> Windows bug.
>>>>>>>>
>>>>>>>> One oddity is the in the printing case I observed that some 
>>>>>>>> times the rendering is performed
>>>>>>>> even if an error code is returned. I don't know why, but in 
>>>>>>>> code we can't tell that it was actually
>>>>>>>> rendered and in any case there is no harm in repeating the call 
>>>>>>>> with copied memory.
>>>>>>>>
>>>>>>>> We are right before the JDK15 stabilisation fork and this fix 
>>>>>>>> needs to go there and will
>>>>>>>> but the webrev is against jdk/client simply because jdk15 does 
>>>>>>>> not exist yet !
>>>>>>>>
>>>>>>>> Please test and review ASAP.
>>>>>>>>
>>>>>>>> About the bug:
>>>>>>>> Microsoft has acknowleged the bug and will publish a knowledge 
>>>>>>>> base article about it
>>>>>>>> but a fix may show up only in a future version of Windows. Not, 
>>>>>>>> it seems, any time soon.
>>>>>>>> Below is a list of potentially affected GDI APIs. Per microsoft 
>>>>>>>> whether it actually manifests in
>>>>>>>> any specific case depends on "branching"
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>>>>>>>>
>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>>>>>>>>
>>>>>>>>
>>>>>>>> -phil.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/2d-dev/attachments/20200611/c68fd493/attachment-0001.htm>


More information about the 2d-dev mailing list