[OpenJDK 2D-Dev] <AWT Dev> RFR: 8240654 : ZGC can cause severe UI application repaint issues

Philip Race philip.race at oracle.com
Thu Jun 11 18:46:01 UTC 2020


one more update to the tests
cr.openjdk.java.net/~prr/8240654.2/

I added
@requires vm.gc.Z per the VM folks, ZGC needs Windows Server 2019 or the 
same vintage Window 10. if run on Windows Server 2016
ZGC errors out. This should prevent that. -phil.
On 6/11/2020 11:13 AM, Kevin Rushforth wrote:
> +1
>
> The updated LargeWindowPaintTest fails for me without the fix without 
> my having to manually set uiScale. It passes with the fix.
>
> Interestingly enough, I finally saw the problem that Jay reported with 
> AlphaPrintTest: without the fix I initially get a blank (all white) 
> window. If I resize it then it is drawn. With the fix everything is fine.
>
> -- Kevin
>
> On 6/11/2020 10:48 AM, Philip Race wrote:
>> Updated webrev here
>>
>> http://cr.openjdk.java.net/~prr/8240654.1/index.html
>>
>> The only changes are to the tests - to add -Dsun.java2d.uiScale=1 to 
>> the onscreen test
>> and to add printer to the keys for the printing test.
>>
>> It was pointed out by Stefan from the ZGC team that the changes in 
>> awt_TrayIcon.cpp and awt_Cursor.cpp
>> should not be needed because the GDI code in Create_BMP that 
>> ultimately consumes the data
>> has processed and copied it into memory allocated by CreateDIBSection 
>> before passing it to
>> CreateBitmap. I considered reverting those two files but decided to 
>> keep them because I
>> think I would like this fix anyway. We really don't need to lock down 
>> the VM in these cases.
>>
>> -phil.
>>
>>
>> On 6/11/20, 9:55 AM, Philip Race wrote:
>>> I have confirmed hit a different code path. It goes through generic 
>>> 2D s/w loops in this case.
>>> ie we don't use GDIBlitLoops at all. The code in 
>>> sun/java2d/pipe/DrawImage.java ends up in
>>> scaleSurfaceData which uses the loops in ScaledBlit.c.
>>>
>>> It is a bit surprising to me since I'd expect us to be able to blit 
>>> directly at device resolution.
>>> Could we also be taking a performance hit here ? The D3D case 
>>> doesn't not go through this loop.
>>>
>>> However all that is outside the scope of this fix ... I think 
>>> setting uiScale=1 in the test is all that needs to be done.
>>>
>>> -phil.
>>>
>>> On 6/11/2020 7:51 AM, Philip Race wrote:
>>>> Or, maybe we hit a different code path. I'll check that.
>>>> uiScale=1 is the way to ensure we hit this code path.
>>>>
>>>> -phil.
>>>>
>>>> On 6/11/20, 7:44 AM, Philip Race wrote:
>>>>> Can I get clarification here.
>>>>>
>>>>> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to 
>>>>> see the failure with LargeWindowPaintTest.
>>>>>
>>>>> So you both mean a JDK 15 promoted build without this fix and 
>>>>> without this property passes because you have
>>>>> a hidpi setup. And to see the failure without the fix you needed 
>>>>> the above property.
>>>>> If so we could just be looking at a similar anomaly as I saw with 
>>>>> printing which uses a very large
>>>>> image - it reported failure but actually worked !
>>>>>
>>>>> Also - for both of you - with the fix and without forcing 
>>>>> uiScale=1 does the test pass ?
>>>>>
>>>>> -phil.
>>>>>
>>>>> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>>>>>> Yes my machine was at 150% scaling.
>>>>>>
>>>>>> If I force uiScale = 1, I see that:
>>>>>> LargeWindowPaintTest fails without patch and passes with patch.
>>>>>> AlphaPrintTest shows instructions without patch also.
>>>>>>
>>>>>> @Phil : I think its better if we test at uiScale=1(larger memory 
>>>>>> footprint). Please clarify.
>>>>>>
>>>>>> Thanks,
>>>>>> Jay
>>>>>>
>>>>>>> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth 
>>>>>>> <kevin.rushforth at oracle.com <mailto:kevin.rushforth at oracle.com>> 
>>>>>>> wrote:
>>>>>>>
>>>>>>> Do you have a Hi-DPI machine? I do, and had to run with 
>>>>>>> "-Dsun.java2d.uiScale=1" in order to see the failure with 
>>>>>>> LargeWindowPaintTest.
>>>>>>>
>>>>>>> For AlphaPrintTest, the test deliberately ensures that you print 
>>>>>>> before saying whether it passes or not. FWIW, I verified that 
>>>>>>> the printing test on my system was hitting the fallback code 
>>>>>>> with the patch, but it seemed to print correctly even without 
>>>>>>> the patch.
>>>>>>>
>>>>>>> -- Kevin
>>>>>>>
>>>>>>>
>>>>>>> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>>>>>>>> Typo : I tried tested -> I tried testing
>>>>>>>>
>>>>>>>>> On 11-Jun-2020, at 2:27 PM, Jayathirth D v 
>>>>>>>>> <JAYATHIRTH.D.V at ORACLE.COM <mailto:JAYATHIRTH.D.V at ORACLE.COM>> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Phil,
>>>>>>>>>
>>>>>>>>> I tried tested the fix in my Windows 10 machine with Intel 
>>>>>>>>> integrated UHD Graphics 620.
>>>>>>>>>
>>>>>>>>> LargeWindowPaintTest.java passes with/without fix in my machine.
>>>>>>>>> AlphaPrintTest.java without fix just opens up blank frame 
>>>>>>>>> without any instructions and with fix it shows instructions 
>>>>>>>>> for the test.
>>>>>>>>> Is this expected behaviour?
>>>>>>>>>
>>>>>>>>> AlphaPrintTest.java with fix when it shows instructions if I 
>>>>>>>>> click on Pass(Since I don’t have printer right now) it doesn’t 
>>>>>>>>> pass/close the window. Only after I click on Print button and 
>>>>>>>>> then close print dialog it allows me to click on Pass button.
>>>>>>>>>
>>>>>>>>> Also how does these tests behave in our internal CI machines?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jay
>>>>>>>>>
>>>>>>>>>> On 11-Jun-2020, at 2:18 AM, Philip Race 
>>>>>>>>>> <philip.race at oracle.com <mailto:philip.race at oracle.com>> wrote:
>>>>>>>>>>
>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>>>>>>>>>>
>>>>>>>>>> This is for JDK 15 so review ASAP please since RDP 1 and the 
>>>>>>>>>> test cycle are looming.
>>>>>>>>>>
>>>>>>>>>> This is not a fix for a JDK bug. It is a bunch of workarounds 
>>>>>>>>>> for a Microsoft Windows bug affecting
>>>>>>>>>> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
>>>>>>>>>> Some extra details about the Windows bug at the end, but 
>>>>>>>>>> first the technical details of the fix.
>>>>>>>>>>
>>>>>>>>>> With ZGC's memory allocation requirement of reserving memory 
>>>>>>>>>> in 2Mb chunks  some Windows GDI
>>>>>>>>>> functions, mostly involving some bitmaps APIs may return a 
>>>>>>>>>> failure code (ie fail!)
>>>>>>>>>> This typically occurs when Java heap memory is used for a 
>>>>>>>>>> Java image and then in a JNI
>>>>>>>>>> call we use GetPrimitiveArrayCritical so that Java heap 
>>>>>>>>>> allocated memory is passed to a GDI
>>>>>>>>>> function AND the Java heap memory spans one of the 2Mb 
>>>>>>>>>> boundaries.
>>>>>>>>>> This is very easy to trigger in almost any Java UI app if the 
>>>>>>>>>> window is of a large enough (ie typical) size.
>>>>>>>>>> NB: if you have an Nvidia or ATI card, then you won't see it, 
>>>>>>>>>> because the D3D pipeline doesn't
>>>>>>>>>> call the affected method but if you have an Intel chip as do 
>>>>>>>>>> 90% (?) of laptops you will see it.
>>>>>>>>>> There are also several other places we found that are 
>>>>>>>>>> affected. Printing is the other one
>>>>>>>>>> somewhat easy to trigger. The others : custom cursors and 
>>>>>>>>>> tray icons are less common.
>>>>>>>>>> The painful thing here is that there is no definitive list (a 
>>>>>>>>>> list of the known ones is below) of
>>>>>>>>>> affected Windows GDI APIs and we are just hunting around our 
>>>>>>>>>> code trying to see where it
>>>>>>>>>> might be side-swiped by this bug.
>>>>>>>>>>
>>>>>>>>>> The basic approach in these workarounds is that for cases 
>>>>>>>>>> where performance does not matter we now copy
>>>>>>>>>> and for cases where performance does matter or larger amounts 
>>>>>>>>>> of memory is involved we check if
>>>>>>>>>> the return value of the GDI function indicates failure and 
>>>>>>>>>> then re-try with a copy of the heap memory.
>>>>>>>>>> Unless GDI was randomly failing already (unlikely) this 
>>>>>>>>>> should be a no-risk solution in the high profile cases.
>>>>>>>>>> We have done performance measurements on the important screen 
>>>>>>>>>> case and the failures
>>>>>>>>>> happen fast so the penalty is then in the re-try which is 
>>>>>>>>>> only if ZGC is enabled.
>>>>>>>>>> Always copying the memory is slower (and memcpy is the slow 
>>>>>>>>>> operation) than an alternative approach
>>>>>>>>>> that "knows" about the memory allocation of ZGC but this 
>>>>>>>>>> coupling and the complexity seem like they aren't
>>>>>>>>>> worth it since I haven't seen any visible performance 
>>>>>>>>>> consequence. That can be revisited
>>>>>>>>>> some day if need be, but for now we have correctness which is 
>>>>>>>>>> the key as well as sufficient performance.
>>>>>>>>>>
>>>>>>>>>> I've created an automated test for the most important 
>>>>>>>>>> on-screen case.
>>>>>>>>>> Also a manual printing test case which invokes ZGC is 
>>>>>>>>>> provided since there we also only
>>>>>>>>>> conditionally copy. In the other cases we now always copy so 
>>>>>>>>>> existing test cases should over those.
>>>>>>>>>>
>>>>>>>>>> There is some clean up in this fix - one completely unused  
>>>>>>>>>> (provably so because it was #if'd out)
>>>>>>>>>> JNI method in awt_PrintJob.cpp is removed since it had code 
>>>>>>>>>> that looked like it needed a workaround,
>>>>>>>>>> which would be somewhat of a waste of effort.
>>>>>>>>>>
>>>>>>>>>> the doPrintBand code and its callee bitsToDevice has code I 
>>>>>>>>>> think we can remove too since
>>>>>>>>>> I don't see how it ever gets executed (the top down case for 
>>>>>>>>>> browserPrint == true) but
>>>>>>>>>> I think I'll save that for a P4 follow-on since it does 
>>>>>>>>>> nothing that would be affected by this
>>>>>>>>>> Windows bug.
>>>>>>>>>>
>>>>>>>>>> One oddity is the in the printing case I observed that some 
>>>>>>>>>> times the rendering is performed
>>>>>>>>>> even if an error code is returned. I don't know why, but in 
>>>>>>>>>> code we can't tell that it was actually
>>>>>>>>>> rendered and in any case there is no harm in repeating the 
>>>>>>>>>> call with copied memory.
>>>>>>>>>>
>>>>>>>>>> We are right before the JDK15 stabilisation fork and this fix 
>>>>>>>>>> needs to go there and will
>>>>>>>>>> but the webrev is against jdk/client simply because jdk15 
>>>>>>>>>> does not exist yet !
>>>>>>>>>>
>>>>>>>>>> Please test and review ASAP.
>>>>>>>>>>
>>>>>>>>>> About the bug:
>>>>>>>>>> Microsoft has acknowleged the bug and will publish a 
>>>>>>>>>> knowledge base article about it
>>>>>>>>>> but a fix may show up only in a future version of Windows. 
>>>>>>>>>> Not, it seems, any time soon.
>>>>>>>>>> Below is a list of potentially affected GDI APIs. Per 
>>>>>>>>>> microsoft whether it actually manifests in
>>>>>>>>>> any specific case depends on "branching"
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -phil.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/2d-dev/attachments/20200611/983f3eb7/attachment-0001.htm>


More information about the 2d-dev mailing list