[OpenJDK 2D-Dev] <AWT Dev> RFR: 8240654 : ZGC can cause severe UI application repaint issues

Philip Race philip.race at oracle.com
Thu Jun 11 16:55:13 UTC 2020


I have confirmed hit a different code path. It goes through generic 2D 
s/w loops in this case.
ie we don't use GDIBlitLoops at all. The code in 
sun/java2d/pipe/DrawImage.java ends up in
scaleSurfaceData which uses the loops in ScaledBlit.c.

It is a bit surprising to me since I'd expect us to be able to blit 
directly at device resolution.
Could we also be taking a performance hit here ? The D3D case doesn't 
not go through this loop.

However all that is outside the scope of this fix ... I think setting 
uiScale=1 in the test is all that needs to be done.

-phil.

On 6/11/2020 7:51 AM, Philip Race wrote:
> Or, maybe we hit a different code path. I'll check that.
> uiScale=1 is the way to ensure we hit this code path.
>
> -phil.
>
> On 6/11/20, 7:44 AM, Philip Race wrote:
>> Can I get clarification here.
>>
>> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see 
>> the failure with LargeWindowPaintTest.
>>
>> So you both mean a JDK 15 promoted build without this fix and without 
>> this property passes because you have
>> a hidpi setup. And to see the failure without the fix you needed the 
>> above property.
>> If so we could just be looking at a similar anomaly as I saw with 
>> printing which uses a very large
>> image - it reported failure but actually worked !
>>
>> Also - for both of you - with the fix and without forcing uiScale=1 
>> does the test pass ?
>>
>> -phil.
>>
>> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>>> Yes my machine was at 150% scaling.
>>>
>>> If I force uiScale = 1, I see that:
>>> LargeWindowPaintTest fails without patch and passes with patch.
>>> AlphaPrintTest shows instructions without patch also.
>>>
>>> @Phil : I think its better if we test at uiScale=1(larger memory 
>>> footprint). Please clarify.
>>>
>>> Thanks,
>>> Jay
>>>
>>>> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth 
>>>> <kevin.rushforth at oracle.com <mailto:kevin.rushforth at oracle.com>> wrote:
>>>>
>>>> Do you have a Hi-DPI machine? I do, and had to run with 
>>>> "-Dsun.java2d.uiScale=1" in order to see the failure with 
>>>> LargeWindowPaintTest.
>>>>
>>>> For AlphaPrintTest, the test deliberately ensures that you print 
>>>> before saying whether it passes or not. FWIW, I verified that the 
>>>> printing test on my system was hitting the fallback code with the 
>>>> patch, but it seemed to print correctly even without the patch.
>>>>
>>>> -- Kevin
>>>>
>>>>
>>>> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>>>>> Typo : I tried tested -> I tried testing
>>>>>
>>>>>> On 11-Jun-2020, at 2:27 PM, Jayathirth D v 
>>>>>> <JAYATHIRTH.D.V at ORACLE.COM <mailto:JAYATHIRTH.D.V at ORACLE.COM>> wrote:
>>>>>>
>>>>>> Hi Phil,
>>>>>>
>>>>>> I tried tested the fix in my Windows 10 machine with Intel 
>>>>>> integrated UHD Graphics 620.
>>>>>>
>>>>>> LargeWindowPaintTest.java passes with/without fix in my machine.
>>>>>> AlphaPrintTest.java without fix just opens up blank frame without 
>>>>>> any instructions and with fix it shows instructions for the test.
>>>>>> Is this expected behaviour?
>>>>>>
>>>>>> AlphaPrintTest.java with fix when it shows instructions if I 
>>>>>> click on Pass(Since I don’t have printer right now) it doesn’t 
>>>>>> pass/close the window. Only after I click on Print button and 
>>>>>> then close print dialog it allows me to click on Pass button.
>>>>>>
>>>>>> Also how does these tests behave in our internal CI machines?
>>>>>>
>>>>>> Thanks,
>>>>>> Jay
>>>>>>
>>>>>>> On 11-Jun-2020, at 2:18 AM, Philip Race <philip.race at oracle.com 
>>>>>>> <mailto:philip.race at oracle.com>> wrote:
>>>>>>>
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
>>>>>>> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>>>>>>>
>>>>>>> This is for JDK 15 so review ASAP please since RDP 1 and the 
>>>>>>> test cycle are looming.
>>>>>>>
>>>>>>> This is not a fix for a JDK bug. It is a bunch of workarounds 
>>>>>>> for a Microsoft Windows bug affecting
>>>>>>> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
>>>>>>> Some extra details about the Windows bug at the end, but first 
>>>>>>> the technical details of the fix.
>>>>>>>
>>>>>>> With ZGC's memory allocation requirement of reserving memory in 
>>>>>>> 2Mb chunks  some Windows GDI
>>>>>>> functions, mostly involving some bitmaps APIs may return a 
>>>>>>> failure code (ie fail!)
>>>>>>> This typically occurs when Java heap memory is used for a Java 
>>>>>>> image and then in a JNI
>>>>>>> call we use GetPrimitiveArrayCritical so that Java heap 
>>>>>>> allocated memory is passed to a GDI
>>>>>>> function AND the Java heap memory spans one of the 2Mb boundaries.
>>>>>>> This is very easy to trigger in almost any Java UI app if the 
>>>>>>> window is of a large enough (ie typical) size.
>>>>>>> NB: if you have an Nvidia or ATI card, then you won't see it, 
>>>>>>> because the D3D pipeline doesn't
>>>>>>> call the affected method but if you have an Intel chip as do 90% 
>>>>>>> (?) of laptops you will see it.
>>>>>>> There are also several other places we found that are affected. 
>>>>>>> Printing is the other one
>>>>>>> somewhat easy to trigger. The others : custom cursors and tray 
>>>>>>> icons are less common.
>>>>>>> The painful thing here is that there is no definitive list (a 
>>>>>>> list of the known ones is below) of
>>>>>>> affected Windows GDI APIs and we are just hunting around our 
>>>>>>> code trying to see where it
>>>>>>> might be side-swiped by this bug.
>>>>>>>
>>>>>>> The basic approach in these workarounds is that for cases where 
>>>>>>> performance does not matter we now copy
>>>>>>> and for cases where performance does matter or larger amounts of 
>>>>>>> memory is involved we check if
>>>>>>> the return value of the GDI function indicates failure and then 
>>>>>>> re-try with a copy of the heap memory.
>>>>>>> Unless GDI was randomly failing already (unlikely) this should 
>>>>>>> be a no-risk solution in the high profile cases.
>>>>>>> We have done performance measurements on the important screen 
>>>>>>> case and the failures
>>>>>>> happen fast so the penalty is then in the re-try which is only 
>>>>>>> if ZGC is enabled.
>>>>>>> Always copying the memory is slower (and memcpy is the slow 
>>>>>>> operation) than an alternative approach
>>>>>>> that "knows" about the memory allocation of ZGC but this 
>>>>>>> coupling and the complexity seem like they aren't
>>>>>>> worth it since I haven't seen any visible performance 
>>>>>>> consequence. That can be revisited
>>>>>>> some day if need be, but for now we have correctness which is 
>>>>>>> the key as well as sufficient performance.
>>>>>>>
>>>>>>> I've created an automated test for the most important on-screen 
>>>>>>> case.
>>>>>>> Also a manual printing test case which invokes ZGC is provided 
>>>>>>> since there we also only
>>>>>>> conditionally copy. In the other cases we now always copy so 
>>>>>>> existing test cases should over those.
>>>>>>>
>>>>>>> There is some clean up in this fix - one completely unused 
>>>>>>> (provably so because it was #if'd out)
>>>>>>> JNI method in awt_PrintJob.cpp is removed since it had code that 
>>>>>>> looked like it needed a workaround,
>>>>>>> which would be somewhat of a waste of effort.
>>>>>>>
>>>>>>> the doPrintBand code and its callee bitsToDevice has code I 
>>>>>>> think we can remove too since
>>>>>>> I don't see how it ever gets executed (the top down case for 
>>>>>>> browserPrint == true) but
>>>>>>> I think I'll save that for a P4 follow-on since it does nothing 
>>>>>>> that would be affected by this
>>>>>>> Windows bug.
>>>>>>>
>>>>>>> One oddity is the in the printing case I observed that some 
>>>>>>> times the rendering is performed
>>>>>>> even if an error code is returned. I don't know why, but in code 
>>>>>>> we can't tell that it was actually
>>>>>>> rendered and in any case there is no harm in repeating the call 
>>>>>>> with copied memory.
>>>>>>>
>>>>>>> We are right before the JDK15 stabilisation fork and this fix 
>>>>>>> needs to go there and will
>>>>>>> but the webrev is against jdk/client simply because jdk15 does 
>>>>>>> not exist yet !
>>>>>>>
>>>>>>> Please test and review ASAP.
>>>>>>>
>>>>>>> About the bug:
>>>>>>> Microsoft has acknowleged the bug and will publish a knowledge 
>>>>>>> base article about it
>>>>>>> but a fix may show up only in a future version of Windows. Not, 
>>>>>>> it seems, any time soon.
>>>>>>> Below is a list of potentially affected GDI APIs. Per microsoft 
>>>>>>> whether it actually manifests in
>>>>>>> any specific case depends on "branching"
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>>>>>>>
>>>>>>>
>>>>>>> -phil.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/2d-dev/attachments/20200611/f6117409/attachment-0001.htm>


More information about the 2d-dev mailing list