[OpenJDK 2D-Dev] RFR: 8240654 : ZGC can cause severe UI application repaint issues

Kevin Rushforth kevin.rushforth at oracle.com
Wed Jun 10 22:57:38 UTC 2020


+1

Code changes look good.

I verified that the bug happens on my Windows 10 laptop with SwingSet2 
and with the LargeWindowPaintTest without the patch, and everything 
looks good with the patch.

-- Kevin


On 6/10/2020 1:48 PM, Philip Race wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>
> This is for JDK 15 so review ASAP please since RDP 1 and the test 
> cycle are looming.
>
> This is not a fix for a JDK bug. It is a bunch of workarounds for a 
> Microsoft Windows bug affecting
> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
> Some extra details about the Windows bug at the end, but first the 
> technical details of the fix.
>
> With ZGC's memory allocation requirement of reserving memory in 2MB
> chunks, some Windows GDI
> functions, mostly involving bitmap APIs, may return a failure
> code (i.e. fail!).
> This typically occurs when Java heap memory is used for a Java image 
> and then in a JNI
> call we use GetPrimitiveArrayCritical so that Java heap allocated 
> memory is passed to a GDI
> function AND the Java heap memory spans one of the 2MB boundaries.
> This is very easy to trigger in almost any Java UI app if the window 
> is of a large enough (ie typical) size.
> NB: if you have an Nvidia or ATI card, then you won't see it, because
> the D3D pipeline doesn't
> call the affected method, but if you have an Intel chip, as do 90% (?)
> of laptops, you will see it.
> There are also several other places we found that are affected.
> Printing is the other one
> somewhat easy to trigger. The others, custom cursors and tray icons,
> are less common.
> The painful thing here is that there is no definitive list (a list of 
> the known ones is below) of
> affected Windows GDI APIs and we are just hunting around our code 
> trying to see where it
> might be side-swiped by this bug.
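
To make the failure condition above concrete, here is a minimal sketch of a check for whether a buffer crosses a 2MB boundary. It is an illustration only: the patch under review does not do this check (it retries with a copy instead), and the names kChunkSize and spansChunkBoundary are hypothetical. It also assumes the boundaries fall on address multiples of 2MB, which the original message does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: chunk boundaries sit at address multiples of 2MB.
constexpr std::uintptr_t kChunkSize = 2u * 1024u * 1024u;

// Hypothetical helper: returns true if the byte range [addr, addr + len)
// contains bytes on both sides of a multiple of kChunkSize.
bool spansChunkBoundary(std::uintptr_t addr, std::size_t len) {
    if (len == 0) {
        return false;  // an empty range cannot cross anything
    }
    assert(len <= UINTPTR_MAX - addr);  // guard against address wrap-around
    std::uintptr_t firstChunk = addr / kChunkSize;
    std::uintptr_t lastChunk = (addr + len - 1) / kChunkSize;
    return firstChunk != lastChunk;
}
```

A large window's pixel array easily exceeds 2MB, so under this assumption it would always cross at least one boundary, which fits the observation that typical window sizes trigger the problem.
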
>
> The basic approach in these workarounds is that for cases where 
> performance does not matter we now copy
> and for cases where performance does matter or larger amounts of
> memory are involved we check if
> the return value of the GDI function indicates failure and then re-try
> with a copy of the heap memory.
> Unless GDI was randomly failing already (unlikely) this should be a 
> no-risk solution in the high profile cases.
> We have done performance measurements on the important screen case and
> the failures
> happen fast, so the penalty is in the re-try, which happens only if ZGC
> is enabled.
> Always copying the memory is slower (and memcpy is the slow operation) 
> than an alternative approach
> that "knows" about the memory allocation of ZGC but this coupling and 
> the complexity seem like they aren't
> worth it since I haven't seen any visible performance consequence. 
> That can be revisited
> some day if need be, but for now we have correctness which is the key 
> as well as sufficient performance.
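
The conditional-copy path described above can be sketched roughly as follows. This is not the code from the webrev: BlitFn and blitWithRetry are hypothetical names, and the callback stands in for a GDI call that is assumed to return zero on failure (as SetDIBitsToDevice is documented to do). In the real JNI code the first pointer would come from GetPrimitiveArrayCritical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a GDI bitmap call.
// Assumption: a return value of 0 means the call failed.
using BlitFn = std::function<int(const std::uint8_t* bits, std::size_t len)>;

// Try the call on the caller's buffer first (the fast path, no copy).
// Only if that fails, copy the pixels to natively allocated memory and retry.
int blitWithRetry(const BlitFn& blit, const std::uint8_t* bits, std::size_t len) {
    assert(bits != nullptr);
    int result = blit(bits, len);
    if (result != 0) {
        return result;  // succeeded without copying
    }
    // Slow path: the copy lives outside the Java heap.
    std::vector<std::uint8_t> copy(bits, bits + len);
    return blit(copy.data(), len);
}
```

With this shape the copy cost is paid only after a failed first attempt. As noted above, a genuine GDI failure unrelated to ZGC would simply fail a second time, so the retry adds no risk beyond one extra call.
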
>
> I've created an automated test for the most important on-screen case.
> Also a manual printing test case which invokes ZGC is provided since 
> there we also only
> conditionally copy. In the other cases we now always copy so existing
> test cases should cover those.
>
> There is some clean up in this fix - one completely unused (provably 
> so because it was #if'd out)
> JNI method in awt_PrintJob.cpp is removed since it had code that 
> looked like it needed a workaround,
> which would be somewhat of a waste of effort.
>
> The doPrintBand code and its callee bitsToDevice have code I think we
> can remove too since
> I don't see how it ever gets executed (the top down case for 
> browserPrint == true) but
> I think I'll save that for a P4 follow-on since it does nothing that 
> would be affected by this
> Windows bug.
>
> One oddity is that in the printing case I observed that sometimes the
> rendering is performed
> even if an error code is returned. I don't know why, but in code we 
> can't tell that it was actually
> rendered and in any case there is no harm in repeating the call with 
> copied memory.
>
> We are right before the JDK 15 stabilisation fork and this fix needs to
> go there, and will,
> but the webrev is against jdk/client simply because jdk15 does not
> exist yet!
>
> Please test and review ASAP.
>
> About the bug:
> Microsoft has acknowledged the bug and will publish a knowledge base
> article about it
> but a fix may show up only in a future version of Windows. Not, it 
> seems, any time soon.
> Below is a list of potentially affected GDI APIs. Per Microsoft,
> whether it actually manifests in
> any specific case depends on "branching".
>
> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>
> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>
>
> -phil.
>
