<AWT Dev> [OpenJDK 2D-Dev] RFR: 8240654 : ZGC can cause severe UI application repaint issues

Sergey Bylokhov Sergey.Bylokhov at oracle.com
Thu Jun 11 23:58:22 UTC 2020


+1

On 6/11/20 3:33 pm, Philip Race wrote:
> BitBlt is not on the list of affected APIs provided by Microsoft.
> And it is the JDK blit loop that copies from the Java heap memory into a
> Windows bitmap buffer, so I don't see it being affected.
> The BitBlt to the windows surface then operates on that windows allocated buffer.
> 
> -phil.
> 
> On 6/11/20, 2:47 PM, Sergey Bylokhov wrote:
>> And if -Dsun.java2d.uiScale is set to "2" it will use the normal Blit.
>>
>> It looks like the ScaledBlit uses "direct" access to the gdi surface.
>> ScaledBlit reads content of the GDIWindowSurfaceData in the GDIWinSD_GetRasInfo and write it back in the
>> GDIWinSD_Unlock via BitBlt. Not sure this code might be affected by this bug or not.
>>
>> BTW I just realized that the test depends on the blit of Swing backbuffer to the window,
>> it does not force usage of "memory->GDI" blit by itself.
>>
>> On 6/11/20 12:05 pm, Philip Race wrote:
>>> You can see the difference here from turning on tracing :-
>>>
>>> $ jdk-client/build/windows-x64/jdk/bin/java -Dsun.java2d.d3d=false -Dsun.java2d.uiScale=1.1 -Dsun.java2d.trace=log -XX:+UseZGC LargeWindowPaintTest
>>>
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.ScaledBlit::ScaledBlit(IntRgb, SrcNoEa, IntRgb)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillParallelogram::FillParallelogram(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.ScaledBlit::ScaledBlit(IntRgb, SrcNoEa, IntRgb)
>>> java.awt.Rectangle[x=0,y=0,width=1746,height=925]
>>>
>>>
>>> $ jdk-client/build/windows-x64/jdk/bin/java -Dsun.java2d.d3d=false -Dsun.java2d.uiScale=1 -Dsun.java2d.trace=log -XX:+UseZGC LargeWindowPaintTest
>>>
>>> sun.java2d.loops.FillRect::FillRect(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillRect::FillRect(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillRect::FillRect(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillRect::FillRect(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.loops.FillRect::FillRect(AnyColor, SrcNoEa, AnyInt)
>>> sun.java2d.windows.GDIBlitLoops::Blit(IntRgb, SrcNoEa, "GDI")
>>> java.awt.Rectangle[x=0,y=0,width=1920,height=1017]
>>>
>>> -phil
>>>
>>> On 6/11/2020 11:35 AM, Sergey Bylokhov wrote:
>>>> On 6/11/20 9:55 am, Philip Race wrote:
>>>>> I have confirmed hit a different code path. It goes through generic 2D s/w loops in this case.
>>>>> ie we don't use GDIBlitLoops at all. The code in sun/java2d/pipe/DrawImage.java ends up in
>>>>> scaleSurfaceData which uses the loops in ScaledBlit.c.
>>>>
>>>> But as far as I understand we somehow should use gdiblits at the end,
>>>> because only these blits can draw something on the screen when GDI is enabled(or not???).
>>>> So it is unclear why the GDIBlitLoops are not used in the HiDPI mode.
>>>>
>>>>>
>>>>> It is a bit surprising to me since I'd expect us to be able to blit directly at device resolution.
>>>>> Could we also be taking a performance hit here ? The D3D case doesn't not go through this loop.
>>>>>
>>>>> However all that is outside the scope of this fix ... I think setting uiScale=1 in the test is all that needs to be done.
>>>>>
>>>>> -phil.
>>>>>
>>>>> On 6/11/2020 7:51 AM, Philip Race wrote:
>>>>>> Or, maybe we hit a different code path. I'll check that.
>>>>>> uiScale=1 is the way to ensure we hit this code path.
>>>>>>
>>>>>> -phil.
>>>>>>
>>>>>> On 6/11/20, 7:44 AM, Philip Race wrote:
>>>>>>> Can I get clarification here.
>>>>>>>
>>>>>>> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see the failure with LargeWindowPaintTest.
>>>>>>>
>>>>>>> So you both mean a JDK 15 promoted build without this fix and without this property passes because you have
>>>>>>> a hidpi setup. And to see the failure without the fix you needed the above property.
>>>>>>> If so we could just be looking at a similar anomaly as I saw with printing which uses a very large
>>>>>>> image - it reported failure but actually worked !
>>>>>>>
>>>>>>> Also - for both of you - with the fix and without forcing uiScale=1 does the test pass ?
>>>>>>>
>>>>>>> -phil.
>>>>>>>
>>>>>>> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>>>>>>>> Yes my machine was at 150% scaling.
>>>>>>>>
>>>>>>>> If I force uiScale = 1, I see that:
>>>>>>>> LargeWindowPaintTest fails without patch and passes with patch.
>>>>>>>> AlphaPrintTest shows instructions without patch also.
>>>>>>>>
>>>>>>>> @Phil : I think its better if we test at uiScale=1(larger memory footprint). Please clarify.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jay
>>>>>>>>
>>>>>>>>> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth <kevin.rushforth at oracle.com <mailto:kevin.rushforth at oracle.com>> wrote:
>>>>>>>>>
>>>>>>>>> Do you have a Hi-DPI machine? I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see the failure with LargeWindowPaintTest.
>>>>>>>>>
>>>>>>>>> For AlphaPrintTest, the test deliberately ensures that you print before saying whether it passes or not. FWIW, I verified that the printing test on my system was hitting the fallback code with the patch, but it seemed to print correctly even without the patch.
>>>>>>>>>
>>>>>>>>> -- Kevin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>>>>>>>>>> Typo : I tried tested -> I tried testing
>>>>>>>>>>
>>>>>>>>>>> On 11-Jun-2020, at 2:27 PM, Jayathirth D v <JAYATHIRTH.D.V at ORACLE.COM <mailto:JAYATHIRTH.D.V at ORACLE.COM>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Phil,
>>>>>>>>>>>
>>>>>>>>>>> I tried tested the fix in my Windows 10 machine with Intel integrated UHD Graphics 620.
>>>>>>>>>>>
>>>>>>>>>>> LargeWindowPaintTest.java passes with/without fix in my machine.
>>>>>>>>>>> AlphaPrintTest.java without fix just opens up blank frame without any instructions and with fix it shows instructions for the test.
>>>>>>>>>>> Is this expected behaviour?
>>>>>>>>>>>
>>>>>>>>>>> AlphaPrintTest.java with fix when it shows instructions if I click on Pass(Since I don’t have printer right now) it doesn’t pass/close the window. Only after I click on Print button and then close print dialog it allows me to click on Pass button.
>>>>>>>>>>>
>>>>>>>>>>> Also how does these tests behave in our internal CI machines?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jay
>>>>>>>>>>>
>>>>>>>>>>>> On 11-Jun-2020, at 2:18 AM, Philip Race <philip.race at oracle.com <mailto:philip.race at oracle.com>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>>>>>>>>>>>>
>>>>>>>>>>>> This is for JDK 15 so review ASAP please since RDP 1 and the test cycle are looming.
>>>>>>>>>>>>
>>>>>>>>>>>> This is not a fix for a JDK bug. It is a bunch of workarounds for a Microsoft Windows bug affecting
>>>>>>>>>>>> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
>>>>>>>>>>>> Some extra details about the Windows bug at the end, but first the technical details of the fix.
>>>>>>>>>>>>
>>>>>>>>>>>> With ZGC's memory allocation requirement of reserving memory in 2Mb chunks  some Windows GDI
>>>>>>>>>>>> functions, mostly involving some bitmaps APIs may return a failure code (ie fail!)
>>>>>>>>>>>> This typically occurs when Java heap memory is used for a Java image and then in a JNI
>>>>>>>>>>>> call we use GetPrimitiveArrayCritical so that Java heap allocated memory is passed to a GDI
>>>>>>>>>>>> function AND the Java heap memory spans one of the 2Mb boundaries.
>>>>>>>>>>>> This is very easy to trigger in almost any Java UI app if the window is of a large enough (ie typical) size.
>>>>>>>>>>>> NB: if you have an Nvidia or ATI card, then you won't see it, because the D3D pipeline doesn't
>>>>>>>>>>>> call the affected method but if you have an Intel chip as do 90% (?) of laptops you will see it.
>>>>>>>>>>>> There are also several other places we found that are affected. Printing is the other one
>>>>>>>>>>>> somewhat easy to trigger. The others : custom cursors and tray icons are less common.
>>>>>>>>>>>> The painful thing here is that there is no definitive list (a list of the known ones is below) of
>>>>>>>>>>>> affected Windows GDI APIs and we are just hunting around our code trying to see where it
>>>>>>>>>>>> might be side-swiped by this bug.
>>>>>>>>>>>>
>>>>>>>>>>>> The basic approach in these workarounds is that for cases where performance does not matter we now copy
>>>>>>>>>>>> and for cases where performance does matter or larger amounts of memory is involved we check if
>>>>>>>>>>>> the return value of the GDI function indicates failure and then re-try with a copy of the heap memory.
>>>>>>>>>>>> Unless GDI was randomly failing already (unlikely) this should be a no-risk solution in the high profile cases.
>>>>>>>>>>>> We have done performance measurements on the important screen case and the failures
>>>>>>>>>>>> happen fast so the penalty is then in the re-try which is only if ZGC is enabled.
>>>>>>>>>>>> Always copying the memory is slower (and memcpy is the slow operation) than an alternative approach
>>>>>>>>>>>> that "knows" about the memory allocation of ZGC but this coupling and the complexity seem like they aren't
>>>>>>>>>>>> worth it since I haven't seen any visible performance consequence. That can be revisited
>>>>>>>>>>>> some day if need be, but for now we have correctness which is the key as well as sufficient performance.
>>>>>>>>>>>>
>>>>>>>>>>>> I've created an automated test for the most important on-screen case.
>>>>>>>>>>>> Also a manual printing test case which invokes ZGC is provided since there we also only
>>>>>>>>>>>> conditionally copy. In the other cases we now always copy so existing test cases should over those.
>>>>>>>>>>>>
>>>>>>>>>>>> There is some clean up in this fix - one completely unused (provably so because it was #if'd out)
>>>>>>>>>>>> JNI method in awt_PrintJob.cpp is removed since it had code that looked like it needed a workaround,
>>>>>>>>>>>> which would be somewhat of a waste of effort.
>>>>>>>>>>>>
>>>>>>>>>>>> the doPrintBand code and its callee bitsToDevice has code I think we can remove too since
>>>>>>>>>>>> I don't see how it ever gets executed (the top down case for browserPrint == true) but
>>>>>>>>>>>> I think I'll save that for a P4 follow-on since it does nothing that would be affected by this
>>>>>>>>>>>> Windows bug.
>>>>>>>>>>>>
>>>>>>>>>>>> One oddity is the in the printing case I observed that some times the rendering is performed
>>>>>>>>>>>> even if an error code is returned. I don't know why, but in code we can't tell that it was actually
>>>>>>>>>>>> rendered and in any case there is no harm in repeating the call with copied memory.
>>>>>>>>>>>>
>>>>>>>>>>>> We are right before the JDK15 stabilisation fork and this fix needs to go there and will
>>>>>>>>>>>> but the webrev is against jdk/client simply because jdk15 does not exist yet !
>>>>>>>>>>>>
>>>>>>>>>>>> Please test and review ASAP.
>>>>>>>>>>>>
>>>>>>>>>>>> About the bug:
>>>>>>>>>>>> Microsoft has acknowleged the bug and will publish a knowledge base article about it
>>>>>>>>>>>> but a fix may show up only in a future version of Windows. Not, it seems, any time soon.
>>>>>>>>>>>> Below is a list of potentially affected GDI APIs. Per microsoft whether it actually manifests in
>>>>>>>>>>>> any specific case depends on "branching"
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -phil.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>


-- 
Best regards, Sergey.


More information about the awt-dev mailing list