<AWT Dev> [OpenJDK 2D-Dev] RFR: 8240654 : ZGC can cause severe UI application repaint issues
Philip Race
philip.race at oracle.com
Thu Jun 11 16:55:13 UTC 2020
I have confirmed hit a different code path. It goes through generic 2D
s/w loops in this case.
ie we don't use GDIBlitLoops at all. The code in
sun/java2d/pipe/DrawImage.java ends up in
scaleSurfaceData which uses the loops in ScaledBlit.c.
It is a bit surprising to me since I'd expect us to be able to blit
directly at device resolution.
Could we also be taking a performance hit here ? The D3D case doesn't
not go through this loop.
However all that is outside the scope of this fix ... I think setting
uiScale=1 in the test is all that needs to be done.
-phil.
On 6/11/2020 7:51 AM, Philip Race wrote:
> Or, maybe we hit a different code path. I'll check that.
> uiScale=1 is the way to ensure we hit this code path.
>
> -phil.
>
> On 6/11/20, 7:44 AM, Philip Race wrote:
>> Can I get clarification here.
>>
>> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see
>> the failure with LargeWindowPaintTest.
>>
>> So you both mean a JDK 15 promoted build without this fix and without
>> this property passes because you have
>> a hidpi setup. And to see the failure without the fix you needed the
>> above property.
>> If so we could just be looking at a similar anomaly as I saw with
>> printing which uses a very large
>> image - it reported failure but actually worked !
>>
>> Also - for both of you - with the fix and without forcing uiScale=1
>> does the test pass ?
>>
>> -phil.
>>
>> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>>> Yes my machine was at 150% scaling.
>>>
>>> If I force uiScale = 1, I see that:
>>> LargeWindowPaintTest fails without patch and passes with patch.
>>> AlphaPrintTest shows instructions without patch also.
>>>
>>> @Phil : I think its better if we test at uiScale=1(larger memory
>>> footprint). Please clarify.
>>>
>>> Thanks,
>>> Jay
>>>
>>>> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth
>>>> <kevin.rushforth at oracle.com <mailto:kevin.rushforth at oracle.com>> wrote:
>>>>
>>>> Do you have a Hi-DPI machine? I do, and had to run with
>>>> "-Dsun.java2d.uiScale=1" in order to see the failure with
>>>> LargeWindowPaintTest.
>>>>
>>>> For AlphaPrintTest, the test deliberately ensures that you print
>>>> before saying whether it passes or not. FWIW, I verified that the
>>>> printing test on my system was hitting the fallback code with the
>>>> patch, but it seemed to print correctly even without the patch.
>>>>
>>>> -- Kevin
>>>>
>>>>
>>>> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>>>>> Typo : I tried tested -> I tried testing
>>>>>
>>>>>> On 11-Jun-2020, at 2:27 PM, Jayathirth D v
>>>>>> <JAYATHIRTH.D.V at ORACLE.COM <mailto:JAYATHIRTH.D.V at ORACLE.COM>> wrote:
>>>>>>
>>>>>> Hi Phil,
>>>>>>
>>>>>> I tried tested the fix in my Windows 10 machine with Intel
>>>>>> integrated UHD Graphics 620.
>>>>>>
>>>>>> LargeWindowPaintTest.java passes with/without fix in my machine.
>>>>>> AlphaPrintTest.java without fix just opens up blank frame without
>>>>>> any instructions and with fix it shows instructions for the test.
>>>>>> Is this expected behaviour?
>>>>>>
>>>>>> AlphaPrintTest.java with fix when it shows instructions if I
>>>>>> click on Pass(Since I don’t have printer right now) it doesn’t
>>>>>> pass/close the window. Only after I click on Print button and
>>>>>> then close print dialog it allows me to click on Pass button.
>>>>>>
>>>>>> Also how does these tests behave in our internal CI machines?
>>>>>>
>>>>>> Thanks,
>>>>>> Jay
>>>>>>
>>>>>>> On 11-Jun-2020, at 2:18 AM, Philip Race <philip.race at oracle.com
>>>>>>> <mailto:philip.race at oracle.com>> wrote:
>>>>>>>
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
>>>>>>> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>>>>>>>
>>>>>>> This is for JDK 15 so review ASAP please since RDP 1 and the
>>>>>>> test cycle are looming.
>>>>>>>
>>>>>>> This is not a fix for a JDK bug. It is a bunch of workarounds
>>>>>>> for a Microsoft Windows bug affecting
>>>>>>> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
>>>>>>> Some extra details about the Windows bug at the end, but first
>>>>>>> the technical details of the fix.
>>>>>>>
>>>>>>> With ZGC's memory allocation requirement of reserving memory in
>>>>>>> 2Mb chunks some Windows GDI
>>>>>>> functions, mostly involving some bitmaps APIs may return a
>>>>>>> failure code (ie fail!)
>>>>>>> This typically occurs when Java heap memory is used for a Java
>>>>>>> image and then in a JNI
>>>>>>> call we use GetPrimitiveArrayCritical so that Java heap
>>>>>>> allocated memory is passed to a GDI
>>>>>>> function AND the Java heap memory spans one of the 2Mb boundaries.
>>>>>>> This is very easy to trigger in almost any Java UI app if the
>>>>>>> window is of a large enough (ie typical) size.
>>>>>>> NB: if you have an Nvidia or ATI card, then you won't see it,
>>>>>>> because the D3D pipeline doesn't
>>>>>>> call the affected method but if you have an Intel chip as do 90%
>>>>>>> (?) of laptops you will see it.
>>>>>>> There are also several other places we found that are affected.
>>>>>>> Printing is the other one
>>>>>>> somewhat easy to trigger. The others : custom cursors and tray
>>>>>>> icons are less common.
>>>>>>> The painful thing here is that there is no definitive list (a
>>>>>>> list of the known ones is below) of
>>>>>>> affected Windows GDI APIs and we are just hunting around our
>>>>>>> code trying to see where it
>>>>>>> might be side-swiped by this bug.
>>>>>>>
>>>>>>> The basic approach in these workarounds is that for cases where
>>>>>>> performance does not matter we now copy
>>>>>>> and for cases where performance does matter or larger amounts of
>>>>>>> memory is involved we check if
>>>>>>> the return value of the GDI function indicates failure and then
>>>>>>> re-try with a copy of the heap memory.
>>>>>>> Unless GDI was randomly failing already (unlikely) this should
>>>>>>> be a no-risk solution in the high profile cases.
>>>>>>> We have done performance measurements on the important screen
>>>>>>> case and the failures
>>>>>>> happen fast so the penalty is then in the re-try which is only
>>>>>>> if ZGC is enabled.
>>>>>>> Always copying the memory is slower (and memcpy is the slow
>>>>>>> operation) than an alternative approach
>>>>>>> that "knows" about the memory allocation of ZGC but this
>>>>>>> coupling and the complexity seem like they aren't
>>>>>>> worth it since I haven't seen any visible performance
>>>>>>> consequence. That can be revisited
>>>>>>> some day if need be, but for now we have correctness which is
>>>>>>> the key as well as sufficient performance.
>>>>>>>
>>>>>>> I've created an automated test for the most important on-screen
>>>>>>> case.
>>>>>>> Also a manual printing test case which invokes ZGC is provided
>>>>>>> since there we also only
>>>>>>> conditionally copy. In the other cases we now always copy so
>>>>>>> existing test cases should over those.
>>>>>>>
>>>>>>> There is some clean up in this fix - one completely unused
>>>>>>> (provably so because it was #if'd out)
>>>>>>> JNI method in awt_PrintJob.cpp is removed since it had code that
>>>>>>> looked like it needed a workaround,
>>>>>>> which would be somewhat of a waste of effort.
>>>>>>>
>>>>>>> the doPrintBand code and its callee bitsToDevice has code I
>>>>>>> think we can remove too since
>>>>>>> I don't see how it ever gets executed (the top down case for
>>>>>>> browserPrint == true) but
>>>>>>> I think I'll save that for a P4 follow-on since it does nothing
>>>>>>> that would be affected by this
>>>>>>> Windows bug.
>>>>>>>
>>>>>>> One oddity is the in the printing case I observed that some
>>>>>>> times the rendering is performed
>>>>>>> even if an error code is returned. I don't know why, but in code
>>>>>>> we can't tell that it was actually
>>>>>>> rendered and in any case there is no harm in repeating the call
>>>>>>> with copied memory.
>>>>>>>
>>>>>>> We are right before the JDK15 stabilisation fork and this fix
>>>>>>> needs to go there and will
>>>>>>> but the webrev is against jdk/client simply because jdk15 does
>>>>>>> not exist yet !
>>>>>>>
>>>>>>> Please test and review ASAP.
>>>>>>>
>>>>>>> About the bug:
>>>>>>> Microsoft has acknowleged the bug and will publish a knowledge
>>>>>>> base article about it
>>>>>>> but a fix may show up only in a future version of Windows. Not,
>>>>>>> it seems, any time soon.
>>>>>>> Below is a list of potentially affected GDI APIs. Per microsoft
>>>>>>> whether it actually manifests in
>>>>>>> any specific case depends on "branching"
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>>>>>>>
>>>>>>> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>>>>>>>
>>>>>>>
>>>>>>> -phil.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/awt-dev/attachments/20200611/f6117409/attachment-0001.htm>
More information about the awt-dev
mailing list