[OpenJDK 2D-Dev] <AWT Dev> RFR: 8240654 : ZGC can cause severe UI application repaint issues

Martijn Verburg martijnverburg at gmail.com
Thu Jun 25 10:46:00 UTC 2020


I can't make any concrete promises but this is on our internal list of
things to try and find a resolution for with the Windows folks (I'm the
Java Engineering Manager).  One of my team will likely spin up a new thread
if/when we have further details on this issue.

Cheers,
Martijn


On Tue, 23 Jun 2020 at 18:33, Philip Race <philip.race at oracle.com> wrote:

> For the record, Microsoft have no published a KB article acknowledging the
> problem:-
>
> https://support.microsoft.com/en-us/help/4567569/gdi-apis-may-fail-when-large-pages-or-vad-spanning-is-used
>
>
> -phil.
>
>
>
> On 6/11/20, 9:12 PM, Jayathirth D v wrote:
>
> +1.
>
> Thanks,
> Jay
>
> On 12-Jun-2020, at 12:16 AM, Philip Race <philip.race at oracle.com> wrote:
>
>
> one more update to the tests
> cr.openjdk.java.net/~prr/8240654.2/
> <http://cr.openjdk.java.net/%7Eprr/8240654.2/>
>
> I added
> @requires vm.gc.Z per the VM folks, ZGC needs Windows Server 2019 or the
> same vintage Window 10. if run on Windows Server 2016
> ZGC errors out. This should prevent that. -phil.
> On 6/11/2020 11:13 AM, Kevin Rushforth wrote:
>
> +1
>
> The updated LargeWindowPaintTest fails for me without the fix without my
> having to manually set uiScale. It passes with the fix.
>
> Interestingly enough, I finally saw the problem that Jay reported with
> AlphaPrintTest: without the fix I initially get a blank (all white) window.
> If I resize it then it is drawn. With the fix everything is fine.
>
> -- Kevin
>
> On 6/11/2020 10:48 AM, Philip Race wrote:
>
> Updated webrev here
>
> http://cr.openjdk.java.net/~prr/8240654.1/index.html
>
> The only changes are to the tests - to add -Dsun.java2d.uiScale=1 to the
> onscreen test
> and to add printer to the keys for the printing test.
>
> It was pointed out by Stefan from the ZGC team that the changes in
> awt_TrayIcon.cpp and awt_Cursor.cpp
> should not be needed because the GDI code in Create_BMP that ultimately
> consumes the data
> has processed and copied it into memory allocated by CreateDIBSection
> before passing it to
> CreateBitmap. I considered reverting those two files but decided to keep
> them because I
> think I would like this fix anyway. We really don't need to lock down the
> VM in these cases.
>
> -phil.
>
>
> On 6/11/20, 9:55 AM, Philip Race wrote:
>
> I have confirmed hit a different code path. It goes through generic 2D s/w
> loops in this case.
> ie we don't use GDIBlitLoops at all. The code in
> sun/java2d/pipe/DrawImage.java ends up in
> scaleSurfaceData which uses the loops in ScaledBlit.c.
>
> It is a bit surprising to me since I'd expect us to be able to blit
> directly at device resolution.
> Could we also be taking a performance hit here ? The D3D case doesn't not
> go through this loop.
>
> However all that is outside the scope of this fix ... I think setting
> uiScale=1 in the test is all that needs to be done.
>
> -phil.
>
> On 6/11/2020 7:51 AM, Philip Race wrote:
>
> Or, maybe we hit a different code path. I'll check that.
> uiScale=1 is the way to ensure we hit this code path.
>
> -phil.
>
> On 6/11/20, 7:44 AM, Philip Race wrote:
>
> Can I get clarification here.
>
> > I do, and had to run with "-Dsun.java2d.uiScale=1" in order to see the
> failure with LargeWindowPaintTest.
>
> So you both mean a JDK 15 promoted build without this fix and without this
> property passes because you have
> a hidpi setup. And to see the failure without the fix you needed the above
> property.
> If so we could just be looking at a similar anomaly as I saw with printing
> which uses a very large
> image - it reported failure but actually worked !
>
> Also - for both of you - with the fix and without forcing uiScale=1 does
> the test pass ?
>
> -phil.
>
> On 6/11/20, 7:10 AM, Jayathirth D v wrote:
>
> Yes my machine was at 150% scaling.
>
> If I force uiScale = 1, I see that:
> LargeWindowPaintTest fails without patch and passes with patch.
> AlphaPrintTest shows instructions without patch also.
>
> @Phil : I think its better if we test at uiScale=1(larger memory
> footprint). Please clarify.
>
> Thanks,
> Jay
>
> On 11-Jun-2020, at 5:53 PM, Kevin Rushforth <kevin.rushforth at oracle.com>
> wrote:
>
> Do you have a Hi-DPI machine? I do, and had to run with
> "-Dsun.java2d.uiScale=1" in order to see the failure with
> LargeWindowPaintTest.
>
> For AlphaPrintTest, the test deliberately ensures that you print before
> saying whether it passes or not. FWIW, I verified that the printing test on
> my system was hitting the fallback code with the patch, but it seemed to
> print correctly even without the patch.
>
> -- Kevin
>
>
> On 6/11/2020 1:58 AM, Jayathirth D v wrote:
>
> Typo : I tried tested -> I tried testing
>
> On 11-Jun-2020, at 2:27 PM, Jayathirth D v <JAYATHIRTH.D.V at ORACLE.COM>
> wrote:
>
> Hi Phil,
>
> I tried tested the fix in my Windows 10 machine with Intel integrated UHD
> Graphics 620.
>
> LargeWindowPaintTest.java passes with/without fix in my machine.
> AlphaPrintTest.java without fix just opens up blank frame without any
> instructions and with fix it shows instructions for the test.
> Is this expected behaviour?
>
> AlphaPrintTest.java with fix when it shows instructions if I click on
> Pass(Since I don’t have printer right now) it doesn’t pass/close the
> window. Only after I click on Print button and then close print dialog it
> allows me to click on Pass button.
>
> Also how does these tests behave in our internal CI machines?
>
> Thanks,
> Jay
>
> On 11-Jun-2020, at 2:18 AM, Philip Race <philip.race at oracle.com> wrote:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8240654
> Webrev: http://cr.openjdk.java.net/~prr/8240654/index.html
>
> This is for JDK 15 so review ASAP please since RDP 1 and the test cycle
> are looming.
>
> This is not a fix for a JDK bug. It is a bunch of workarounds for a
> Microsoft Windows bug affecting
> GDI in the context of ZGC (http://openjdk.java.net/jeps/333).
> Some extra details about the Windows bug at the end, but first the
> technical details of the fix.
>
> With ZGC's memory allocation requirement of reserving memory in 2Mb
> chunks  some Windows GDI
> functions, mostly involving some bitmaps APIs may return a failure code
> (ie fail!)
> This typically occurs when Java heap memory is used for a Java image and
> then in a JNI
> call we use GetPrimitiveArrayCritical so that Java heap allocated memory
> is passed to a GDI
> function AND the Java heap memory spans one of the 2Mb boundaries.
> This is very easy to trigger in almost any Java UI app if the window is of
> a large enough (ie typical) size.
> NB: if you have an Nvidia or ATI card, then you won't see it, because the
> D3D pipeline doesn't
> call the affected method but if you have an Intel chip as do 90% (?) of
> laptops you will see it.
> There are also several other places we found that are affected. Printing
> is the other one
> somewhat easy to trigger. The others : custom cursors and tray icons are
> less common.
> The painful thing here is that there is no definitive list (a list of the
> known ones is below) of
> affected Windows GDI APIs and we are just hunting around our code trying
> to see where it
> might be side-swiped by this bug.
>
> The basic approach in these workarounds is that for cases where
> performance does not matter we now copy
> and for cases where performance does matter or larger amounts of memory is
> involved we check if
> the return value of the GDI function indicates failure and then re-try
> with a copy of the heap memory.
> Unless GDI was randomly failing already (unlikely) this should be a
> no-risk solution in the high profile cases.
> We have done performance measurements on the important screen case and the
> failures
> happen fast so the penalty is then in the re-try which is only if ZGC is
> enabled.
> Always copying the memory is slower (and memcpy is the slow operation)
> than an alternative approach
> that "knows" about the memory allocation of ZGC but this coupling and the
> complexity seem like they aren't
> worth it since I haven't seen any visible performance consequence. That
> can be revisited
> some day if need be, but for now we have correctness which is the key as
> well as sufficient performance.
>
> I've created an automated test for the most important on-screen case.
> Also a manual printing test case which invokes ZGC is provided since there
> we also only
> conditionally copy. In the other cases we now always copy so existing test
> cases should over those.
>
> There is some clean up in this fix - one completely unused  (provably so
> because it was #if'd out)
> JNI method in awt_PrintJob.cpp is removed since it had code that looked
> like it needed a workaround,
> which would be somewhat of a waste of effort.
>
> the doPrintBand code and its callee bitsToDevice has code I think we can
> remove too since
> I don't see how it ever gets executed (the top down case for browserPrint
> == true) but
> I think I'll save that for a P4 follow-on since it does nothing that would
> be affected by this
> Windows bug.
>
> One oddity is the in the printing case I observed that some times the
> rendering is performed
> even if an error code is returned. I don't know why, but in code we can't
> tell that it was actually
> rendered and in any case there is no harm in repeating the call with
> copied memory.
>
> We are right before the JDK15 stabilisation fork and this fix needs to go
> there and will
> but the webrev is against jdk/client simply because jdk15 does not exist
> yet !
>
> Please test and review ASAP.
>
> About the bug:
> Microsoft has acknowleged the bug and will publish a knowledge base
> article about it
> but a fix may show up only in a future version of Windows. Not, it seems,
> any time soon.
> Below is a list of potentially affected GDI APIs. Per microsoft whether it
> actually manifests in
> any specific case depends on "branching"
>
>
> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/checkbitmapbits
>
>
> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/wcs/createcolortransform
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setdibitstodevice
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getbitmapbits
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibitmap
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-polydraw
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-drawescape
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createbitmap
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-setbitmapbits
>
>
> https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-getdibits
>
>
> -phil.
>
>
>
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/2d-dev/attachments/20200625/48917281/attachment-0001.htm>


More information about the 2d-dev mailing list