RFR: 8313648: JavaFX application continues to show a black screen after graphic card driver crash
Thorsten Fischer
duke at openjdk.org
Sun Oct 1 15:18:56 UTC 2023
On Tue, 8 Aug 2023 17:54:03 GMT, Marius Hanl <mhanl at openjdk.org> wrote:
>> Hi,
>>
>> I did open the bug report. Some notes to this PR:
>>
>> My colleagues and I are able to reproduce this bug regularly, even though it takes sometimes up to 3 or 4 weeks until the D3DERR_DEVICEHUNG error shows up. We are currently evaluating two versions of fixes, but until now we do not have any results. I will post them as soon as I got them.
>>
>> Version 1 (this version): Based on the observation, that the TestCooperativeLevel/CheckDeviceState method returns D3D_OK again after about 20 - 60 seconds, the reinitialize is called after the first time the state is returning D3D_OK. The 'isHung' flag stores the information until then.
>>
>> Version 2: calls reinitialize directly after D3DERR_DEVICEHUNG has been returned. Basically
>> if (hr == D3DERR_DEVICEREMOVED || hr == D3DERR_DEVICEHUNG ) { .. }
>>
>> I did not modify the validatePresent method, as for our workaround (see ticket) it was not necessary. At least the native call swapchain->present does not return that error code (https://learn.microsoft.com/en-us/windows/win32/api/d3d9/nf-d3d9-idirect3dswapchain9-present). I did not look decisively into all the native calls behind D3DRTTexture#readPixels.
>>
>> As I said I will post the results (prism.verbose output) for the 2 versions later as a base for discussions.
>
> As I also worked on/checked these classes in https://github.com/openjdk/jfx/pull/1200, I now have a much better understanding of the code (and the communication with Direct3D 9) and agree, this looks like the right thing to do in this situation.
Thank you @Maran23 for taking a look into this! Sadly it did not work out as expected.
Our observations with the current proposal ("Version 1"): When the error occurred, the state did not return to D3D_OK, but stayed at D3DERR_DEVICEHUNG. The D3DERR_DEVICEHUNG error message was printed over and over (for more than a day).
For "Version 2" (calling reinitialize directly after the error): One day we saw the Prism initialization text 3 times in a row and the app was running fine. As it happened during a holiday week, we are unsure whether there were 3 distinct errors or only one error that took 3 attempts/iterations. (We had turned application logging off so as not to pollute the console log.)
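To make the difference between the two versions concrete, here is a minimal sketch of the two decision rules. This is not the Prism code: the class, interface, and method names are hypothetical, and the sketch only counts how often each rule would ask for a reinitialize over a given sequence of device states. The D3DERR_DEVICEHUNG value is taken from the Direct3D 9 headers as far as I know them.

```java
import java.util.List;

// Hypothetical model of the two variants; not the actual D3D pipeline code.
final class HungRecoverySketch {
    static final int D3D_OK = 0x0;
    static final int D3DERR_DEVICEREMOVED = 0x88760870;
    static final int D3DERR_DEVICEHUNG = 0x88760874;

    /** Decides, per observed device state, whether to reinitialize. */
    interface Policy {
        boolean shouldReinitialize(int hr);
    }

    /** "Version 1": remember the hang, reinitialize on the first D3D_OK afterwards. */
    static final class DeferredPolicy implements Policy {
        private boolean isHung;

        @Override
        public boolean shouldReinitialize(int hr) {
            if (hr == D3DERR_DEVICEHUNG) {
                isHung = true;   // only store the information for later
                return false;
            }
            if (hr == D3DERR_DEVICEREMOVED) {
                return true;
            }
            if (hr == D3D_OK && isHung) {
                isHung = false;  // device reports OK again after a hang
                return true;
            }
            return false;
        }
    }

    /** "Version 2": treat DEVICEHUNG like DEVICEREMOVED and reinitialize at once. */
    static final class ImmediatePolicy implements Policy {
        @Override
        public boolean shouldReinitialize(int hr) {
            return hr == D3DERR_DEVICEREMOVED || hr == D3DERR_DEVICEHUNG;
        }
    }

    /** Counts how many reinitialize requests a policy issues for a state sequence. */
    static int countReinitializations(Policy policy, List<Integer> states) {
        int count = 0;
        for (int hr : states) {
            if (policy.shouldReinitialize(hr)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A device that hangs and never reports D3D_OK again.
        List<Integer> stuck = List.of(D3D_OK, D3DERR_DEVICEHUNG, D3DERR_DEVICEHUNG, D3DERR_DEVICEHUNG);
        System.out.println("Version 1: " + countReinitializations(new DeferredPolicy(), stuck));
        System.out.println("Version 2: " + countReinitializations(new ImmediatePolicy(), stuck));
    }
}
```

If the state never returns to D3D_OK, the deferred rule never fires, which matches what we saw with "Version 1". The sketch does not model whether a reinitialize actually clears the hung state.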
It is especially unfortunate that the current proposal did not work out, because it means that either we did not correctly analyze what happened or the error behavior is not that predictable. Either way, we are now re-running our workaround version with the 5-minute loop (see bug ticket), this time with timestamps, and "Version 2", also with additional timestamp information.
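As an illustration of how timestamps could help tell "3 distinct errors" from "one error that took 3 attempts", here is a small sketch that groups reinitialization times into incidents. The helper name and the gap threshold are assumptions made for illustration only; this is not part of the PR or of our workaround.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical log-analysis helper; not part of the PR.
final class ReinitIncidentSketch {

    /**
     * Counts incidents in a list of reinitialization timestamps sorted in
     * ascending order. A timestamp opens a new incident when it is more
     * than maxGap after the previous one; otherwise it is treated as a
     * further attempt within the same incident.
     */
    static int countIncidents(List<Instant> reinitTimes, Duration maxGap) {
        int incidents = 0;
        Instant previous = null;
        for (Instant t : reinitTimes) {
            if (previous == null || Duration.between(previous, t).compareTo(maxGap) > 0) {
                incidents++;
            }
            previous = t;
        }
        return incidents;
    }
}
```

Which gap counts as "the same incident" is a judgment call; the value would have to be chosen from the timestamps actually observed.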
We are going to rerun both versions multiple times to get somewhat reliable information. I'll post the findings here.
-------------
PR Comment: https://git.openjdk.org/jfx/pull/1199#issuecomment-1704297025