RFR: 8238954: Improve performance of tiled snapshot rendering

Tue Mar 10 06:08:02 UTC 2020

On Fri, 28 Feb 2020 18:06:30 GMT, Frederic Thevenet <github.com+7450507+fthevenet at openjdk.org> wrote:

>> I've marked this PR as a WIP for the time being because I've only tested it on the d3d rendering pipeline so far, so I
>> think it is too early to call for a formal review yet. Still, I welcome feedback if someone wants to have a look at it
>> already.
>> In a nutshell, this is what this PR does:
>> 
>> - Reverts changes from 8088198
>> - Implements tiling within `QuantumToolkit::renderToImage` instead
>> - Gets the maxTextureSize directly from the `ResourceFactory` instance instead of relying on
>>   `PrismSettings.maxTextureSize` (which may be wrong in the event the maxTexture size supported by the hardware is less
>>   than the clamped value set  in PrismSettings) Tries to align the PixelFomat for the target `WritableImage` with that of
>>   the `RTTexture` when possible, since converting IntARGB to ByteBGRA (or the other way round) is a major cost factor
>>   when copying large amounts of pixels. Avoids as much as possible having to resize the tile in between subsequent calls
>>   to `RTTexture::readPixels`, as it is another major cost factor, under the d3d rendering pipeline at least. (Native
>>   method `D3DResourceFactory_nReadPixels` attempts to reuse the same surface in between calls but is forced to create a
>>   new one if dimensions are not exactly the same, which is quite costly).
>> 
>> TODO:
>> 
>> - [x] Make sure chosen PixelFormat is optimum when using es2 pipeline.
>> - [ ] Consider using subtexture pixel read with es2 (SW and d3d don't support it) as a better alternative to relying on
>>   same size tiles to avoid surface thrashing.
>
> I finally got a chance to do some more extensive testing when running this patch with the es2 pipeline on Linux.
> It works as expected, and from what I saw, using a IntARGB pixelBuffer when no WritableImage is provided avoids
> swapping around pixel components, like under d3d. I've also verified than the patch's behaviour is correct when a
> writable image is provided as a parameter to Node.snapshot, whether the underlying image is actually a PlatformImage or
> a wrapper for a PixelBuffer, which corrects another limitation of the previous implementation.

On My windows10 machine, I can observe 20 to 30 % reduction in time to take snapshot.
Can you please capture the time the way you did [here](https://github.com/openjdk/jfx/pull/68#issuecomment-578192074)
for previous PR

-------------

PR: https://git.openjdk.java.net/jfx/pull/112