[Rev 03] RFR: 8217472: Add attenuation for PointLight

Fri Apr 17 16:04:23 UTC 2020

On Wed, 15 Apr 2020 20:59:40 GMT, Kevin Rushforth <kcr at openjdk.org> wrote:

>> Here are the results on Phil's machine, which is a MacBook Pro with a graphics accelerator (Nvidia, I think).
>> 
>> Without the patch:
>> 2000 quads average 8.805 fps
>> 
>> With the patch:
>> 2000 quads average 4.719 fps
>> 
>> Almost a 2x performance hit.
>
> Conclusion: The new shaders that support attenuation don't seem to have much of a performance impact on machines with
> an Intel HD, but on systems with a graphics accelerator, it is a significant slowdown.
> So we are left with the two choices of doubling the number of shaders (that is, a set of shaders with attenuation and a
> set without) or living with the performance hit (which will only be a problem on machines with a dedicated graphics
> accelerator for highly fill-limited scenes). The only way we can justify a 2x drop in performance is if we are fairly
> certain that this is a corner case, and thus unlikely to hit real applications.  If we do end up deciding to replicate
> the shaders, I don't think it is all that much work. I'm more worried about how well it would scale to subsequent
> improvements, although we could easily decide that for, say, spotlights attenuation is so common that you wouldn't
> create a version that doesn't do that.  In the D3D HLSL shaders, ifdefs are used, so the work would be to restore the
> original code and add the new code under an ifdef. Then double the number of lines of gradle (at that point, I'd do it
> in a for-each loop), then modify the logic that loads the shaders to pick the right one.  For GLSL, the different parts
> of the shader are in different files, so it's a matter of creating new versions of each of the three lighting shaders
> that handle attenuation and choosing the right one at runtime.
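For reference, the runtime selection described above could look roughly like the sketch below. It is only an
illustration: the class, the method names and the shader naming scheme are hypothetical and are not the existing
loader code. It assumes that the three lighting shaders correspond to 1, 2 and 3 lights, and that a light with
coefficients (1, 0, 0) counts as unattenuated.

```java
/** Hypothetical sketch of choosing between attenuated and non-attenuated shader variants. */
public final class ShaderVariantSketch {

    private ShaderVariantSketch() { }

    /**
     * A light needs the attenuation variant only if its coefficients differ
     * from (constant = 1, linear = 0, quadratic = 0), which is assumed here
     * to mean "no attenuation".
     */
    public static boolean needsAttenuation(double ca, double la, double qa) {
        return ca != 1.0 || la != 0.0 || qa != 0.0;
    }

    /**
     * Builds a shader name for the given light count and variant.
     * The naming scheme ("main1Light", "main1Light_attn", ...) is invented
     * for this example.
     */
    public static String shaderName(int lightCount, boolean attenuated) {
        if (lightCount < 1 || lightCount > 3) {
            throw new IllegalArgumentException("lightCount must be 1..3: " + lightCount);
        }
        return "main" + lightCount + "Light" + (attenuated ? "_attn" : "");
    }
}
```

With this scheme, a scene whose lights all keep the assumed default coefficients would continue to use the original
shaders, so only scenes that actually use attenuation would pay for the extra work.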

I discussed this with a graphics engineer. He said that a couple of branches have no real performance impact, even on
modern mobile devices, and that, for example, on iOS 7, using half floats instead of floats improved shader execution
dramatically. Desktops with NVIDIA or AMD cards, and even modern Intel cards, can process dozens of branches with no
significant performance degradation.

He actually suggested having all the light types in a single shader file (looking ahead here). He also suggested not
permuting the shaders on the number of lights, and instead passing that number in as a uniform and looping over it. The
permutations on the bump, specular and self-illumination components are correct (I'm not sure why we are not doing that
for the diffuse component). If we later add shadows, which is not on my near-term to-do list, then we should permute
there.
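To make the "pass the count as a uniform and loop over it" idea concrete, here is a rough CPU-side sketch in Java of
what the shader loop would compute. It is not the shader code from the patch: the class, the Light type and the method
names are hypothetical, and the attenuation term uses the conventional 1 / (ca + la*d + qa*d^2) form, which I am
assuming matches what the patch does.

```java
/** Hypothetical CPU-side sketch of a per-fragment loop over a runtime light count. */
public final class LightLoopSketch {

    private LightLoopSketch() { }

    /** Illustrative light: a position plus constant, linear and quadratic attenuation coefficients. */
    public static final class Light {
        final double x, y, z;
        final double ca, la, qa;

        public Light(double x, double y, double z, double ca, double la, double qa) {
            this.x = x; this.y = y; this.z = z;
            this.ca = ca; this.la = la; this.qa = qa;
        }
    }

    /** Conventional attenuation factor; (ca, la, qa) = (1, 0, 0) yields 1.0, i.e. no attenuation. */
    public static double attenuation(Light light, double dist) {
        return 1.0 / (light.ca + light.la * dist + light.qa * dist * dist);
    }

    /**
     * Sums the attenuation factors of the first lightCount lights at point (px, py, pz).
     * lightCount plays the role of the uniform: one loop replaces one shader per count.
     */
    public static double totalAttenuation(Light[] lights, int lightCount,
                                          double px, double py, double pz) {
        double sum = 0.0;
        for (int i = 0; i < lightCount; i++) {
            Light l = lights[i];
            double dx = l.x - px;
            double dy = l.y - py;
            double dz = l.z - pz;
            double dist = Math.sqrt(dx * dx + dy * dy + dz * dz);
            sum += attenuation(l, dist);
        }
        return sum;
    }
}
```

Under these assumptions, a light with coefficients (1, 0, 0) goes through the same code path and simply gets a factor
of 1.0, so the same loop covers both the attenuated and the non-attenuated case at the cost of one division per light.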

It also depends on our target hardware. If we take into account hardware from, say, 2005, then branching may cause a
significant performance loss, but catering to that hardware hinders our ability to improve performance on newer
hardware. What is the policy here?

I have a Win10 laptop with a GeForce 610M that I will test on this weekend to see whether mobile NVIDIA cards have an
issue.

-------------

PR: https://git.openjdk.java.net/jfx/pull/43