Pausing Quantum Renderer

Mon Nov 23 19:01:37 UTC 2015

I tend to agree that the current proposed solution seems minimally 
intrusive. The only other thought that I have is wondering whether we 
could treat it more like the case of a lost surface on Windows. This 
seems very similar to that case.

-- Kevin

David Hill wrote:
> On 11/23/15, 11:43 AM, Morris Meyer wrote:
>> As the original author of the Quantum toolkit and the renderer, this 
>> sort of addition goes against what I had in mind when designing the 
>> PaintCollector and the renderer. As the renderer is built around a 
>> ThreadPoolExecutor, stopping system functionality for an edge case is 
>> putting the cart before the horse.
>>
>> When a Window is miniaturized or set to zero size, or moved 
>> offscreen, there should be no pulses fired at the Window.
> My concern with setting the window size to zero would be any 
> additional work that is done. If there is a layout to go to 0,0 and 
> then another layout when we go to the new real size then that is not 
> something I like.
>
> Pausing the processing of the rendering as has been proposed here 
> seems like one way to be minimally intrusive on a compute constrained 
> devices.
>
> Another options would be to set the window visibility.
>
> A real concern regardless of what is done is the potential problem of 
> multithreaded issues.
>
> Skipping render pulses for a period seems pretty safe from a 
> multithreaded point of view. It also seems to be the least likely to 
> cause a lot of throw away work.
>
> Another thought, setting a window to visible(false) at least takes it 
> off the rendering queue 
> (PaintCollector.getInstance().removeDirtyScene(this);). So in theory, 
> walking through the window list and marking all of them as not 
> visible. The problem I have with any approach along these lines is the 
> risk that state will be confused or lost, or that we do a lot of 
> additional work in a place where we are already going to cause work as 
> a new graphics context and display size are being dropped in. Even 
> something like setVisible(false) has a lot of notification work 
> associated with it.
>
> Given the above, I would tend to stick with the proposed solution.
>
> Dave
>
>
>>
>> This seems more like an issue of ensuring that if the window is 0x0 
>> that it is not considered dirty, and if there is no dirty scene that 
>> nextPulseRequested() is never called.
>>
>> There does need to be work done on Quantum to ensure that it cycles 
>> down to no CPU usage when windows are hidden and/or miniaturized on 
>> battery operated devices. That needs to be done cleanly, but even 
>> then pausing the ThreadPoolExecutor seems to be the wrong way of 
>> going about it.  The TPE model is more startup, work, then shutdown, 
>> and the QuantumToolkit intermediates JavaFX application state with 
>> that model.
>>
>> Best regards,
>>
>>         --morris meyer
>>
>> On 11/22/15 6:24 AM, Johan Vos wrote:
>>> I implemented this in the javafxports clone of the OpenJFX 8u-dev 
>>> repo, and
>>> the diff is here:
>>>
>>> https://bitbucket.org/javafxports/8u-dev-rt/commits/67a0fded8208095bd04efda6045aa257e245d6bc 
>>>
>>>
>>> I am more than happy to create an issue in the openjdk bug system
>>> (enhancement?) and provide a patch there as well, but I think it 
>>> needs a
>>> bit more discussion first?
>>>
>>> - Johan
>>>
>>> On Sat, Nov 21, 2015 at 9:23 PM, Johan Vos <johan.vos at gluonhq.com> 
>>> wrote:
>>>
>>>> I have a working implementation that needs more stress-testing on
>>>> different platforms, but it seems a good and easy solution so far.
>>>> I have this on QuantumToolkit:
>>>>
>>>>      @Override
>>>>      public void pauseRenderer(){
>>>>          Application.invokeAndWait(() -> this.pause = true);
>>>>          PaintCollector.getInstance().waitForRenderingToComplete();
>>>>      };
>>>>
>>>>      public void resumeRenderer(){
>>>>          Application.invokeAndWait(() -> this.pause = false);
>>>>      };
>>>>
>>>> pause is a boolean that is checked for in
>>>> void pulse(boolean collect) { ... }
>>>>
>>>> The difficulty I mentioned in my previous mail (how do we know 
>>>> there are
>>>> no renderJobs pending/running) was solved easily as there exists this
>>>> PaintCollector.waitForRenderingToComplete method.
>>>> This might make the pauseRenderer a bit slower, and maybe this is not
>>>> needed in all usecases. In that case, we can remove it from the
>>>> pauseRenderer() and we can add it either in the Monocle 
>>>> implementation that
>>>> will call pauseRenderer, or in a Android/iOS specific code.
>>>>
>>>> However, it seems to me that if you want to pause the renderer, you 
>>>> most
>>>> often want to make sure no one is still writing to the glSurface 
>>>> after the
>>>> pauseRenderer method is called, so I think it makes sense to keep 
>>>> it there?
>>>>
>>>> - Johan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Nov 20, 2015 at 8:14 AM, Johan Vos <johan.vos at gluonhq.com> 
>>>> wrote:
>>>>
>>>>> I didn't plan to intercept or cancel pending/submitted jobs, but I 
>>>>> have
>>>>> to wait until they are done before the android callback method 
>>>>> returns.
>>>>>
>>>>> When Android is about to destroy the context, it will call the
>>>>> surfaceTextureDestroyed method on the Activity (the FXActivity in our
>>>>> case). As long as that method doesn't return, the context won't be
>>>>> destroyed. But once that method returns, the context might become 
>>>>> invalid
>>>>> any moment. So if there are still jobs that want to do a 
>>>>> swapBuffer after
>>>>> we return, those can crash or (even worse) corrupt the egl system.
>>>>>
>>>>> So it seems to me inside the implementation of 
>>>>> surfaceTextureDestroyed,
>>>>> we need to achieve 2 things:
>>>>> 1. make sure no new pulses are generated.
>>>>> 2. don't return while the QuantumRenderer is still executing jobs.
>>>>>
>>>>> Those 2 things can be combined in a single Toolkit.pauseRenderer() 
>>>>> but it
>>>>> might be better to only achieve the first task in 
>>>>> Toolkit.pauseRenderer().
>>>>> The contract for this method is than that no new pulses will be
>>>>> generated, but existing renderJobs might still be running when 
>>>>> this method
>>>>> returns.
>>>>> The second thing, waiting for the renderJobs to be finished, can 
>>>>> be done
>>>>> in the Android specific implementation.
>>>>>
>>>>> - Johan
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 19, 2015 at 10:20 PM, Kevin Rushforth <
>>>>> kevin.rushforth at oracle.com> wrote:
>>>>>
>>>>>> This might be a tricky semantic to achieve, and great care will be
>>>>>> needed to ensure no deadlocks or race conditions. Not running any 
>>>>>> more
>>>>>> pulses after this method returns seems fine, but it might be 
>>>>>> better to let
>>>>>> any existing renderJobs run (possibly discarding the results). 
>>>>>> This way you
>>>>>> could send the pause message to the renderer as a special 
>>>>>> renderJob and not
>>>>>> have to worry about jobs that are scheduled but not yet run.
>>>>>>
>>>>>> -- Kevin
>>>>>>
>>>>>>
>>>>>>
>>>>>> Johan Vos wrote:
>>>>>>
>>>>>>> After some experiments, here is my current thinking:
>>>>>>>
>>>>>>> Toolkit can have 2 new methods:
>>>>>>> pauseRenderer()
>>>>>>> resumeRenderer()
>>>>>>>
>>>>>>> When pauseRenderer is called, it should be guaranteed that after 
>>>>>>> this
>>>>>>> call,
>>>>>>> no new pulses are fired until resumeRenderer is called.
>>>>>>> That is not hard, but it is not enough. Before we pause the 
>>>>>>> pulses, the
>>>>>>> previous pulse probably submitted a renderJob to Prism, executed 
>>>>>>> on the
>>>>>>> QuantumRenderer ThreadPoolExecutor. That job should run fine, as 
>>>>>>> the
>>>>>>> next
>>>>>>> pulse (when we're back) will call
>>>>>>> GlassScene.waitForRenderingToComplete().
>>>>>>> So we have to wait until there are no running or pending tasks 
>>>>>>> in the
>>>>>>> QuantumRenderer as well.
>>>>>>>
>>>>>>> - Johan
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 18, 2015 at 9:58 PM, David Hill <David.Hill at oracle.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 11/18/15, 3:45 PM, Johan Vos wrote:
>>>>>>>>
>>>>>>>> Johan,
>>>>>>>>      I think that it would be reasonable to put in something to 
>>>>>>>> Quantum
>>>>>>>> that causes the render loop to "pause", and then resume later. I
>>>>>>>> envision
>>>>>>>> this toggle as causing the render loop to skip, rather than 
>>>>>>>> tinkering
>>>>>>>> with
>>>>>>>> the pulses.
>>>>>>>>
>>>>>>>> When resume is called, it might be best to treat the world as 
>>>>>>>> dirty.
>>>>>>>>
>>>>>>>> Added to Toolkit, this would allow someone like Monocle to make 
>>>>>>>> the
>>>>>>>> toggles as is appropriate.
>>>>>>>>
>>>>>>>> If this works for you, perhaps you could prototype it ?
>>>>>>>>
>>>>>>>> regards,
>>>>>>>>     Dave
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Android, a JavaFX Application might transfer control to 
>>>>>>>>> another app
>>>>>>>>> (and
>>>>>>>>> become invisible) and enter the foreground later. In that 
>>>>>>>>> case, the
>>>>>>>>> GLSurface we are rendering on becomes invalid. In order to avoid
>>>>>>>>> problems
>>>>>>>>> and save battery, we want to pause the renderer thread, but this
>>>>>>>>> turns out
>>>>>>>>> to be more difficult than I expected.
>>>>>>>>>
>>>>>>>>> When our app transfers control, we get a callback from 
>>>>>>>>> Android. We
>>>>>>>>> intercept this in javafxports, and we set the Screen 
>>>>>>>>> width/height to
>>>>>>>>> 0/0
>>>>>>>>> as
>>>>>>>>> we don't want to render on the (invalid) surface while we lost
>>>>>>>>> control.
>>>>>>>>> When we regain control, we resize the Screen and the app renders
>>>>>>>>> again.
>>>>>>>>>
>>>>>>>>> The reason we set the width/height to 0/0 is because the
>>>>>>>>> PresentingPainter
>>>>>>>>> will call SceneState.isValid() and this returns false in case
>>>>>>>>> getWidth()
>>>>>>>>> or
>>>>>>>>> getHeight() are 0.
>>>>>>>>>
>>>>>>>>> However, SceneState extends PresentableState and it overrides the
>>>>>>>>> update
>>>>>>>>> method. It will call PresentatbleState.update() which will set 
>>>>>>>>> the
>>>>>>>>> viewWidth to the width of the new Screen (hence, 0) , but 
>>>>>>>>> after that
>>>>>>>>> it
>>>>>>>>> overwrites the viewWidth with camera.getViewWidth(). The 
>>>>>>>>> latter still
>>>>>>>>> contains the old value. A quick inspection shows that
>>>>>>>>> camera.setViewWidth()
>>>>>>>>> is called when validate(...) is called on NGDefaultCamera, 
>>>>>>>>> which is
>>>>>>>>> called
>>>>>>>>> by ES2Context.updateRenderTarget() which happens during 
>>>>>>>>> rendering,
>>>>>>>>> hence
>>>>>>>>> *after* the PresentingPainter checks if the width is 0.
>>>>>>>>>
>>>>>>>>> So immediately after we set the width of the Screen to 0 (on 
>>>>>>>>> the FX
>>>>>>>>> App
>>>>>>>>> Thread), a Pulse happens, and this one still things the screen 
>>>>>>>>> is the
>>>>>>>>> original size. While the pulse is happening, the android system
>>>>>>>>> destroys
>>>>>>>>> our context, and the rendering fails. Moreover, the EGL system 
>>>>>>>>> is in a
>>>>>>>>> unpredictable state (recreating the surface fails).
>>>>>>>>>
>>>>>>>>> A very dirty workaround for this is to wait for 1 pulse (with 
>>>>>>>>> the new
>>>>>>>>> pulselistener API this should be possible) before we return 
>>>>>>>>> from the
>>>>>>>>> callback method called by Android when the surface is about to be
>>>>>>>>> destroyed. That way, we will have 1 bogus rendering on an 
>>>>>>>>> existing
>>>>>>>>> (but
>>>>>>>>> about-to-be-destroyed) surface.
>>>>>>>>>
>>>>>>>>> But it would be better if there is some way to tell the quantum
>>>>>>>>> renderer
>>>>>>>>> to
>>>>>>>>> immediately stop rendering. Existing pulses are no problem, as 
>>>>>>>>> the
>>>>>>>>> renderLock guarantuees that we set the size to 0 only when no 
>>>>>>>>> other
>>>>>>>>> thread
>>>>>>>>> (quantum renderer) has a lock on the renderLock.
>>>>>>>>>
>>>>>>>>> - Johan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> -- 
>>>>>>>> David Hill<David.Hill at Oracle.com>
>>>>>>>> Java Embedded Development
>>>>>>>>
>>>>>>>> "A man's feet should be planted in his country, but his eyes 
>>>>>>>> should
>>>>>>>> survey
>>>>>>>> the world."
>>>>>>>> -- George Santayana (1863 - 1952)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>
>
>