RFR: bug: Timely Reducing Unused Committed Memory

Stefan Johansson stefan.johansson at oracle.com
Wed Sep 19 08:37:31 UTC 2018


Hi Rodrigo,

I pasted your reply here to keep the discussion in one thread.

> I understand that it is hard to define what is idle. However, if we require the
> user to provide one, I guess that most regular users that suffer from the problem
> that this patch is trying to solve will simply not do it because it requires knowledge
> and effort. If we provide an idle check that we think will benefit most users, then 
> we are probably helping a lot of users. For those that the default idle check is 
> not good enough, they can always disable this idle check and implement the idle
> check logic in an external tool.
> 
I agree, if we can find a solution that benefits most users, we should 
do it. And this is why I would like to hear from more users if this 
would benefit their use cases. Another thing that I don't fully 
understand is why the flags are manageable if there isn't supposed to be 
some external logic that sets them?
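
For reference, a manageable flag can be changed at runtime without restarting the VM, for example through the HotSpot diagnostic MXBean (or "jinfo -flag" against a running process). A minimal sketch, assuming the flag keeps the GCFrequency name from the patch and that 0 disables the periodic GC; the real name and value semantics would of course come from the final change:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class ToggleGCFrequency {
        public static void main(String[] args) throws Exception {
            // In-process for brevity; an external tool would attach over JMX
            // or use "jinfo -flag" instead.
            HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

            // Flip the manageable flag, e.g. disable the periodic GC during
            // high load and re-enable it afterwards.
            diag.setVMOption("GCFrequency", args.length > 0 ? args[0] : "0");
        }
    }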

> We can also change the semantics of "idleness".  Currently it checks the load.
> I think that checking the allocation rate might be another good option (instead of
> load). The only corner case is an application that does not allocate but consumes 
> a lot of CPU. For this case, we might only trigger compaction at most once because, 
> as it does not allocate memory, we will not get over-committed memory (i.e., the other 
> checks will prevent it). The opposite is also possible (almost idle application that allocates 
> a lot of memory) but in this scenario I don't think we want to trigger an idle compaction.
> 

This is my main problem when it comes to determining "idleness": for some 
applications the allocation rate will be the correct metric, for others it 
will be the load, and for a third group something else entirely. It feels 
like it is always possible to come up with a case that needs something different.
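
Just to make that concrete, a load-based idle check implemented outside the VM would look roughly like the sketch below (the threshold and check interval are made up for illustration, and an allocation-rate-based check would need entirely different inputs, which is exactly the problem):

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;

    public class LoadBasedIdleCheck implements Runnable {
        private static final double IDLE_LOAD_THRESHOLD = 0.1; // illustrative
        private static final long CHECK_INTERVAL_MS = 60_000;  // illustrative

        @Override
        public void run() {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            while (!Thread.currentThread().isInterrupted()) {
                // System-wide load average over the last minute;
                // returns -1.0 on platforms where it is not supported.
                double load = os.getSystemLoadAverage();
                if (load >= 0 && load < IDLE_LOAD_THRESHOLD) {
                    // The process looks idle by this metric; request a full GC
                    // so the collector gets a chance to shrink the heap.
                    System.gc();
                }
                try {
                    Thread.sleep(CHECK_INTERVAL_MS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }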

> Having said that, I am open to change this flag or even remove it as it is one of the
> hardest to get right.
> 

As I said before, to me it feels like just having a periodic GC interval 
flag that is manageable would be a good start. Maybe add a constraint 
that the periodic GC only occurs if no other GCs have happened during 
the interval. Could you explain how your use case would suffer from such 
a limitation?
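
To make the constraint concrete, here is a rough sketch of that policy as seen from outside the VM, using the standard GC MXBeans (inside the VM the same check would use the internal GC counters, and the interval handling is left to whatever schedules the check):

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class PeriodicGCIfQuiet {
        private long lastTotalCollections = totalCollections();

        private static long totalCollections() {
            long total = 0;
            for (GarbageCollectorMXBean gc :
                     ManagementFactory.getGarbageCollectorMXBeans()) {
                total += gc.getCollectionCount();
            }
            return total;
        }

        // Called once per periodic GC interval, e.g. from a scheduled executor.
        public void tick() {
            long current = totalCollections();
            if (current == lastTotalCollections) {
                // No GC happened during the interval, so trigger one to give
                // unused committed memory a chance to be returned.
                System.gc();
                current = totalCollections();
            }
            lastTotalCollections = current;
        }
    }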

Thanks,
Stefan

> cheers,
> rodrigo


On 2018-09-13 14:30, Stefan Johansson wrote:
> Hi Rodrigo,
> 
> Sorry for being a bit late into the discussion. We've had some internal 
> discussions and realized that there are some questions that I need to 
> bring up here.
> 
> I'm trying to better understand under what circumstances this feature is 
> to be used and how a user should use the different flags to tweak it to 
> their use case. To me it feels like GCFrequency would be enough to make 
> sure that the VM returns memory on a timely basis. And if the flag is 
> managed, it can be controlled to not do periodic GCs during high load. 
> With that we get a good way to periodically try to reduce the committed 
> heap.
> 
> The reason I ask is because I have a hard time seeing how we can 
> implement a generic policy for when the system is idle. A policy that 
> will apply well to most use cases. For some cases having the flags you 
> propose might be good, but for others there might be a different set of 
> options needed. If this is the case then maybe the logic and policy of 
> when to do this can live outside the VM, while the code to periodically 
> do GCs lives within the VM. What do you think about that? I understand 
> the problems you've stated with having the policy outside the VM, but 
> at least we have more information to act on there.
> 
> We know that many have asked for features similar to this one and it 
> would be nice to get input from others on this to make sure we implement 
> something that benefits the whole user base as much as possible. So 
> anyone with a use case that could benefit from this, please chime in.
> 
> Regards,
> Stefan
> 
> 
> 
> On 2018-09-07 17:37, Rodrigo Bruno wrote:
>> Hi Per and Thomas,
>>
>> thank you for your comments.
>>
>> I think it is possible to implement this feature using the service 
>> thread or using a separate thread.
>> I see some pros and cons of having a separate thread:
>>
>> Pros:
>> - using the service thread exposes something that is G1-specific to 
>> the rest of the JVM; using a separate thread hides this feature from 
>> the outside.
>>
>> Cons:
>> - Having a manageable timeout is a bit trickier to implement in a 
>> separate/dedicated thread. We need to be able to handle switching it 
>> on and off, which might require some variable polling.
>> - It requires some more memory.
>>
>> Regardless of the path taken, I can prepare a new version of the patch 
>> whenever we decide on this.
>>
>> cheers,
>> rodrigo
>>
>> Per Liden <per.liden at oracle.com> wrote on Friday, 07/09/2018 at 11:58:
>>
>>     Hi Thomas,
>>
>>     On 09/07/2018 10:10 AM, Thomas Schatzl wrote:
>>     [...]
>>      >    overnight I thought a bit about the implementation, and given
>>      > the problem with heap usage of the new thread, and the requirement
>>      > of being able to turn that feature on/off via a managed variable,
>>      > the best change would probably be to reuse the service thread as
>>      > you did in the initial change.
>>
>>     I'm not convinced that this should be handled outside of G1. If
>>     there's a need to have the flag manageable at runtime (is that really
>>     the case?), you could just always start the G1DetectIdleThread and
>>     have it check the flag. I wouldn't worry too much about the memory
>>     overhead for the stack.
>>
>>     cheers,
>>     Per
>>


