RFR(L): JDK-8046936 : JEP 270: Reserved Stack Areas for Critical Sections

Tue Nov 24 18:16:40 UTC 2015

On Nov 24, 2015, at 8:46 AM, Karen Kinnear <karen.kinnear at oracle.com> wrote:

> Doug,
> 
> I have been thinking about this more from the perspective of the original problem
> we set out to solve

I apologize if this has already been considered -- but for a lot of well-designed systems,
occasional application failure is an expected fact of life, and we design our HA around
it with automatic restarts and monitoring.

If it is so hard to detect or recover from a stack overflow, maybe one useful
mitigation for the awful outcomes (j.u.c. hangs, corrupt state, lost locks) would be to
let users treat a stack overflow as a fatal condition, much as many deployments
already treat OutOfMemoryError.

In fact, we configure all of our production servers with the moral equivalent of
-XX:OnOutOfMemoryError="kill -9 %p"
because once we are in a possibly inconsistent state, we would much rather nuke
it from orbit and start over.

Maybe introducing some new options, like
-XX:OnStackOverflowError=
or
-XX:TreatStackOverflowAsOOM (piggyback on the existing tunable above)
would allow end users to avoid the really bad behavior in a controllable way?
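
Neither of those flags exists today, so as a rough illustration of the behavior
I have in mind, here is an application-level approximation. It is only a sketch:
the class name FatalStackOverflow and its methods are hypothetical, and it only
sees a StackOverflowError that propagates uncaught out of a thread. If some
library code catches and swallows the error, this handler never runs, which is
one reason a VM-level option might be preferable.

```java
public final class FatalStackOverflow {
    private FatalStackOverflow() {}

    /** True if t, or any throwable in its cause chain, is a StackOverflowError. */
    public static boolean isFatal(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 64; depth++, t = t.getCause()) {
            if (t instanceof StackOverflowError) {
                return true;
            }
        }
        return false;
    }

    /** Installs a process-wide handler; intended to be called once, early in main(). */
    public static void install() {
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (isFatal(error)) {
                System.err.println("Fatal StackOverflowError in thread "
                        + thread.getName() + "; halting the JVM");
                // halt() skips shutdown hooks, so nothing runs against
                // possibly inconsistent state; the supervisor restarts us.
                Runtime.getRuntime().halt(1);
            }
            if (previous != null) {
                previous.uncaughtException(thread, error);
            } else {
                error.printStackTrace();
            }
        });
    }
}
```

By the time the handler runs, the overflowing thread's stack has already unwound,
so the handler itself has room to execute. Any locks or data structures left
inconsistent on the way out stay that way, which is why it halts rather than
attempting cleanup.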

> , which was identified in the concurrent hash map usage, at the
> time in the class loading logic. While the class loading logic has changed, I think we
> have enough experience with this particular example and have studied
> the code constructs sufficiently that there is value in checking in the small set of 
> JDK changes that target that situation. I also think this gives a sample of
> the kind of model in which this approach can be effective. In addition, having this small set of
> changes provides the ability to test and ensure that the hotspot changes continue to
> work.
> 
> So I would like to recommend that we go ahead and check in the hotspot changes
> and the initial minimal set of j.u.c. updates as a way to put the new mechanism
> in place so that the people with more domain expertise in the java.util.concurrent
> libraries can experiment with the mechanism and add incremental improvements.
> 
> thanks,
> Karen
> 
>> On Nov 22, 2015, at 7:04 PM, Doug Lea <dl at cs.oswego.edu> wrote:
>> 
>> On 11/20/2015 12:40 PM, Karen Kinnear wrote:
>>> Totally appreciate the suggestion that the java.util.concurrent modifications
>>> be done by folks with more domain expertise.
>>> 
>>> Would you have us incorporate the initial minimal set of j.u.c. updates or none
>>> at all?
>> 
>> Sorry that I'm still in foot-drag mode on this.
>> Reading David and Fred's exchanges reinforces my thoughts
>> that there is no defensible rule or approach for
>> using @ReservedStackAccess that adds as little time and
>> space as possible while reducing the occurrence of stuck
>> resources as much as possible during StackOverflowError.
>> 
>> After googling "StackOverflowError java util concurrent" and seeing the
>> range of situations that can be encountered, I don't even know
>> which kinds of constructions to target.
>> And I'm less sure whether using @ReservedStackAccess at all
>> is better than doing nothing.
>> 
>> Maybe there is some decent empirical strategy, but I can't
>> tell until hotspot support of @ReservedStackAccess is in place.
>> So my vote is still to keep the JDK changes out for now.
>> 
>> -Doug
>> 
>