RFR(L): JDK-8046936 : JEP 270: Reserved Stack Areas for Critical Sections
Frederic Parain
frederic.parain at oracle.com
Fri Nov 20 14:23:07 UTC 2015
David,
On 17/11/2015 02:00, David Holmes wrote:
>> Another issue is that writing a rule for javac that is not
>> overpessimistic is also an intractable problem. The pattern
>> "atomic operation followed by method invocation to complete
>> status update" is too general to be the trigger of the annotation.
>> 1) It would lead to an excessive number of methods being annotated,
>> and by consequence an over-sizing of the reserved area.
>> 2) Another condition to hit the bug is that the stack space of
>> the callee method must be bigger than the caller method with
>> the atomic operation. This information depends heavily on runtime
>> information like HW, compilation policies (inlining) and execution
>> profile (to know which methods are going to be compiled first).
>> javac won't have access to these information to annotated methods.
>
> Right - there is no way to know that the atomic operation followed by
> the call should really be a single atomic operation that we can't
> implement that way. Given many atomic operations have subsequent code
> actions, the pattern would degenerate to simply involve the atomic
> operation, and that wouldn't work well at all.
>
> Further, it isn't always that pattern. Consider for example
> ReentrantReadWriteLock.sync.tryRelease():
>
> int nextc = getState() - releases;
> boolean free = exclusiveCount(nextc) == 0;
> if (free)
> setExclusiveOwnerThread(null);
> setState(nextc);
>
> here we clear the owner first, but then may fail to set the new state.
> Actually detecting where a stackoverflow can lead to inconsistent
> state requires detailed scrutiny of the code. I would like to be able
> to reason that setState requires no more stack than
> setExclusiveOwnerThread, but I can't do that simply by code inspection.
Exactly. The inspection of method attribute in the class file doesn't
give reliable
information. It only provides a hint about the stack space required
when the method
is interpreted. The stack space required once the method has been JIT
compiled
is fully platform dependent, no way to reason with it at the Java source
level.
>>> Perhaps we should simply start with j.u.c.locks as the initial
>>> candidate
>>> sets of classes and work out from there. The idiomatic Lock usage:
>>>
>>> l.lock();
>>> try {
>>> // ...
>>> } finally {
>>> l.unlock();
>>> }
>>>
>>> epitomizes the critical-finally-block scenario (though the lock()
>>> problem is much more subtle). AQS is too low-level to flag any specific
>>> function I think; and the other synchronizers perhaps too high-level,
>>> with fewer idiomatic use-cases that obviously benefit from this.
>>>
>>> In that regard I don't agree with where Fred has currently placed the
>>> annotation in ReentrantLock and AQS. The placements appear arbitrary at
>>> first glance - though no doubt they are the result of a deep and
>>> careful
>>> analysis. But it is far from obvious why the annotation is placed on
>>> NonfairSync.lock but FairSync.tryAcquire(), for example.
>>
>> The annotation is used on methods with the problematic pattern
>> "atomic operation followed by method invocation".
>> Their execution could lead to data corruption if atomic operation
>> is executed successfully but the following method invocation
>> fails because of SOE.
>>
>> In the NonFairSync class, this pattern is in the lock() method,
>> while the tryAcquire() is only a method invocation. So lock()
>> is annotated and tryAcquire() is not. Note that the method
>> invoked by try acquire is nonfairTryAcquire() which has the
>> problematic pattern and is annotated.
>>
>> In the FairSync class, the situation is reversed: lock() is
>> just an method invocation and it is not annotated, and
>> tryAcquire() has the problematic pattern and is annotated.
>>
>> I tried to put the annotation as close as possible to the
>> critical sections in order to minimize the size requirement
>> for the reserved area.
>
> Okay I now understand the rule you were applying. But it isn't obvious
> from the code. Further, there seem to be other places which could also
> suffer serious problems. In AQS doReleaseShared() we have:
>
> 720 if (!h.compareAndSetWaitStatus(Node.SIGNAL, 0))
> 721 continue; // loop to recheck
> cases
> 722 unparkSuccessor(h);
>
> which could seemingly throw SOE without unparking the successor.
>
> Any compound action that logically forms a "transaction" would need to
> be immune to SOE.
>
I agree but describing all patterns that logically forms a"transaction"
is far
above the goal of this JEP. The reserved stack area is a mechanism that
can be used to mitigate effects of stack overflows when their occur in these
"transaction". It is obvious that there's to automatic way to annotate all
critical sections or "transactions". This is why I only annotated the code
relative to ReentranLocks (which was the initial bug) and delayed the
annotation of JDK (or j.u.c) critical sections to a future work. As you
noticed above, it is important to have a deep understanding of the code
in order to identify which sections of code should be annotated.
Note that the annotation doesn't make the code "immune" to SOE.
The SOE still happens, it just delayed until the end of the annotated
section in order to mitigate the damages.
>>> I would be tempted to place them on all the public lock/unlock methods
>>> of the Lock implementations, rather than on the internal AQS-derived
>>> functions, and AQS itself.
>>
>> I've tried that :-) The amount of code being executed with
>> the ReservedStackAccess privilege tends to increase very
>> quickly and I was concerned about keeping the size of the
>> reserved area small.
>
> How much space does each level of calling need? I know that is hard to
> answer but are we talking a few bytes, few kb, many kb?
>
The few sections I've investigated and dis-assembled were small, far
less than a few kb. However, some sections I've tried to annotate
because they looked critical were accessed through different paths,
or with different entry points. My expertise on the java.util.concurrent
code is limited, and I was not confident I could put the annotation on
the more appropriate places.
If you feel confident you could make this analysis and annotation work,
I think it will be a very valuable addition to this JEP work.
Thanks,
Fred
> I'd be looking at expanding the current application of the annotation
> to cover all affected areas of AQS, AQLS, ReentrantLock and
> ReeentrantReadWriteLock. StampedLock is a bit more problematic - there
> it seems we do need to annotate the public API - but if we do it then
> j.u.c.locks is at least covered.
>
> Thanks,
> David
>
--
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: Frederic.Parain at oracle.com
More information about the core-libs-dev
mailing list