On tearing

Dan Heidinga heidinga at redhat.com
Wed Apr 27 18:32:07 UTC 2022


This is circling around the same root issues as the "Foo / Foo.ref
backward default" thread - which is really the question of when a
developer should pick a B3 over a B2.

Kevin's thought experiment in that thread seems to be approaching this
same idea from a different angle:
> (Thought experiment: if we had an annotation meaning "using the .val type is not a great idea for this class and you should get a compile-time warning if you do" .... would we really and I mean *really* even need bucket 2 at all?)

In that thread I suggested some rough rules that said B2s were
preferred in API signatures and B3.val are really about storage.
Kevin outlined a different set of rules on when to prefer B3.val for
APIs.

We originally split B2 out from B3 to support no-good-default values
(aka allow null), support atomicity and avoid tearing. Anything
missing in that list?

With B3, we relax some of the constraints required to guarantee the B2
invariants and this allows - but doesn't require! - the JVM to further
optimize the memory density.  A conforming JVM could implement all
B3's using an indirection and have them behave identically to B2s -
and JVMs will likely do so for large B3s or volatile fields.  B3s are
more akin to a hint than a promise.  Increasing the memory density
exposes tearing. The two are coupled from an implementation
perspective.
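
As a rough illustration of that coupling, here is a deterministic sketch in
today's Java.  It is not how any VM lays things out: FlatSlot and RefSlot
are hypothetical stand-ins for "flattened" and "indirected" storage, the
IntRange record is an ordinary record modeled on the example in the quoted
mail below, and the "race" is a hand-written interleaving rather than real
threads.

```java
// Simulates the difference between inline (flattened) storage, where a value
// is written one component at a time, and storage behind a reference, where a
// single reference store publishes the whole value.
public class TearingSketch {

    // Ordinary record standing in for the IntRange example; invariant: low <= high.
    public record IntRange(int low, int high) {
        public IntRange {
            if (low > high) throw new IllegalArgumentException("low > high");
        }
    }

    // Hypothetical model of a flattened field: components are stored separately.
    static final class FlatSlot {
        int low, high;
        void writeLow(IntRange r)  { low = r.low(); }
        void writeHigh(IntRange r) { high = r.high(); }
    }

    // Hypothetical model of an indirection: one store replaces the whole value.
    static final class RefSlot {
        IntRange ref;
        void write(IntRange r) { ref = r; }
    }

    // Two writers whose component writes interleave; returns {low, high} as read afterwards.
    public static int[] flatInterleaving() {
        IntRange a = new IntRange(5, 10), b = new IntRange(2, 4);
        FlatSlot s = new FlatSlot();
        s.writeLow(b);   // writer B, first half
        s.writeLow(a);   // writer A, first half
        s.writeHigh(a);  // writer A, second half
        s.writeHigh(b);  // writer B, second half
        return new int[] { s.low, s.high };   // low = 5, high = 4: invariant broken
    }

    // The same two writers going through a reference; the last store wins as a whole.
    public static IntRange refInterleaving() {
        RefSlot s = new RefSlot();
        s.write(new IntRange(5, 10));
        s.write(new IntRange(2, 4));
        return s.ref;    // one of the two written values, never a mix
    }

    public static void main(String[] args) {
        int[] torn = flatInterleaving();
        System.out.println("flat: low=" + torn[0] + ", high=" + torn[1]);
        System.out.println("ref:  " + refInterleaving());
    }
}
```

The flat slot is the denser layout, and it is exactly the one that can end
up holding a value no constructor call ever produced.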

Many of the properties we want for B2 classes are possible because we
adopted references (L carriers).  If we shift towards guaranteed
atomicity for (some) B3.vals, we're going to need to re-examine the VM
model and look at how we represent these additional constraints so the
VM can enforce them.

The VM can provide some tearing-related guarantees for Qs without
indirection but they are hardware dependent - 64-bit for sure on all
64-bit hardware, 128-bit on some newer Intel hardware, possibly
different constraints on still other platforms - but maybe that's OK?
Declaring a type must not tear makes it harder for the VM to provide
better density.
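
To make the 64-bit case concrete, here is a small sketch of how a value
with two int components can be kept tear-free without an indirection
today, by packing both components into one long that is read and written
atomically.  The class and method names (PackedRangeSketch, pack, low,
high) are made up for illustration, and this only says something about
what user code can do by hand, not about what the VM would do for Qs.

```java
import java.util.concurrent.atomic.AtomicLong;

// Keeps a (low, high) pair in a single 64-bit word so that every read sees
// both components from the same write.
public class PackedRangeSketch {

    // Packs two ints into one long: low in the upper 32 bits, high in the lower 32.
    public static long pack(int low, int high) {
        return ((long) low << 32) | (high & 0xFFFFFFFFL);
    }

    // Recovers the upper 32 bits (arithmetic shift preserves the sign of low).
    public static int low(long bits) {
        return (int) (bits >> 32);
    }

    // Recovers the lower 32 bits.
    public static int high(long bits) {
        return (int) bits;
    }

    // AtomicLong reads and writes are atomic, so the pair cannot be observed half-written.
    private final AtomicLong slot = new AtomicLong(pack(0, 0));

    // Checks the invariant, then publishes both components with one atomic store.
    public void set(int low, int high) {
        if (low > high) throw new IllegalArgumentException("low > high");
        slot.set(pack(low, high));
    }

    // Reads both components from a single atomic load; returns {low, high}.
    public int[] get() {
        long bits = slot.get();
        return new int[] { low(bits), high(bits) };
    }
}
```

This stops working as soon as the payload is wider than what the hardware
can load and store atomically, which is the point where the VM would have
to choose between density and atomicity.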

John has repeatedly said "Q means go and look".  And the VM already
has to do that before flattening Qs to determine if flattening is
reasonable given the VM's heuristics (i.e., size).  Letting users say
they want tear-free B3.vals fits within the existing VM model for L vs
Q and while it may limit the benefits of a particular Q type, seems
like a reasonable thing for users to do.

So aiming for "More declaration-site control over atomicity, so
classes with invariants can ensure their invariants are defended." is
reasonable, fits the existing implementation constraints, but will
cost those users potential density benefits.

The biggest concern I have with this approach is that instead of
having 3 buckets, we're now exposing more of a buffet of options to
users.  Circling back to where I started this email - good defaults
are critical and so is good guidance on when to pick each of the
options or performance cargo cults will undercut the work to split out
the different cases.

--Dan

On Wed, Apr 27, 2022 at 9:59 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> Several people have asked why I am so paranoid about tearing.  This mail is about tearing; there’ll be another about user model stacking and performance models.  (Please, let’s try to resist the temptation to jump to “the answer”.)
>
> Many people are tempted to say “let it tear.”  The argument for “let it tear” is a natural-sounding one; after all, tearing only happens when someone else has made a mistake (data race).  It is super-tempting to say “Well, they made a mistake, they get the consequences”.
>
> While there are conditions under which this would be a reasonable argument, I don’t think those conditions quite hold here, because from both the inside and the outside, B3 classes “code like a class.”  Authors will feel free to use constructors to enforce invariants, and if the use site just looks like “Point”, clients will not be wanting to keep track of “is this one of those classes with, or without, integrity?”  Add to this, tearing is already weird, and while it is currently allowed for longs and doubles, 99.9999% of Java developers have never actually seen it or had to think about it very carefully, because implementations have had atomic loads and stores for decades.
>
> As our poster child, let’s take integer range:
>
>     __B3 record IntRange(int low, int high) {
>
>         public IntRange {
>             if (low > high) throw new IllegalArgumentException("low > high");
>         }
>     }
>
> Here, the author has introduced an invariant which is enforced by the constructor.  Clients would be surprised to find an IntRange in the wild that disobeys the invariant.  Ranges have a reasonable zero value.  This is an obvious candidate for B3.
>
> But, I can make this tear.  Imagine a mutable field:
>
>      /* mutable */ IntRange r;
>
> and two threads racing to write to r.  One writes IntRange(5, 10); the other writes IntRange(2,4).  If the writes are broken up into two writes, then a client could read IntRange(5, 4).  Worse, unlike more traditional races which might be eventually consistent, this torn value will be observable forever.
>
> Why does this seem worse than a routine long tearing (which no one ever sees and most users have never heard of)?  Because by reading the code, it surely seems like the code is telling me that IntRange(5, 4) is impossible, and having one show up would be astonishing.  Worse, a malicious user can create such a bad value (statistically) at will, and then inject that bad value into code that depends on the invariants holding.
>
> Not all values are at risk of such astonishment, though.  Consider a class like:
>
>     __B3 record LongHolder(long x) { }
>
> Given that a LongHolder can contain any long value, users of LongHolder are not expecting that the range is carefully controlled.  There are no invariants for which breaking them would be astonishing; LongHolder(0x1234567887654321) is just as valid a value as LongHolder(3).
>
> There are two factors here: invariants and transparency.  The above examples paint the ranges of invariants (from none at all, to invariants that constrain multiple fields).  But there’s also transparency.  The second example was unsurprising because the API allowed us to pass any long in, so we were not surprised to see big values coming out.  But if the relationship between the representation and the construction API is more complicated, one could imagine thinking the constructor has deterred all the surprising values, and then still see a surprising value.  That longs might tear is less surprising because any combination of bits is a valid long, and there’s no way to exclude certain values when “constructing” a long.
>
> Separately, there are different considerations at the declaration and use site.  A user can always avoid tearing by avoiding data races, such as marking the field volatile (that’s the usual cure for longs and doubles.)  But what we’re missing is something at the declaration site, where the author can say “I have integrity concerns” and constrain layout/access accordingly.  We experimented with something earlier (“extends NonTearable”) in this area.
>
>
> Coming back to “why do we care so much”.  PLT_Hulk summarized JCiP in one sentence:
>
>     https://twitter.com/PLT_Hulk/status/509302821091831809
>
> If Java developers have learned one thing about concurrency, it is: “immutable objects are always thread-safe.”  While we can equivocate about whether B3.val are objects or not, this distinction is more subtle than we can expect people to internalize.  (If people internalized “Write immutable classes, they will always be thread-safe”, that would be pretty much the same thing.)  We cannot deprive them of the most powerful and useful guideline for writing safe code.
>
> (To make another analogy: serialization lets objects appear to not obey invariants established in the constructor.  We generally don’t like this; we should not want to encourage more of this.)
>
> There are options here, but none are a slam dunk:
>
>  - Force all B3 values to be atomic, which will have a performance cost;
>  - Deny the ability to enforce invariants on B3 classes (no NonNegativeInt, no IntRange);
>  - Try to educate people about tearing (good luck);
>  - Put out bigger warning signs (e.g., IntRange.tearable) that people can’t miss;
>  - More declaration-site control over atomicity, so classes with invariants can ensure their invariants are defended.
>
> I think the last is probably the most sane.
>
>



More information about the valhalla-spec-experts mailing list