Constant propagation & withfield updation.

Thu Mar 8 05:57:07 UTC 2018

On Mar 7, 2018, at 9:07 PM, Srikanth <srikanth.adayapalam at oracle.com> wrote:
> 
> 
> Thanks for weighing in, John. I would say lworld branch tip behavior matches what you describe as the required behavior.
> 
> Or better phrased, "lworld branch tip behavior matches *what I think* you describe as the required behavior."

Good!

> Three of your assertions below do trigger some uncertainty in my mind, so let me expressly re-ask rather than assuming/pretending that I fully understand you.
> 
> See below:
> 
> On Thursday 08 March 2018 02:39 AM, John Rose wrote:
>> On Mar 6, 2018, at 10:48 PM, Srikanth <srikanth.adayapalam at oracle.com> wrote:
>>> At the moment, javac allows updates to both blank final and non-blank final instance
>>> fields of a value class via the __WithField operator.
>>> 
>>> I am trying to confirm that we want to allow updates to initialized non-blank final fields too. (Or not)
>> You mean should __WithField lower to withfield, regardless of whether
>> the field was initialized or not?
>> 
>> The short answer is "yes".  An initialized field (final or not) is simply the
>> first version of the field value.  Object classes allow fields to be updated,
>> and so should value classes.
> 
> (1) So when you say "Object classes allow fields to be updated", are you including
> mechanisms such as deserialization and reflection? Using purely linguistic means,
> an Object class cannot update a non-blank final (i.e., initialized) instance field, hence
> my original question as to whether a value class should be allowed to update a non-blank final
> (i.e., initialized final) instance field via withfield.

There's a twisty idea in there, which you caught:  Many objects have non-final
fields, and those can be updated.  The same design patterns sometimes apply
to values, despite the lack of identity:  You make a new value when you need
to update the field.  But often there's still the strong notion of the new value being
the "new version" of the previous value, as with the "i++" in a for-loop:  The value
of i is different in each iteration but the successive versions of i are derived incrementally
from the previous versions.  If "i" were an object it might retain its identity and
update its state, while if it is a value it must make a new version of itself.
For composite value types, "i++" patterns often update just part of the value
keeping other parts the same.  For example, if we were to value-ize the
Iterator type, its "next" method would return two results, the next collection
item, and the new value of itself.  (With i++ and iterators, the second
result is implicitly written back to a variable, either i or an iterator state
variable.  A value-ized iterator probably holds the present result and
is ready to return its next version as the only result of next, but that's
a detail.)

Everything I just said about stateful object types can also be modeled
with read-only value-based objects, which is yet another connection
between values and objects.

In short, both values and objects are used to model incrementally
updating state, in many of the same ways, although the details
differ.  The root of the difference is the distinction between putfield
and withfield.
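
The putfield/withfield contrast can be sketched in today's Java. This is
only an approximation: there is no withfield instruction in standard
Java, so an ordinary "wither" method that builds a new instance stands in
for it. The names MutableCounter, ListCursor, withIndex and CursorDemo
are hypothetical, made up for illustration.

```java
import java.util.List;

// Identity-based counter: the update is an in-place write (putfield).
final class MutableCounter {
    int count;                        // non-final, updated in place

    void increment() { count++; }
}

// Value-like cursor: every "update" yields a new version of the cursor.
// withIndex stands in for what withfield would do in a value class.
final class ListCursor {
    private final List<String> items;
    private final int index;

    ListCursor(List<String> items, int index) {
        this.items = items;
        this.index = index;
    }

    boolean hasCurrent() { return index < items.size(); }

    String current() { return items.get(index); }

    // Copies every field, replacing only index.
    ListCursor withIndex(int newIndex) { return new ListCursor(items, newIndex); }

    // The next version of this cursor; the old version is left unchanged.
    ListCursor next() { return withIndex(index + 1); }
}

final class CursorDemo {
    static String joinAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        // Same shape as "for (int i = 0; ...; i++)": the variable c is
        // overwritten with each successive version of the cursor.
        for (ListCursor c = new ListCursor(items, 0); c.hasCurrent(); c = c.next()) {
            sb.append(c.current());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinAll(List.of("a", "b", "c")));
    }
}
```

The cursor here holds the present result and returns its next version as
the only result of next, which is the shape suggested above for a
value-ized iterator.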

> 
> I read your answer as an unambiguous yes - and that matches branch tip behavior - but the sentence "Object classes allow fields to be updated, and so should value classes." reads as a distraction in the current context when offered as an argument in favor.

It's a tricky argument, because it appeals to deep similarities between
values and objects which sometimes don't look similar on the surface.

> 
> 
>>   The differences in syntax (put vs. with)
>> are minor.  The common thread is that fields are for both initializing
>> and updating.  Value fields are marked final as a reminder that there is
>> no write-back from simple assignment (except in constructors), but
>> there is no reason to forbid updates.  Updating a value makes a new value.
>> 
>> Regarding the blank vs. non-blank distinction:  It is a shallow one, really
>> just sugar for replicating the non-blank initializer into every constructor.
>> Taste in sugar at this point doesn't affect updatability.
> 
> (2) One further distinction is in how javac generates code for reads of non-blank final instance fields initialized with compile-time constant expressions. For blank final or non-final fields, there would always be a getfield, while for non-blank final instance fields initialized with compile-time constant expressions, the compiler would directly push the constant value onto the operand stack.

Yes.  I guess there are really six cases for finals, with different compiler
behaviors for init and read:

static constant final     I=ConstantValue           R=ConstantValue
static blank final        I=putstatic               R=getstatic
other static final        I=putstatic               R=getstatic

instance constant final   I=ConstantValue+putfield  R=ConstantValue
instance blank final      I=putfield                R=getfield
other instance final      I=putfield                R=getfield

So the non-constant cases differ only in the surface syntax.
Blank-ness is not so special, but constant-ness is special.
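
The "instance constant final" row can be observed with an ordinary object
class, as a rough sketch. The class and method names (Holder,
ConstantReadDemo, and so on) are hypothetical. The sketch assumes a JDK
that still permits reflective writes to final instance fields after
setAccessible(true); newer JDKs may print a warning, and a future one may
refuse the write.

```java
import java.lang.reflect.Field;

final class Holder {
    // Final instance field with a constant initializer: a constant variable.
    final int limit = 42;
}

final class ConstantReadDemo {
    // Direct read: javac substitutes the constant 42 rather than
    // emitting a getfield for Holder.limit.
    static int directRead(Holder h) {
        return h.limit;
    }

    // Reflective read: consults the actual field storage.
    static int reflectiveRead(Holder h) {
        try {
            Field f = Holder.class.getDeclaredField("limit");
            f.setAccessible(true);
            return f.getInt(h);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reflective write: stands in for "some update after initialization".
    static void reflectiveWrite(Holder h, int value) {
        try {
            Field f = Holder.class.getDeclaredField("limit");
            f.setAccessible(true);    // required because the field is final
            f.setInt(h, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        reflectiveWrite(h, 7);
        System.out.println("direct read:     " + directRead(h));
        System.out.println("reflective read: " + reflectiveRead(h));
    }
}
```

The direct read keeps returning the initializer value while the stored
field holds the update. A withfield update to a value class field would
go unobserved in the same way if reads were folded to the constant.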

> This was the central point of my original question: that what we really want is for the javac compiler to (a) allow updates to initialized final fields via __WithField and (b) always issue a getfield and ignore the constant initializer, so that the updated value is observed.

I see; I missed the point of your question; I thought you were asking about
non-blank finals in general, and I answered both 

Although I don't like the rules for ConstantValue on instance finals, I suspect we
can't change that in the long run, even for value types (though they are new).
OTOH we could decide to deprecate this behavior, if we wanted. I'm on the
fence about that.

In an object class, you almost always don't use "final" with a field initializer,
so you don't run into this corner case.  I've seen programmers accidentally
leave out the "static" modifier on a constant declaration, and it takes a
while before they notice that the fields are taking up space in every instance.

In a value class, since we mandate "final" on everything, you can easily fall
into the same trap, of accidentally defining a ConstantValue attribute on an
instance field.

The way out of the trap, of course, is to wean ourselves off of the explicit
"final" modifiers for value types; make the mandated behavior the default
without the extra reinforcement of the "final" modifiers.  The same point
goes for __Flattenable, of course, and for the "final" on the whole class.
The ACC_FINAL bits on fields and class, and the ACC_FLATTENABLE
bits on fields, should be automatically set under the hood, in the right
places.

For now it's less confusing to mandate the modifiers, and that makes
us fall into the trap.  So here's my advice:  Don't respect ConstantValue
attributes when reading value type fields.  That's your (b) proposal
above.  At some point in the future we will decide whether to start
respecting them again and back off from explicit "final", or take some
other path.

One reason to retract the explicit "final" modifier on fields is so we
can assign a useful meaning to "final" if it really does occur explicitly,
which would be rare.  It would mean something like "this value is set only
in the constructor, and no other places (ignoring deserialization)".
Fields not marked with explicit "final" would be settable in more places,
just like in object classes, and just like we are experimenting with now.

This is just an option; there might be other moves we'd prefer in order
to tweak the access behavior of value type fields.  Here are the access
behavior classes I'm thinking of:

field is private-open:  nobody can touch it except the nest, which can use getfield/withfield at will
field is public-read:  anybody can use getfield; the nest can use withfield/putfield anywhere
field is public-open:  anybody can use a getfield or withfield/putfield instruction on it
field is private-constructed:  only official constructors can withfield/putfield, only nest can getfield it
field is public-constructed:  only official constructors can withfield/putfield, anybody can getfield it

In an object class today:
public non-final => public-open
public final => public-constructed
private non-final => private-open
private final => private-constructed

In a value class today:
public final => public-read
private final => private-open
public non-final => illegal or same as public final
private non-final => illegal or same as private final

In a value class tomorrow:
public non-final => public-read
private non-final => private-open
public final => public-constructed
private final => private-constructed

This will align the behavior of value classes more closely with
object classes, at the cost of removing the "training wheels"
of all the extra final modifiers.
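
The three mappings above can be written down as a small lookup, purely
as a sketch for comparing them side by side. The enum and method names
are hypothetical. One labeled assumption: for "value class today", the
"illegal or same as final" entries are modeled as "same as final".

```java
// The five access behavior classes listed above.
enum AccessBehavior {
    PRIVATE_OPEN, PUBLIC_READ, PUBLIC_OPEN, PRIVATE_CONSTRUCTED, PUBLIC_CONSTRUCTED
}

final class AccessRules {
    // Object class today.
    static AccessBehavior objectClass(boolean isPublic, boolean isFinal) {
        if (isFinal) {
            return isPublic ? AccessBehavior.PUBLIC_CONSTRUCTED
                            : AccessBehavior.PRIVATE_CONSTRUCTED;
        }
        return isPublic ? AccessBehavior.PUBLIC_OPEN : AccessBehavior.PRIVATE_OPEN;
    }

    // Value class today: "final" is mandated, so the modifier carries no
    // information. Assumption: non-final is treated the same as final.
    static AccessBehavior valueClassToday(boolean isPublic, boolean isFinal) {
        return isPublic ? AccessBehavior.PUBLIC_READ : AccessBehavior.PRIVATE_OPEN;
    }

    // Value class tomorrow: an explicit "final" means constructor-only.
    static AccessBehavior valueClassTomorrow(boolean isPublic, boolean isFinal) {
        if (isFinal) {
            return isPublic ? AccessBehavior.PUBLIC_CONSTRUCTED
                            : AccessBehavior.PRIVATE_CONSTRUCTED;
        }
        return isPublic ? AccessBehavior.PUBLIC_READ : AccessBehavior.PRIVATE_OPEN;
    }

    public static void main(String[] args) {
        boolean[] flags = { true, false };
        for (boolean isPublic : flags) {
            for (boolean isFinal : flags) {
                System.out.println((isPublic ? "public " : "private ")
                        + (isFinal ? "final" : "non-final")
                        + " => object: " + objectClass(isPublic, isFinal)
                        + ", value tomorrow: " + valueClassTomorrow(isPublic, isFinal));
            }
        }
    }
}
```

In this encoding the final rows agree between objectClass and
valueClassTomorrow, and the two differ only in the public non-final row.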

The only non-alignment is that we have no proposal
on the table for public-read in object classes and
public-open in value classes.  The modifier combination
"public non-final" describes the missing state for
the opposite kind of class:

class ObjClass {
  public /*non-final*/ int x;  // public-open, not secure
  public __PrivateWrite int x;  // public-read, more secure
}
__ByValue class ValClass {
  public /*non-final*/ int x;  // public-read, very secure
  public __PublicWrite int x;  // public-open, less secure
}

I don't know how to spell __PrivateWrite for objects
and __PublicWrite for values.
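
For comparison, here is how the two missing states are usually
approximated in today's Java without new modifiers. This is only a
sketch: accessor and wither methods stand in for direct getfield and
withfield access, and all names (ObjPoint, ValPoint, withX, moveRight,
AccessSketch) are hypothetical.

```java
// Object class, roughly "public-read": a private field plus a public
// accessor. Anybody can read; only this class can write.
final class ObjPoint {
    private int x;

    ObjPoint(int x) { this.x = x; }

    public int x() { return x; }

    // In-place update (a putfield) that stays inside the class.
    public void moveRight() { x = x + 1; }
}

// Value-like class, roughly "public-open": a public wither lets anybody
// derive a new version with a different x.
final class ValPoint {
    private final int x;

    ValPoint(int x) { this.x = x; }

    public int x() { return x; }

    public ValPoint withX(int newX) { return new ValPoint(newX); }
}

final class AccessSketch {
    public static void main(String[] args) {
        ObjPoint o = new ObjPoint(1);
        o.moveRight();                // same object, new state

        ValPoint v = new ValPoint(1);
        ValPoint v2 = v.withX(5);     // new version; v is unchanged

        System.out.println(o.x() + " " + v.x() + " " + v2.x());
    }
}
```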

> 
> I hear you saying yes to both (a) and (b)

Correct.

>> 
>> (A long answer takes us into the complex world of "reconstructors", which
>> deserves a separate discussion.)
>> 
>>> Such allowance interferes with constant propagation - I was searching for the text in JLS that says what the compiler must do when a final field is initialized with a compile time constant expression - I located only the slightly oblique reference in 17.5.3, which reads:
>>> 
>>> "If a final field is initialized to a constant expression (§15.28) in the field declaration, changes to the final
>>> field may not be observed, since uses of that final field are replaced at compile time with the value of the
>>> constant expression."
>> The net effect of that provision is that fields with constant initializers
>> get ConstantValue attributes.  This is true whether the fields are
>> static or instance.  It is IMO an oversight that the ConstantValue
>> attribute applies to instance fields, since this provides no additional
>> power over constant static fields, and it tends to bloat the heap
>> with lots of copies of the same value.
>> 
>> The translation strategy for fields with ConstantValue is specialized,
>> but *only* for static fields. The language above does not put any
>> constraint on translation strategy for any instance field.
>> 
>> If it makes it easier, just drop ConstantValue attributes from instance
>> fields in value types.  But I don't think it matters.  You can ignore the
>> issue when you are *initializing* constant fields that are *non-static*.
>> 
>> When reading constant fields, the ConstantValue attribute helpfully
>> hands you a constant value at compile time.  So what you wrote into
>> the field is irrelevant, except in separate compilation edge cases.
> 
> (3) But it is unhelpful to consult the ConstantValue attribute for value
> class fields - otherwise any updates won't be observed, no? Hence the
> tip behavior to always issue getfield for reads.

Correct.  Let's keep it that way, but reconsider when we refactor
the default modifiers on fields.  If final fields become a special
thing (say, for constructor-only initialization as with objects)
then we can align the ConstantValue behavior between values
and objects.

> 
>> 
>> I'm talking about ConstantValue attributes, but there is another
>> sub-case of initialized final fields (static or instance), when the
>> initializer expression is not a compile-time constant.  In that case,
>> the translation strategy is exactly the same as for non-final fields,
>> but the JVM will reject the code if you emit a putfield in the wrong
>> place.
>> 
>> Does that help?
>> 
>>> This passage which occurs in the context of deserialization and reflection based updates to final fields will also be relevant for WithField updates.
>> (See P.S. on deserialization, below.)
>> 
>>> ATM, I have disabled such propagation of constants for value class's final instance fields initialized with constant expressions and reads of these fields result in fresh getfield instructions.
>> The basic rule for instance fields is, when writing ignore the
>> ConstantValue attribute and when reading ignore the field
>> value.  For statics it's even simpler:  The ConstantValue
>> attribute does everything.
> 
> (4) I think you mean, allow updates via withfield, but when reading ignore the ConstantValue?

Yes.

— John