Terminology bikeshedding summary

William Price wrprice at gmail.com
Thu Jan 20 22:47:05 UTC 2022


Hello all,

I've been observing the Valhalla-spec-experts list via the archives for
quite a while and enjoying the progress.  I am eagerly awaiting the new
JEPs and meanwhile thinking about how I would mentor other developers on
how and when to use (or not use) these features once they are released
into the language.

With the recent thread on bikeshedding your B1/B2/B3 categories, some
thoughts popped into my head and I'd like to be so bold as to share them
from the perspective of a (power) user and someone acting in a mentor role.

> #1. value class vs. [something else] class

Value types will have no object identity, and that's important to
understand from the perspective of monitors and how the JVM treats such
types in memory, but I don't think most developers synchronize on their
objects.  The effects on `==` vs `equals(Object)` behaviors are probably
the most important for "lay" and beginning Java devs to get right.  So in
my head I read the above as, "value equivalence class" vs "intrinsic
identity class", with a focus on the behavior of the `==` operator.

Of course, that's too verbose as far as syntax goes.  I think "value class"
is clear and concise for source code.
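
To make the `==` vs `equals(Object)` point concrete, here is a minimal
sketch in today's Java, using an ordinary identity class.  The names
(`EqualityDemo`, `Point`, `sameIdentity`, `sameValue`) are made up for
illustration and do not come from any JEP.  As I read the proposals, `==`
on instances of a value class would compare state rather than identity,
but that is an assumption that cannot be demonstrated on a current JDK.

```java
import java.util.Objects;

public class EqualityDemo {
    // Hypothetical example type: an ordinary identity class in current Java.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point p = (Point) o;
            return p.x == x && p.y == y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // For an identity class, == asks "is this the very same object?"
    static boolean sameIdentity(Object a, Object b) {
        return a == b;
    }

    // equals(Object) asks "do these two objects carry the same state?"
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(sameIdentity(a, b)); // false: two distinct objects
        System.out.println(sameValue(a, b));    // true: same field values
    }
}
```

The gap between those two results is exactly what beginners trip over
today, and it is the behavior I would expect "value class" to signal.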

> #2.
> primitive class vs. bare value class vs. [something else] value class
> primitive type vs. bare value type vs. [something else] value type
> primitive value vs. bare value vs. [something else] value

I'll add my voice behind those saying leave "primitive" alone as much as
possible. The definition of the word matches w/ the specialized behavior
supported by the underlying VM.  The set of primitive types remains the
base case of all data structures enabled by the type system.  Long-time
users, at least those who are less receptive to change, are going to have
enough on their plate w/ the introduction of "value class".

The use of the word "bare" isn't sitting well with me, either; probably
because it will require a verbose definition to have any meaning to most
users. I don't think developers are going to think of their values as
"clothed" or not.  ;-)  The closest use of "bare" in computation that I can
think of is "bare metal" referring to hardware and the lack of
virtualization (or abstraction!).  I can see an argument that flattened
layouts are closer to the "bare" metal, but that's a VM/hardware centric
view.  To your average developer, Valhalla is introducing quite a bit more
complexity to the VM abstraction between the source code and the metal.
(That's assuming the developer cares to understand how the VM works its
magic in the first place.)

From the above two paragraphs, I'm squarely in camp "[something else]" and
I'll get to that something in a moment.

> #3:
> primitive value vs. object
>
> We're trying to make a distinction between primitive values being "class instances" and
> calling them "objects", but for many developers, especially beginners, that sounds like
> meaningless pedantry.

+1.  Let "objects are instances of a class" remain true and apply to all of
these new types.  One only has to look through the Java Tutorials section
on Creating Objects[1] to see that the developer's syntax when working with
the proposed value classes and [something else] classes will largely mirror
the existing experience with classic objects -- from `new` and
constructor-call syntax all the way to `instanceof`.

What Valhalla is proposing to change is that [something else], which had
previously been limited to primitives, are now objects (instances) too, but
not all objects have identity. If that interpretation is correct, then
"everything is an object" isn't too hard to explain.

> #3 is more like:
>
> primitive value vs. bare object
>  Or
> primitive value vs. [something else] object
>
> Where object is split not only into value object and identity objects, but bare (value) objects
> are another split. You get bare vs heap just as you get value vs identity as cleavage planes
> in the universe of objects.
>
> Also:
>
> Legacy primitives would not be objects in any case. But we can mock up classes to wrap
> and/or emulate them and even declare in the user model that these very special classes in
> some sense “are” the primitives.

This is what caused a *click* in my head and prompted this e-mail.  It
reminded me of how, after Java 5 introduced `enum` types, I don't recall
myself or any colleagues ever referring to "enum classes" outside of those
dealing directly with reflection or bytecode manipulation.  They're simply
"enums".  Despite the initial shock that one is defined in source as a
`[<modifier>] enum` absent the `class` or `interface` keyword, we all now
understand that there's a class behind it, just as we previously understood
that there was a `java.lang.Class` behind a proper `interface` type.
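
As a small illustration of that point, the following sketch uses only
standard reflection calls on a current JDK to show the class behind an
enum and behind an interface.  The `EnumClassDemo` and `Color` names are
hypothetical.

```java
public class EnumClassDemo {
    // Declared with the `enum` keyword only; no `class` keyword in sight.
    enum Color { RED, GREEN }

    public static void main(String[] args) {
        // The enum type is nevertheless represented by a class at runtime.
        System.out.println(Color.class.isEnum());                   // true
        System.out.println(Color.class.getSuperclass().getName());  // java.lang.Enum
        System.out.println(Color.RED.getDeclaringClass() == Color.class); // true

        // Likewise, an interface has a java.lang.Class behind it.
        System.out.println(Runnable.class.isInterface());           // true
    }
}
```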

Similarly, the primitive-value/bare/[something else] type WILL be
understood to be both an object, and a value-type, and have a class
representation at runtime/reflection.  In the documentation, of course, it
will need to be heavily emphasized that [something else] is not just
flattenable but is also a value type. However, I don't feel that these
attributes require heavy emphasis in the formal name. I favor "public
[something]" over "public [something [value]] class", just as Java 5 did
for enum.

I've been working professionally in Java land for a long time, but when I
look at the attributes of primitive/bare/[something else], as in JEP 401,
what I see is a struct.  Not the struct from Java's roots in C/C++ but the
struct[2] from C#, which could be considered a closer comparison these
days.  Though the Valhalla proposal is not 1:1 with C#, the latter's
(readonly) structs have methods, support inlining outside of the heap, and
share many similar characteristics.  This similarity may bring with it
familiarity to
developers coming from .NET ecosystems as a tangential benefit. For
developers unfamiliar with C#, "struct" is still existing terminology that
-- without any further definition -- should evoke thoughts of data
structures and/or in-memory representation.

Given this is a thread on bikeshedding, my humble opinion for syntax
related to these new Valhalla classes/types is as follows:

class Foo        // identity-based L-type
value class Bar  // value-based L-type
struct Qat       // flattened value Q-type

In recognition that structs in other languages do not implicitly impose
`final` on their fields, and in the hope that perhaps direct inlined
mutation might be allowed by future Java versions, I would also suggest
that javac require an *explicit* `final` modifier on fields declared within
a `struct` rather than rely on the implicit behavior of general value
classes. Yes, it's more verbose, but that's countered by advice to users
that they should use `struct` judiciously (due to tearing, etc.) and not go
overboard with use -- in other words, the impact felt from that extra
boilerplate should be rare.

Of course, if you're certain you'll never-ever allow such fields to be
non-final, then my above argument is moot.  But I have had real-life cases
where I think I would benefit from encapsulating and inlining struct-like
classes within another class, avoiding the indirection, but supporting
in-place mutation of individual struct fields.  Implicitly final fields do
not allow you to change your mind on this point without breaking source
backwards compatibility in a way that might not be noticed upon a
subsequent recompile.
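
To sketch the difference I have in mind, here is how the two styles look
in today's Java.  Both carriers below are ordinary identity classes, so
nothing here is flattened or inlined; the example only contrasts "replace
the whole value" with "mutate one field in place".  All names
(`FieldMutationDemo`, `FrozenPoint`, `MutablePoint`, `Segment`) are
hypothetical.

```java
public class FieldMutationDemo {
    // All fields final: a "change" means constructing a replacement instance.
    static final class FrozenPoint {
        final int x;
        final int y;

        FrozenPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        FrozenPoint withX(int newX) {
            return new FrozenPoint(newX, y);
        }
    }

    // Non-final fields: individual fields can be updated in place.
    static final class MutablePoint {
        int x;
        int y;

        MutablePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // A containing class that encapsulates one carrier of each style.
    static final class Segment {
        FrozenPoint start = new FrozenPoint(0, 0);        // must be reassignable
        final MutablePoint end = new MutablePoint(0, 0);  // reference never changes
    }

    public static void main(String[] args) {
        Segment s = new Segment();
        s.start = s.start.withX(5); // whole-value replacement
        s.end.x = 5;                // in-place mutation of a single field
        System.out.println(s.start.x + " " + s.end.x); // 5 5
    }
}
```

With an explicit `final` on struct fields, moving from the first style to
the second later on would at least be a visible, deliberate source change.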

I'll stick my neck out farther and bring in this snippet from a different
thread on the list.  Brian, you wrote in response[3] to Remi:

> I think where you're going is some flavor of this:
>
>     special-legacy-backwards value class Integer
>         with-legacy-primitive int { ... }

I may have missed the need for [special-legacy-backwards] separate from
[with-legacy-primitive].  Given that [with-legacy-primitive] would be
reserved for the JDK alone and not user-defined classes, I thought of
several existing keywords that could be used in that place without *too*
much head-scratching.  But one in particular stood out for its potential
simplicity:

public struct Integer extends Number implements int, Comparable<Integer> {
  private final int value;                      ^^^^
  /* ... */
}

- `int` and the other Language Keywords for primitives are not interfaces,
so this is the special JDK-reserved behavior and the only "smell" that
jumps out to me
- `implements int`, read as English, says exactly what you mean for readers
to understand
- no new keywords or reserved identifiers required
- The "backwards" behavior (Integer.class == int.ref?) is only applied to
these special cases
- Does not use the defined type (Integer) as a member field (non-circular)

That's more than enough for now.  Thank you for reading this far, and I'm
eagerly anticipating the fruits of both Valhalla and Loom.

[1]: https://docs.oracle.com/javase/tutorial/java/javaOO/objectcreation.html
[2]: https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/struct
[3]: https://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2022-January/001764.html

-- 
William Price

