Migrating library classes to value classes

Mon Mar 4 22:26:08 UTC 2024

> On Mar 4, 2024, at 10:56 AM, Kevin Bourrillion <kevinb9n at gmail.com> wrote:
> 
> But migrating to a value class is rescinding functionality and so inherently *not* backward-compatible. So, libraries can only do it via a multi-step process, over time. This thread is happening because I think this project probably wants to support that process, but I'm not clear yet on how it would do so.

To help define the problem: assuming you already have a class that has the shape of a value class (final fields, no internal identity assumptions, final or abstract class, compatible superclass), adding 'value' is a binary- and (nearly*) source-compatible change.
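As a rough sketch of what "already has the shape of a value class" might look like, here is a hypothetical Money class (the name and fields are made up for illustration). It compiles as an ordinary identity class today. The migration discussed here would then amount to adding the 'value' modifier to the declaration, assuming a compiler that supports it.

```java
import java.util.Objects;

// Hypothetical library class, for illustration only.
// The class is final, every field is final, and nothing here
// depends on object identity.
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = Objects.requireNonNull(currency);
        this.cents = cents;
    }

    String currency() { return currency; }
    long cents() { return cents; }

    // Equality is defined by field contents, not by identity.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Money)) return false;
        Money m = (Money) o;
        return cents == m.cents && currency.equals(m.currency);
    }

    @Override
    public int hashCode() { return Objects.hash(currency, cents); }

    @Override
    public String toString() { return cents + " " + currency; }
}
```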

Then there's behavioral compatibility; the JEP helpfully enumerates the potential issues:

>>     • The == operator may treat two instances as the same, where previously they were considered different
>>     • Attempts to synchronize on an instance will fail, either at compile time or run time
>>     • The results of toString, equals, and hashCode, if they haven't been overridden, may be different (preferably, the existing class already overrides equals and hashCode in a way that eliminates the dependency on identity)
>>     • Assumptions about unique ownership of an instance may be violated (for example, an identical instance may be created at two different program points)
>>     • Performance will generally improve, but may have different characteristics that are surprising

(*The compile-time error regarding synchronization is, in fact, a source incompatibility. We think it's helpful to catch the error earlier, but it will make some programs fail to compile.)
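To make the first and third bullets concrete, here is a small sketch of identity-dependent behavior in an ordinary class today. Point and the method names are hypothetical. Per the JEP's list, these are the kinds of results that may differ if such a class were later declared 'value'.

```java
final class IdentityDependence {
    // Hypothetical class that does NOT override equals, hashCode, or toString.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Two separately constructed identity objects are never == to each other.
    static boolean sameByOperator() {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        return a == b;
    }

    // The inherited Object.equals also compares identity.
    static boolean sameByEquals() {
        return new Point(1, 2).equals(new Point(1, 2));
    }

    // The inherited Object.toString embeds the identity hash code after '@'.
    static String defaultToString() {
        return new Point(1, 2).toString();
    }
}
```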

In the JDK, we are very concerned about compatibility, so we defined "value-based class" to require private (or non-private deprecated) constructors and construction via identity-ambiguous factories. This provides a specified basis for most of the behavioral changes that come with 'value', except for the fail-fast behavior of synchronization.

So I guess I'd divide the issues into two categories:

1) Things that can be controlled by the class declaration

There are the basic class structure things (fields, sub- and superclasses). These are pretty self-evident—if you've got a candidate class, you probably already meet these requirements. In anticipation of 'value' coming, maybe you'd like to add some 'final' modifiers in the right place for things that were not explicitly final before.

Surely if you're prepared to abandon identity, you've already overridden equals/hashCode to support a notion of equality that isn't identity-based. 'toString' probably, too, although I think you can assume your users won't/shouldn't depend on the contents of the random number appearing in the default 'toString' output.

If you're also worried about how people will use '==', you can match the JDK and provide factories that don't commit to identity details. You can even link to the "value-based class" definition in the Java SE docs, and specify that your class follows these rules.
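A minimal sketch of that factory approach, using a hypothetical Percent class: the constructor is private, and the factory's documentation deliberately leaves open whether it returns a new or a cached instance, so callers get no identity guarantee to depend on.

```java
// Hypothetical library class, for illustration only.
final class Percent {
    private static final Percent ZERO = new Percent(0);

    private final int value;

    // Private constructor: all construction goes through the factory.
    private Percent(int value) { this.value = value; }

    /**
     * Returns a Percent with the given value. The result may or may not
     * be a newly allocated object; callers must not rely on ==,
     * synchronization, or the identity hash code of the result.
     */
    static Percent of(int value) {
        return value == 0 ? ZERO : new Percent(value);
    }

    int value() { return value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Percent && ((Percent) o).value == value;
    }

    @Override
    public int hashCode() { return Integer.hashCode(value); }

    @Override
    public String toString() { return value + "%"; }
}
```

Users who compare with equals rather than '==' get the same answer whichever instance the factory happens to return.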

Would it be useful to have some tooling that checks all of these things for you? Eh, maybe. I'm not sure there's that much to check—most of it pretty naturally flows from the fact that you're declaring a class that fits in the value-candidate bucket; and making a deliberate choice to avoid identity guarantees via factory-based construction and some extra javadoc is no harder than making a deliberate choice to add a '@ValueBased' annotation.

2) Corner-case use site failures

The main purpose of JEP 390 was to more strongly discourage synchronization on types like Integer, because that code will break when Integer becomes a value class. I think that's already a corner case: most people do not write code like this. But java.lang.Integer gets used by everyone, and so an occasional usage is bound to show up.

s/Integer/SomeLibraryClass/ and I think we're talking about orders of magnitude fewer use cases. Still possible, of course, but I'm pretty skeptical of the likelihood of all of the stars aligning with any frequency:

- User code that is targeted to an older VM right now, but will want to target a newer VM later
- A user synchronizing on SomeLibraryClass
- A usage that can be detected with static tooling
- A user willing to use the tooling
- A version of the tooling new enough to implement this check

You can do the same exercise with WeakReference. Maybe when we're done there will be one or two other identity-related things like this that simply stop working. But they're just so rare, it's hard to imagine tooling dedicated to catching them (and static tools won't catch everything anyway).
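For reference, the synchronization corner case looks something like the sketch below. SomeLibraryClass here is a hypothetical stand-in, as are the method names. The first method works today on an identity class but is the pattern expected to fail if the class becomes a value class; the second shows one way user code could avoid the dependency, by locking on a dedicated object instead.

```java
final class SyncCornerCase {
    // Hypothetical stand-in for a library class that might later become a value class.
    static final class SomeLibraryClass {
        final int id;
        SomeLibraryClass(int id) { this.id = id; }
    }

    // Works on an identity class today; this is the pattern at risk.
    static int lockOnInstance(SomeLibraryClass key) {
        synchronized (key) {
            return key.id + 1;
        }
    }

    // Dedicated lock object: does not depend on the identity of key.
    private static final Object LOCK = new Object();

    static int lockOnDedicatedObject(SomeLibraryClass key) {
        synchronized (LOCK) {
            return key.id + 1;
        }
    }
}
```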

---

When we were working out the details of JEP 390, we considered making the @ValueBased annotation public/standard, and decided against it, because we wanted to avoid introducing layers of new concepts for programmers. "How is a value-based class different from a value class?" is not something we want showing up in Java tutorials.

Instead—and I realize this is totally theoretical for me and very practical for you, so I appreciate any hard-earned wisdom you can provide—I'm not that concerned about saying version N of a library migrates some classes to be true value classes, and that should be just fine for most of their users. If somebody has a special issue with synchronization or ==, they should stick with library version N-1 until they can fix it.

> In fact, the earlier a version of Java we could backport this annotation and its javac support to, the better, AFAICT. With each version earlier we could push it, that much more code gets to become safely forward-compatible today.

Our normal approach to javac would be to put new warnings in the latest version only. Best we could do is, say, javac 23. And since we expect support for value classes to follow soon after, most of the benefit would be to people who write, e.g., 17- or 21-targeted code, but compile it using the latest JDK.