Ad hoc type restriction
Archie Cobbs
archie.cobbs at gmail.com
Fri Oct 10 21:07:23 UTC 2025
When I read the draft JEP "Null-Restricted and Nullable Types" a few things
really clicked in my head.
The first is that when reviewing Java code, it is very common to need to
convince oneself that some variable != null in order to understand the code
or prove its correctness. So the addition of a "!" right in front of
its declaration will be a huge, everyday win. I look forward to adding this
feature to the growing list of other "How can the compiler help me prove
this code is correct?" features like: generics, exhaustive switch cases,
sealed types, etc.
But null-restriction is just one specific example of a more general
concept, let's call it "ad hoc type restriction".
A whole lot of Java code is written using a strategy of "I have some value
which can be modeled by type X, but only a subset Y of values of type X are
valid for my particular use case, so I'm going to pass these values around
as parameters of type X and then either validate them everywhere one can
enter from the outside world... which, um, requires me to keep track of
which values have come from the outside world and which have not, hmm..."
A null-restricted reference type is just the most common example of this -
where the subset Y = X \ { null }.
Other examples...
- Using int for: The size of some collection (can't be negative)
- Using byte or char for: An ASCII character
- Using short for: A Unicode basic plane character (2FE0..2FEF are
unassigned)
- Using String for: SQL query, phone number, SSN, etc. (must be non-null
and have the proper syntax)
But regardless of what X or Y is, it's very common for the validation step
to be missed or forgotten somewhere, leading to bugs. One might even argue
that a *majority* of bugs are due to incomplete validation of some sort or
another. Keeping track of when validation is required and manually adding
it in all those places is tedious and error prone.
OK, how could the compiler help me? I am starting with some type T, and I
want to derive from it some new type R which only permits a subset of T's
values. I want the compiler to guarantee this without too much performance
penalty. I want R to be arbitrary - however I define it.
Today I can "homebrew" type restriction on reference types by subclassing
or wrapping T, for example:
public class PhoneNumber {
private String e164; // non-null and in E.164 format, e.g.,
"+15105551212"
public PhoneNumber(String s) {
... parse, validate E.164 syntax, normalize, ...
}
public String toString() {
return this.e164;
}
}
But doing that is not ideal because:
1. This doesn't work for primitive types (e.g., a person's age, grade
point average, number of children, etc.)
2. A wrapper class loses all of the functionality that comes with the
wrapped class. For example, I can't say pn1.startsWith(pn2) even though
both values are essentially just Strings.
3. There is performance overhead
[Side note - in Valhalla at least problem #3 goes away if the wrapper class
is a value class?]
So I pose this question to the group: How could the language and compiler
take the null-restricted types idea and fully generalize it so that:
- It's easy to define custom type restrictions on both primitive and
reference types
- Runtime performance is "as good as casting", i.e., validation only
occurs when casting
Here's one idea just for concreteness: Use special type restriction
annotations, e.g.:
@TypeRestriction(baseType = String.class, enforcer =
PhoneNumberEnforcer.class)
public @interface PhoneNumber {
boolean requireNorthAmerican() default false;
}
public class PhoneNumberEnforcer implements TypeRestrictionEnforcer<String>
{
@Override
public void validate(PhoneNumber restriction, String value) {
if (value == null || !value.matches("\+[1-9][2-9][0-9]{6,14}"))
throw new InvalidPhoneNumberException("not in E.164 format");
if (restriction.requireNorthAmerican() && !value.charAt(0) != '1')
throw new InvalidPhoneNumberException("North American number
required");
}
}
// Some random API...
public void dial(@PhoneNumber String number) { ... }
// Some random code...
String input = getUserInput();
try {
dialer.dial((@PhoneNumber String)input);
} catch (InvalidPhoneNumberException e) {
System.err.println("Invalid phone number: \"" + input + "\"");
}
String input2 = getUserInput();
dialer.dial(input2); // "warning: implicit cast to @PhoneNumber
String" ?
The compiler would insert bytecode or method call-outs at the appropriate
points to guarantee that any variable with type @PhoneNumber String would
always contain a value that has successfully survived
PhoneNumberEnforcer.validate(). In other words, it would provide the same
level of guarantee as String! does, but it would be checking my custom type
constraint instead.
The Checker Framework does something like the above, but it has limitations
due to being a 3rd party tool, it's trying to solve a more general problem,
and personally I haven't seen it in widespread use (is it just me?)
Thoughts?
-Archie
--
Archie L. Cobbs
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-dev/attachments/20251010/901ed1ac/attachment-0001.htm>
More information about the amber-dev
mailing list