RFR: 8315066: Add unsigned bounds and known bits to TypeInt/Long [v2]

Fri Sep 22 03:10:12 UTC 2023

On Thu, 21 Sep 2023 10:44:28 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> This binary operation, unfortunately, is ad-hoc and may not be trivial for more complex sets.

To be clear, there exists a fairly trivial representation for this particular problem, that to map a value set

    {t | t s>= lo && t s<= hi && t u>= ulo && t <= uhi && (t & zeros) == 0 && (~t & ones) == 0} -> (lo, hi, ulo, uhi, zeros, ones)

and the corresponding dual set

    {t | t s< lo || t s> hi || t u< ulo || t u> uhi || (t & zeros) != 0 || (~t & ones) != 0} -> (hi, lo, uhi, ulo, ~zeros, ~ones)

However, this is pretty backward as I come up with the maps from the meet operation

    (smin(lo1, lo2), smax(hi1, hi2), umin(ulo1, ulo2), umax(uhi1, uhi2), zeros1 & zeros2, ones1 & ones2)

and it simply moves the complexity and burden of correctness from `xmeet` where branching on the `_dual` field is pretty straightforward to `xdual` where this representation is a fair leap of faith. So IMO it is clearer to have a clear distinction between a value set and a dual set.

Another problem is that currently the same `TypeInteger` can be the representation of 2 different things, an empty value set or a valid dual set, such as:

https://github.com/openjdk/jdk/blob/775e22a8a68b3bcedabc673b1d612dee8028d5d0/src/hotspot/share/opto/ifnode.cpp#L1012

Although they may have the same representation, the dual set corresponding to the point `(x, y)` is not the same as the empty value set `(x, y)`, similar to how a point `(1, 2)` is different from the complex number `1 + 2i`, they live in different spaces. This poses potential problems such as the non-uniqueness of the empty set, dual set incorrectly reports itself as empty, etc. This currently does not present any problem because the usage of dual sets is limited to the calculation of intersection operations only. But having non-canonical empty sets floating around that we can extract their bounds in the same way as a valid set may introduce problems. As a result, I think verifying the validity of a `TypeInteger` when it is created is a desirable approach.

Ideally, we can have different types for an integer value set and its corresponding dual set. This is because a value set and a dual set are fundamentally different things and trying to cramp them into one representation space is not mathematically sound. The tradeoff however is that a fairly significant changes to the `Type` hierarchy is needed which is unnecessary for the only benefit of usages in `join` operations. As a result, I added `assert(!_dual, "")` to all methods apart from `xmeet` and `xdual`. The bounds, however, can still be freely accessed since they are not methods, so if a `TypeInteger` representing a dual set somehow appears in an unexpected location then accessing its bounds may lead to surprising behaviours.

This is my understanding of the `TypeInteger` hierarchy, please correct me if there is any misunderstanding, thanks a lot.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15440#issuecomment-1730740721