<!DOCTYPE html><html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

</head>

<body><div style="font-family: sans-serif;"><div class="markdown" style="white-space: normal;">

<p dir="auto">Dealing with numeric ranges is error-prone, so I welcome these.</p>

<p dir="auto">Another operation on ranges is membership.  This is easy to derive<br>

from the clamp methods, I think:</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">   checkRange(x,min,max) := clamp(x,min,max) == x

</code></pre>

<p dir="auto">That idiom works even for NaNs, which are never in any range.<br>

I wonder about having a separate method for this. The idiom does<br>

fail for empty ranges, so it’s almost worth making a method that<br>

returns false with empty ranges instead of throwing.  Probably<br>

not, though.</p>

<p dir="auto">But I think there is no harm and real benefit from widening<br>

the value parameters, so value is always either long or double.<br>

That will save casts (which also cause bugs) for some use<br>

cases, such as doing 32x32-to-64-bit arithmetic followed<br>

by overflow checks and/or saturation.  That’s a common use<br>

case for clamping.</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">public static int clamp(long value, int min, int max)

public static long clamp(long value, long min, long max)

public static double clamp(double value, double min, double max)

public static float clamp(double value, float min, float max)

</code></pre>

<p dir="auto">I wonder if there is a use case for these methods (either clamp<br>

or the range-membership method), with 64-bit overflow detection<br>

for complex expressions.  The idea is that you compute the<br>

expression in two domains, double and long.  The double domain<br>

correctly tracks magnitude (but not precision) while long<br>

tracks 64 bits of precision.  At the end, if the double value<br>

is in the range representable by longs, you take the long<br>

value, with full confidence.</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">   long x = …, y = a*x*x + b*x + c;

   double xf = x, yf = a*xf*xf + b*xf + c;

   double xc = clamp(xf, Long.MIN_VALUE, Long.MAX_VALUE);

   boolean saturated = false;

   if (xc != xf) {   // error processing

      saturated = true;  // or throw an exception

      x = xc;  // saturate towards the required magnitude

   }

</code></pre>

<p dir="auto">That use case doesn’t require a membership operation per se,<br>

just clamp.</p>

<p dir="auto">I thought about cross-domain clamps, which preserve 64 bit<br>

precision and double range at the same time in different places:</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">public static long clamp(long value, double min, double max)

public static long clamp(double value, long min, long max)

</code></pre>

<p dir="auto">Those are perhaps overly subtle.  I think it would be OK for<br>

nearly all use cases to cast the min/max values to the appropriate<br>

type (same as value) and then call one of the clamp functions<br>

originally proposed.  Note that (long)(double)Long.MAX_VALUE<br>

is the same as Long.MAX_VALUE, and the same is true for MIN_VALUE.<br>

So these critical parameters would be processed as the same<br>

limits by both the regular and cross-typed overloading.</p>

<p dir="auto">Now for another topic.</p>

<p dir="auto">There is another common range operation which is very often the<br>

source of bugs, and I’d like us to consider it for this group.</p>

<p dir="auto">That is the midpoint operation, naively coded as (min+max)/2,<br>

and more subtly as min+(max-min)/2 and even more subtly what<br>

our friend Mr. Stackoverflow recommends, (min+max)>>>1.<br>

All three of these idioms hide bugs, for some values of<br>

max or min.</p>

<p dir="auto">The fully correct idiom (for 2s complement numbers) is:</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">  midpoint(min, max) := min + ((max - min) >>> 1)

</code></pre>

<p dir="auto">The subexpression max-min never overflows, when it<br>

is regarded as an unsigned number, and the “>>>1”<br>

immediately converts that unsigned number into the<br>

correctly signed positive number.  That’s the only<br>

way to avoid overflow for all inputs.  The third<br>

idiom above (the “smartest” of the three) will fail,<br>

for example, if you feed it negative values.<br>

For some uses that is OK, but for others it isn’t.</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">public static int midpoint(int min, int max)

public static long midpoint(long min, long max)

public static double midpoint(double min, double max)

public static float midpoint(float min, float max)

</code></pre>

<p dir="auto">All kinds of divide and conquer algorithms need a robust<br>

midpoint operation.  And many of those are coded wrongly,<br>

because affine ranges are hard to reason about.</p>

<p dir="auto">The int and long midpoints are more useful than the floating<br>

point ones, because ints and longs are used as indexes into<br>

arrays (including Panama native arrays), where subdivision<br>

and binary search is common (and very often buggy).</p>

<p dir="auto">More generally, this midpoint operation performs an affine<br>

weighted sum within the given numeric range, so one might<br>

think that a floating point “alpha” parameter would be<br>

interesting.  The midpoint operations hardwire alpha=0.5,<br>

and they use clever fast fixpoint division to divide by 2.</p>

<p dir="auto">There are applications for using alpha other than 0.5,<br>

such as a 10x subdivision of a range using alpha=0.1, 0.2,<br>

and so on.  I don’t recommend an “affine midpoint” method,<br>

because I think the clamp methods combined with floating<br>

point math can get the right answers.  For example, given<br>

a 64-bit range, you can use double arithmetic to compute<br>

affine inner points and then clamp them, like this:</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">final double ALPHA = 0.1;  //float would be OK too

final long min = …, max = …;

final double incr = ((double)max - min) * ALPHA;

for (int i = 1; i < 10; i++)

  addPoint(clamp((long)(min + i*incr), min, max));

</code></pre>

<p dir="auto">Doing the grid measurements in double (instead of long) is good<br>

enough for most purposes, and doesn’t cause anomalies even if<br>

the difference between min and max is large.  Using a long<br>

for the difference would lead to bugs when max-min cannot be<br>

represented as a long.</p>

<p dir="auto">Anyway, the above examples tell me that we don’t need a<br>

general affine range splitter if we can use floating point<br>

to compute the splits and use clamp to prevent bugs at the<br>

range boundaries.  But we do need the optimized midpoint<br>

operations, because people always insist on them taking<br>

a couple of cycles, and then they insist on coding them<br>

dangerously.</p>

<p dir="auto">Besides clamp and midpoint is there another operation on<br>

affine ranges that is commonly coded, and with bugs?<br>

I don’t think so.  Range algebra such as union, intersection,<br>

difference, convex hull: those come to mind, but I don’t<br>

think they are as bug-prone (and as frequent) as clamp<br>

and midpoint.  Range translation (moving a range to another<br>

base point) is subject to overflow, but I think that can<br>

be handled by using higher precision and/or overflow checks,<br>

and/or clamping.</p>

<p dir="auto">— John</p>


</div></div></body>


</html>