Range API

Olexandr Rotan rotanolexandr842 at gmail.com
Sun Sep 22 19:01:47 UTC 2024


Hello everyone! I am writing here today to invite everyone to participate
in the discussion regarding the Range API proposal I have made for the JDK.
Here is the pull request link: https://github.com/openjdk/jdk/pull/21122,
and the PR text:

This pull request describes the methods of the Range<T> interface. The
Range<T> interface represents a bounded or unbounded range. (From now on,
"range", "span" and "interval" are used interchangeably, but the docs only
use "range".)
Goals:

   - *Main goal. Standardization of the Range/Interval API:* The primary
   objective of this effort is to provide a standardized interface for working
   with ranges or spans of time (or any ordered types). Many existing
   libraries offer their own custom implementations of ranges, and they differ
   in significant ways, making them harder to use and combine across different
   codebases. Standardization will ensure consistency, interoperability, and a
   more predictable interface across various contexts.
   - *Versatile range operations:* Provide a comprehensive API for
   manipulating and querying ranges, especially those representing time
   periods or numerical intervals. The API simplifies common tasks like
   checking containment, overlaps, or adjacency between ranges.
   - *Support for unbounded ranges:* Unlike many existing libraries, which
   assume intervals are always bounded, this API aims to fully support
   unbounded intervals. Users will be able to define ranges with open starts
   or ends, making it suitable for temporal data that spans indefinitely in
   one direction, such as future projections or historical data with unknown
   starting points.
   - *Performance efficiency:* The API aims to provide a
   performance-optimized implementation that takes advantage of all possible
   simplifications and short-circuits.
   - *Consistency with existing libraries:* To aid adoption, the API should
   be familiar to developers who have used popular libraries like NodaTime,
   Joda-Time, ThreeTen-Extra, and Boost Date_Time, but with enhancements for
   unbounded intervals, negative ranges (?), and optional return types instead
   of null values.

Non-Goals:

   - *Handling complex data structures beyond simple ranges:* This API is not
   intended to manage or represent complex data structures beyond ranges. For
   example, ranges that involve intricate internal states, like non-contiguous
   ranges (?) or ranges with multiple gaps, are out of scope.
   - *Overly simplifying range types:* While ease of use is a goal, it is
   not an aim to remove support for advanced cases like unbounded or negative
   ranges, even if this results in slightly more complex implementations. The
   API should not be skewed towards being purely a simple data structure for
   bounded ranges.
   - *Application-specific logic:* The API is meant to be domain-agnostic and
   general-purpose. It is not intended to embed application-specific
   logic, such as calendar-based date manipulations or domain-specific
   business rules for interval comparison.
   - *Replacing existing libraries:* The goal is not to replace established
   libraries like Joda-Time or ThreeTen-Extra, but rather to augment these
   ideas with support for unbounded ranges and additional arithmetic
   operations. It is, however, a goal to provide an interface that
   existing libraries could easily retrofit into.

Motivation

The primary motivation behind standardizing the Range API is the *lack of
an established, universal interface* for handling ranges or spans across
various domains. Developers are often forced to work with different,
incompatible range implementations across libraries or to re-implement
common functionality themselves. This leads to redundant code, increased
dependencies, and greater chances for errors.

In many software systems—whether in scheduling, auditing, access control,
or financial services—ranges are used to represent periods of time,
numerical intervals, or validity spans. Without a standardized API,
developers must contend with diverse implementations that often differ in
naming conventions, behavior, and supported features. These variations
create unnecessary complexity, as developers must:

   1. *Introduce additional dependencies*: Many libraries provide similar
   functionality for ranges, but since they are not interchangeable,
   developers must often add extra dependencies to cover edge cases or
   specific use cases that are not available in a single library. This bloats
   the codebase and creates maintenance overhead.
   2. *Re-implement common logic*: In cases where no single library meets the
   required needs, developers are forced to write their own range-handling
   logic. This reinvention of basic operations such as intersection, union, or
   containment leads to redundancy, increased likelihood of bugs, and
   inconsistency in how ranges are handled across different parts of the code.
   3. *Fragmentation across domains*: Different libraries often define their
   own range concepts (e.g., for date-times, numbers, or general comparisons),
   which are rarely compatible with one another. This lack of compatibility
   makes integration between systems difficult, requiring custom adapters or
   conversions.

By defining a standard Range API, the goal is to:

   - *Reduce the dependency footprint:* A common, well-designed API for
   ranges would eliminate the need to import multiple libraries just to handle
   different types of ranges, reducing dependencies in projects and enhancing
   maintainability.
   - *Simplify code and increase reusability:* With a standardized
   interface, developers can write range-related code once and reuse it across
   projects and libraries, confident that the same semantics and operations
   will apply consistently.
   - *Minimize developer errors:* By providing a predictable and
   well-documented interface, the likelihood of misunderstandings or incorrect
   use of range operations will decrease. Developers can trust that operations
   like intersections, unions, and comparisons will behave consistently,
   regardless of the context.

In essence, the lack of standardization in range operations creates
unnecessary complexity, fragmentation, and redundant effort. A standardized
Range API would provide clarity, reduce the need for additional
dependencies, and enable more efficient, reusable, and error-free code
across different projects and domains.
Key API points

Support of unbounded intervals

The API supports both one- and two-sided unbounded ranges. The provided
sample (draft) implementation for ChronoDateTime has 4 separate
implementations, one for each type of range.
Alternatives

   - Many libraries, such as Luxon, Boost (C++), NodaTime and arguably most
   others, do not explicitly support unbounded intervals. This reduces
   implementation complexity, but rules out many possible optimizations for
   edge cases. The alternative they propose is to use Instant.MIN and
   Instant.MAX, or similar sentinel values, to create unbounded-like intervals.
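
To make the trade-off concrete, here is a minimal sketch contrasting the sentinel approach with explicitly unbounded types. All type names (InstantRange, SentinelRange, FullyUnbounded, UnboundedEnd) are hypothetical and are not taken from the PR; inclusive bounds are assumed purely for illustration.

```java
import java.time.Instant;

// Hypothetical sketch: none of these names come from the PR.
interface InstantRange {
    boolean contains(Instant t);
}

// Sentinel-based "unbounded-like" range: still performs two comparisons per query.
record SentinelRange(Instant start, Instant end) implements InstantRange {
    static SentinelRange all() {
        return new SentinelRange(Instant.MIN, Instant.MAX);
    }

    public boolean contains(Instant t) {
        return !t.isBefore(start) && !t.isAfter(end);
    }
}

// Explicitly unbounded on both sides: the answer is a constant.
enum FullyUnbounded implements InstantRange {
    INSTANCE;

    public boolean contains(Instant t) {
        return true;
    }
}

// Explicitly unbounded at the end only: a single comparison suffices.
record UnboundedEnd(Instant start) implements InstantRange {
    public boolean contains(Instant t) {
        return !t.isBefore(start);
    }
}
```

The explicit variants skip comparisons that the sentinel variant always performs, which is the kind of short-circuit the proposal is after, at the cost of more types to maintain.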

Support for negative intervals

The API supports both positive and negative ranges. This is
questionable, and discussion is encouraged.
Advantages

   - Allows more flexible usage of the API, which would be helpful for use
   cases like diagram visualization.

Disadvantages

   - Dramatically increases the amount of boilerplate code inside the
   implementations.
   - Makes the behaviour of potential methods like boolean endsBefore(T t)
   unintuitive. Does this mean that end() is before the provided parameter,
   or that the later of the two bounds is (i.e. start() for a negative range
   and end() for a positive one)?
   - Limited usability scope. Most use cases would not benefit from the
   possibility of creating negative ranges, but would still suffer the
   performance cost.

In general, either there should be support for negative ranges, or ranges
may be end-exclusive, but not both at the same time, as having both
together dramatically increases complexity.
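
The endsBefore ambiguity mentioned in the disadvantages can be shown with a small sketch. IntSpan and both method names are hypothetical, chosen only to label the two possible readings; neither is proposed in the PR.

```java
// Hypothetical sketch: IntSpan and its method names are illustrative only.
record IntSpan(int start, int end) {
    boolean isNegative() {
        return start > end;
    }

    // Reading 1: compare the declared end() only.
    boolean endBoundBefore(int t) {
        return end < t;
    }

    // Reading 2: compare the later of the two bounds.
    boolean laterBoundBefore(int t) {
        return Math.max(start, end) < t;
    }
}
```

For a positive span the two readings always agree. For the negative span IntSpan(10, 2) and the point 5 they disagree: the declared end (2) is before 5, but the later bound (10) is not.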
Range is not Serializable

Currently, ranges are not Serializable. This is due to difficulties with
non-serializable interfaces, like ChronoDateTime in the sample
implementation.
Alternatives

   - Restrict the range type variable to types that implement Serializable.
   I see this option as undesirable because of how much it narrows the use
   of the interface.

Current interface methods list is minimal

For now, the proposed API contains a minimal set of methods that are used in
range arithmetic. The list of methods is expected to change as the discussion
moves on.
Generic Range class vs Range interface + specific implementations

Currently, the approach is to define an interface and a list of
implementations.
Advantages

   - Ability to introduce methods specialized for a type of range. For
   example, Timespan could have a Duration toDuration() method, and a
   potential IntegerRange could have something like LongRange toLongRange(),
   due to limitations of comparability between classes. This would be
   impossible with a structural Range class without declaring additional
   static utility methods.
   - Enhanced validation of annotation targets, as classes, unlike generics,
   aren't erased.

Disadvantages

   - Increased number of classes to maintain.
   - Additional considerations would be required before extending the Range
   interface if the hierarchy is non-sealed, to ensure backward compatibility.

API Description

NB: Since date ranges are expected to be one of the most popular use cases
for ranges, if not the most popular, date-time libraries were the main
reference for the interface design.
------------------------------
Section: Bounds

General notes

   - In *Boost Date_Time* (time_period.begin()), the start and end are
   always defined, meaning there is no concept of unbounded intervals.
   Similarly, some libraries like *Chrono* in Rust assume bounded intervals
   by default. In fact, only a few libraries expose truly unbounded ranges.
   However, while these corner cases increase implementation complexity,
   they also greatly improve performance by cutting the number of
   operations in each method at least in half (for an interval unbounded on
   both sides, almost all operations return a constant value).

------------------------------
T start()

*Description:*
Returns the start of the range. If the range is unbounded at the start,
this method throws an UnsupportedOperationException. This can be
preemptively checked using isBoundedAtStart().

*Alternatives:*

   - The method could return Optional<T> instead of throwing an exception. I
   see these two approaches as roughly equal in terms of pros and cons, so
   suggestions are much appreciated.

------------------------------
T end()

*Description*:
Returns the end of the range. If the range is unbounded at the end, this
method throws an UnsupportedOperationException. Use isBoundedAtEnd() to
check if the range is bounded.

*Alternatives:*

   - Similarly to start(), the method could return Optional<T> instead of
   throwing an exception. I see these two approaches as roughly equal in terms
   of pros and cons, so suggestions are much appreciated.

------------------------------
boolean isBoundedAtStart()

*Description*:
Returns true if the range is bounded at the start. If unbounded, it returns
false, meaning calling start() will throw an UnsupportedOperationException.

*Alternatives*:

   - *Joda-Time*, *NodaTime*, *Luxon*, and *Moment.js* do not explicitly
   support unbounded intervals by default but can use null or special
   values to represent unbounded starts.
   - *Boost Date_Time* and *Chrono* don’t support unbounded ranges
   directly, so this method is unnecessary.

------------------------------
boolean isBoundedAtEnd()

*Description*:
Returns true if the range is bounded at the end. A false value means the
range is unbounded at the end, and calling end() will throw an
UnsupportedOperationException.

*Alternatives*:

   - Similar to isBoundedAtStart(), most libraries don’t have built-in
   unbounded intervals, but the concept can be simulated using null,
   minimal/maximal possible value etc. Pros and cons were described in API
   notes.
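
The contract between the four bound methods can be sketched as follows. This is a minimal illustration, not the PR's implementation: the class name BoundsSketch, the use of null to encode a missing bound, and the startIfBounded()/endIfBounded() names for the Optional alternative are all assumptions.

```java
import java.util.Optional;

// Hypothetical sketch of the bounds contract; not the PR's implementation.
final class BoundsSketch<T extends Comparable<? super T>> {
    private final T start; // null encodes "unbounded at start" (sketch-only choice)
    private final T end;   // null encodes "unbounded at end" (sketch-only choice)

    BoundsSketch(T start, T end) {
        this.start = start;
        this.end = end;
    }

    boolean isBoundedAtStart() { return start != null; }

    boolean isBoundedAtEnd() { return end != null; }

    // Proposed behaviour: throw when the bound does not exist.
    T start() {
        if (start == null) throw new UnsupportedOperationException("unbounded at start");
        return start;
    }

    T end() {
        if (end == null) throw new UnsupportedOperationException("unbounded at end");
        return end;
    }

    // The Optional-returning alternative, under hypothetical names.
    Optional<T> startIfBounded() { return Optional.ofNullable(start); }

    Optional<T> endIfBounded() { return Optional.ofNullable(end); }
}
```

With the throwing variant, callers are expected to guard with isBoundedAtStart() or isBoundedAtEnd() first; with the Optional variant, the guard is folded into the return type.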

------------------------------
Section: boolean operations

boolean contains(T instant)

*Description*:
Returns true if the given instant falls within the start and end bounds of
the range, otherwise returns false.

*Similar Methods in other libraries*:

   - *NodaTime (Interval.Contains)*
   - *Joda-Time (Interval.contains)*
   - *Luxon (Interval.contains)*
   - *Boost Date_Time (time_period.contains())*
   - And many others...

*Differences with existing APIs*:

   - *Moment.js* doesn’t provide a direct contains method but the
   moment-range plugin adds this functionality with range.contains().

*Note*: this method is present in most interval implementations. Therefore,
I consider it basic and not removable from the API.
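
A minimal sketch of how contains could treat unbounded sides. Two things here are assumptions rather than part of the proposal: null encodes a missing bound, and both bounds are treated as inclusive. ContainsSketch is a hypothetical name.

```java
// Hypothetical sketch; null = unbounded side, inclusive bounds assumed.
record ContainsSketch<T extends Comparable<? super T>>(T start, T end) {
    boolean contains(T point) {
        // An unbounded side imposes no constraint, so its check is skipped.
        boolean notBeforeStart = start == null || start.compareTo(point) <= 0;
        boolean notAfterEnd = end == null || point.compareTo(end) <= 0;
        return notBeforeStart && notAfterEnd;
    }
}
```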
------------------------------
boolean overlaps(Range<? extends T> other)

*Description*:
Checks if the current range overlaps with another range. Returns true if
the two ranges overlap, otherwise returns false.

*Similar Methods in other libraries*:

   - *NodaTime (Interval.Overlaps)*
   - *Joda-Time (Interval.overlaps)*
   - *Luxon (Interval.overlaps)*
   - *Boost Date_Time (time_period.intersects())*
   - And many others...

*Differences with existing APIs*:

   - *Moment.js*: The moment-range plugin provides a similar overlaps() method
   to check overlap.
   - *Chrono* relies on custom interval intersection logic.

*Note*: this method is present in most interval implementations. Therefore,
I consider it basic and not removable from the API.
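
The overlap check reduces to "each range starts no later than the other one ends". The sketch below assumes positive ranges, inclusive bounds, and null as the encoding for an unbounded side; OverlapSketch and its helper are hypothetical names.

```java
// Hypothetical sketch; null = unbounded side, inclusive bounds, positive ranges.
record OverlapSketch<T extends Comparable<? super T>>(T start, T end) {
    boolean overlaps(OverlapSketch<? extends T> other) {
        return noLaterThan(start, other.end()) && noLaterThan(other.start(), end);
    }

    // True when start s does not come after end e; a missing bound never blocks.
    private static <U extends Comparable<? super U>> boolean noLaterThan(U s, U e) {
        return s == null || e == null || s.compareTo(e) <= 0;
    }
}
```

Under an end-exclusive reading the comparison would be strict (< 0) instead, so two ranges sharing only an endpoint would not overlap.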
------------------------------
General notes on next two methods

Most of the libraries offer an API like isBefore(T point) or do not provide
methods like this at all. Since the current implementation throws an exception
if the interval is not bounded, a trivial check for isBefore could become 4-6
lines long. The question basically comes down to whether the Range class
should be more data-structure-like or object-like. I would argue that at
least isBefore(T moment) is required, especially since ranges can currently
be negative. The existence of boolean isBefore(Range<? extends T> other) and
the similar isAfter is up for discussion.
------------------------------
boolean isBefore(Range<? extends T> other)

*Description*:
Returns true if the current range is strictly before another range (i.e.,
ends before the other range starts).

*Differences with other libraries*:

   - *NodaTime*: You’d manually compare End of one interval with the Start of
   another.
   - *Joda-Time*: Manual comparison with Interval.getEnd() and
   Interval.getStart().
   - *Boost Date_Time* and *Chrono* would use custom logic to compare
   time_period or ranges of time, since they don’t have a direct equivalent
   of isBefore().

*Alternatives*

   - Most of the libraries offer an API like isBefore(T point) or do not
   provide methods like this at all. Since the current implementation throws
   an exception if the interval is not bounded, a trivial check for isBefore
   could become 4-6 lines long. The question basically comes down to whether
   the Range class should be more data-structure-like or object-like. I would
   argue that at least isBefore(T moment) is required, especially since
   ranges can currently be negative.
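
To illustrate the "4-6 lines" point, here is a sketch of what a caller has to write against a throwing end(), next to a built-in isBefore(T). EndBound and CallerSide are hypothetical names, only the end bound is modelled, and null is again a sketch-only encoding of "unbounded".

```java
// Hypothetical sketch; only the end bound matters for this check.
final class EndBound<T extends Comparable<? super T>> {
    private final T end; // null encodes "unbounded at end" (sketch-only choice)

    EndBound(T end) {
        this.end = end;
    }

    boolean isBoundedAtEnd() { return end != null; }

    T end() {
        if (end == null) throw new UnsupportedOperationException("unbounded at end");
        return end;
    }

    // Built-in variant: the range knows its own shape, so one expression suffices.
    boolean isBefore(T point) {
        return end != null && end.compareTo(point) < 0;
    }
}

final class CallerSide {
    // Without isBefore(T), every caller has to repeat the boundedness guard.
    static <T extends Comparable<? super T>> boolean endsBefore(EndBound<T> r, T point) {
        if (!r.isBoundedAtEnd()) {
            return false; // a range with no end never lies before a point
        }
        return r.end().compareTo(point) < 0;
    }
}
```

Supporting negative ranges would add a further branch to the caller-side version, since the later bound might then be start() rather than end().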

------------------------------
boolean isAfter(Range<? extends T> other)

*Description*:
Returns true if the current range is strictly after another range (i.e.,
starts after the other range ends).

   - *Similar Methods*:
      - Similar to isBefore(), manual comparisons are used in *NodaTime*,
      *Joda-Time*, *Luxon* and others.

------------------------------
boolean isBefore(T point)

*Description:*
Determines if the span ends before the given point. This is useful when you
need to check whether a time span occurs entirely before a specific point.

*Alternatives*:

   - The method could be removed from the API altogether, if Range is meant
   to be skewed towards being a data structure.

------------------------------
boolean isAfter(T point)

*Description:*
Determines if the span starts after the given point. This is useful when
you need to check whether a time span occurs entirely after a specific
point.

*Alternatives*:

   - Similarly to boolean isBefore(T point), the method could be removed from
   the API altogether, if Range is meant to be skewed towards being a data
   structure.

------------------------------
boolean isNegative()

*Description*:
Returns true if the start of the range is after the end, indicating a
"negative" range.

*Alternatives:*

   - If considered too niche, negative timespans could be removed from the
   model.

*Note:* this one is the most questionable for me. Do we really need negative
ranges? They are almost exclusively required in numeric ranges and diagrams,
while introducing a huge complexity overhead for the majority that doesn't
need this feature. Negativity might be confusing for users. I would love to
hear thoughts on this matter.
------------------------------
Section: Range arithmetic

Optional<Range<T>> intersection(Range<? extends T> other)

*Description*:
Returns the intersection of the current range with another range. If the
ranges do not overlap, the result is an empty Optional. If they overlap,
the intersection is returned.

*Similar Methods*:

   - *NodaTime (Interval.Intersection())*
   - *Moment.js (via moment-range, range.intersect())*
   - *Joda-Time (Interval.overlap())*
   - And many others...

*Differences with existing APIs*:

   - *Boost Date_Time* returns an empty time_period if no overlap exists,
   instead of an Optional. Some libraries return null (e.g., *NodaTime*).
   - Other libraries return null if the intervals aren't overlapping. This is
   undesirable, so an Optional is returned instead.

*Note*: this method is present in most interval implementations. Therefore,
I consider it basic and not removable from the API.
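
A minimal sketch of an Optional-returning intersection: take the later of the two starts and the earlier of the two ends, and report "no intersection" when they cross. The name IntersectSketch, the null encoding for unbounded sides, and inclusive bounds are assumptions made for the example.

```java
import java.util.Optional;

// Hypothetical sketch; null = unbounded side, inclusive bounds, positive ranges.
record IntersectSketch<T extends Comparable<? super T>>(T start, T end) {
    Optional<IntersectSketch<T>> intersection(IntersectSketch<? extends T> other) {
        T lo = later(start, other.start());
        T hi = earlier(end, other.end());
        if (lo != null && hi != null && lo.compareTo(hi) > 0) {
            return Optional.empty(); // the candidate bounds cross: nothing in common
        }
        return Optional.of(new IntersectSketch<T>(lo, hi));
    }

    // If one side is unbounded (null), the other side's bound wins.
    private static <U extends Comparable<? super U>> U later(U a, U b) {
        if (a == null) return b;
        if (b == null) return a;
        return a.compareTo(b) >= 0 ? a : b;
    }

    private static <U extends Comparable<? super U>> U earlier(U a, U b) {
        if (a == null) return b;
        if (b == null) return a;
        return a.compareTo(b) <= 0 ? a : b;
    }
}
```

The intersection of two half-unbounded ranges can be fully bounded: everything up to 5 intersected with everything from 3 yields 3..5.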
------------------------------
Range<T>[] union(Range<? extends T> other)

*Description*:
Returns the union of two ranges. If the ranges overlap, the result is a
single combined range. If they do not overlap, the result is an array of
two separate ranges.

*Differences with existing APIs*:

   - *NodaTime* and *Joda-Time* support similar logic using custom union
   handling.
   - *Boost Date_Time* has no built-in union() function but you can write
   custom logic to combine or separate intervals.

*Note*: The behaviour of this method is subject to change. Currently, it
returns an array for maximal performance, but it can (and most likely should)
be wrapped in some monadic class. As an alternative, if non-continuous ranges
(ones with gaps inside them) are supported, this method should return that
kind of range.
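
A minimal sketch of the array-returning union, restricted to bounded, positive ranges with inclusive bounds (all assumptions; UnionSketch is a hypothetical name). One practical wrinkle of the array signature is visible here: Java does not allow creating an array of a parameterized type directly, so the sketch needs an unchecked cast.

```java
// Hypothetical sketch; bounded, positive ranges with inclusive bounds only.
record UnionSketch<T extends Comparable<? super T>>(T start, T end) {
    @SuppressWarnings("unchecked") // generic array creation is not allowed in Java
    UnionSketch<T>[] union(UnionSketch<? extends T> other) {
        boolean disjoint = end.compareTo(other.start()) < 0
                || other.end().compareTo(start) < 0;
        if (disjoint) {
            // Two separate ranges, returned in call order (this, then other).
            return (UnionSketch<T>[]) new UnionSketch<?>[] { this, other };
        }
        T lo = start.compareTo(other.start()) <= 0 ? start : other.start();
        T hi = end.compareTo(other.end()) >= 0 ? end : other.end();
        return (UnionSketch<T>[]) new UnionSketch<?>[] { new UnionSketch<T>(lo, hi) };
    }
}
```

The unchecked cast is one more argument for wrapping the result in a dedicated type or a List, as the note above already suggests.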
------------------------------
Optional<Range<T>> gap(Range<? extends T> other)

*Description*:
Returns the gap between two ranges, if they do not overlap. If they
overlap, the result is an empty Optional.

*Differences with existing APIs*:

   - *NodaTime* and *Joda-Time* support custom logic to calculate the gap
   using isBefore(), isAfter(), and manual calculations of the gap.
   - Other libraries return null if the intervals are overlapping. This is
   undesirable, so an Optional is returned instead.
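
A minimal sketch of gap for bounded, positive ranges. GapSketch is a hypothetical name. The sketch reports the gap as running from the earlier range's end to the later range's start and leaves the inclusivity of those endpoints open; ranges that merely touch are treated as having no gap, which is also an assumption.

```java
import java.util.Optional;

// Hypothetical sketch; bounded, positive ranges only.
record GapSketch<T extends Comparable<? super T>>(T start, T end) {
    Optional<GapSketch<T>> gap(GapSketch<? extends T> other) {
        if (end.compareTo(other.start()) < 0) {
            // this range lies entirely before the other one
            return Optional.of(new GapSketch<T>(end, other.start()));
        }
        if (other.end().compareTo(start) < 0) {
            // the other range lies entirely before this one
            return Optional.of(new GapSketch<T>(other.end(), start));
        }
        return Optional.empty(); // overlapping or touching: no gap
    }
}
```

The result is symmetric: the gap between 1..2 and 5..9 is 2..5 regardless of which range the method is called on.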

------------------------------
Section: potential methods

boolean isEmpty()

*Description:*
Determines if the range is "empty".

An empty range is its own, separate type of range (basically the opposite of
an unbounded range). There are many questions regarding this type of range. Is
it bounded at start or end? If so, what should start() or end() return?
Having them throw an exception would violate the current contract between the
isBoundedAtX() and x() methods.

*Advantages*

   - Returning an empty range instead of an Optional might be more
   user-friendly.

*Disadvantages*

   - One more concept in the API model.
   - A corner case in the isBoundedAtX() and x() contract.

Potential Methods for API Enhancement

In this section, we explore methods that could be added to the API,
comparing them with similar functionality in popular time-related
libraries. These methods enhance the versatility and clarity of the
Range<T> implementation, especially in the context of temporal, numeric,
and other domain-specific ranges. Some of these methods are inspired by
well-established libraries, while others are novel suggestions.
------------------------------
boolean encloses(Range<? extends T> other)

*Description*:
Checks whether the current range completely encloses another range, i.e.,
the other range starts after or at the start of the current range and ends
before or at the end of the current range.

*Similar Methods in Other Libraries*:

   - *NodaTime (Interval.ContainedBy)*
   - *Joda-Time (Interval.contains)*
   - *Luxon (Interval.contains)*
   - *Boost Date_Time (time_period.contains())*

*Differences with Existing APIs*:

   - Some libraries handle encloses() and contains() in the same method.
   For clarity, this API can separate the two, where contains() is used
   for checking individual points and encloses() is for range-level
   comparison.

------------------------------
boolean abuts(Range<? extends T> other)

*Description*:
Returns true if the current range abuts (i.e., touches but does not
overlap) with another range. This method is useful when determining whether
two ranges are adjacent but do not overlap.

*Similar Methods in Other Libraries*:

   - *NodaTime (Interval.Abuts)*

*Alternatives*:

   - Instead of this method, users could manually compare the end of one
   range and the start of another, but including abuts() in the API
   simplifies the logic and reduces error-prone comparisons.
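
A minimal sketch of abuts next to overlaps. "Touches but does not overlap" only makes sense if a shared endpoint does not count as overlap, so this sketch assumes end-exclusive (half-open) bounded, positive ranges; that assumption and the name AbutSketch are not taken from the proposal.

```java
// Hypothetical sketch; bounded, positive, end-exclusive [start, end) ranges.
record AbutSketch<T extends Comparable<? super T>>(T start, T end) {
    // Adjacent: one range ends exactly where the other one starts.
    boolean abuts(AbutSketch<? extends T> other) {
        return end.compareTo(other.start()) == 0
                || other.end().compareTo(start) == 0;
    }

    // Strict comparisons: sharing only an endpoint is not an overlap here.
    boolean overlaps(AbutSketch<? extends T> other) {
        return start.compareTo(other.end()) < 0
                && other.start().compareTo(end) < 0;
    }
}
```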

------------------------------
Range<T> extendTo(T point)

*Description*:
Returns a new range that extends the current range to include the given
point. If the point is already within the range, it returns the current
range. Otherwise, it extends either the start or end, depending on the
point's position relative to the range.

*Similar Methods in Other Libraries*:

   - *NodaTime* and *Joda-Time* do not have explicit methods for this, but
   users can manipulate intervals manually.
   - *Moment.js*: The moment-range plugin offers similar logic via
   manual adjustments to the range.

*Advantages*:

   - In contrast to manual adjustment, this method automates the process of
   extending ranges, which can be useful in situations where ranges need to
   be dynamically modified over time (e.g., expanding time intervals in
   streaming data).

*Alternatives*:

   - Users could manually adjust the range using start() and end() "withers",
   but an explicit extendTo() method offers a more intuitive, built-in
   approach.
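
A minimal sketch of extendTo for bounded, positive ranges with inclusive bounds (assumptions for the example; ExtendSketch is a hypothetical name). The range is treated as immutable, so extension produces a new instance and the no-op case returns the same one.

```java
// Hypothetical sketch; bounded, positive ranges with inclusive bounds.
record ExtendSketch<T extends Comparable<? super T>>(T start, T end) {
    ExtendSketch<T> extendTo(T point) {
        if (point.compareTo(start) < 0) {
            return new ExtendSketch<T>(point, end); // pull the start back
        }
        if (point.compareTo(end) > 0) {
            return new ExtendSketch<T>(start, point); // push the end forward
        }
        return this; // the point is already inside the range
    }
}
```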

------------------------------
Range<T> shrinkTo(T point)

*Description*:
Returns a new range that shrinks the current range to exclude the given
point, if possible. If the point is within the range, the range is modified
so that it no longer includes the point. This is useful for splitting
ranges or excluding unwanted time periods or values.

   - *Similar Methods in Other Libraries*:
   No major time libraries provide a direct equivalent to this
   functionality, although similar operations can be manually performed by
   manipulating start and end.

*Alternatives*:

   - Similarly to extendTo, users could manually adjust the range using
   start() and end() "withers", but an explicit shrinkTo() method offers a
   more intuitive, built-in approach.

------------------------------
Range<T>[] difference(Range<? extends T> other)

*Description*:
Returns the difference between the current range and another range (XOR
operations). If the ranges overlap, the result is a new range or two ranges
representing the non-overlapping portions. If the ranges do not overlap,
the result is the current range.

*Advantages*:

   - This method simplifies computing the difference between two ranges,
   reducing the need for manual boundary comparisons.
   - Completes the set of methods required for range arithmetic.

*Disadvantages*:

   - This method is the inverse of union(Range<? extends T> other), so it has
   the same design problems as union.

------------------------------
Range<T> clamp(Range<? extends T> bounds)

*Description*:
Clamps the current range to fit within the specified bounds. If the current
range extends outside of the bounds, it is shortened to fit within the
bounds. If the range already fits within the bounds, it is returned
unchanged.

*Advantages*:

   - This method streamlines the process of adjusting a range to a set of
   bounds, which is especially useful in time-based operations where ranges
   must be constrained within specific periods (e.g., scheduling).

------------------------------
boolean isContiguousWith(Range<? extends T> other)

*Description*:
Determines if the current range is contiguous with another range, meaning
that the two ranges touch or overlap without leaving any gaps. This is
particularly useful when combining ranges or ensuring that a sequence of
ranges forms a continuous block.

*Alternatives*:

   - Users could manually compare the end and start of ranges to check
   contiguity, but this method offers a more explicit and efficient way to
   perform the check.

------------------------------
Optional<Range<T>> asBounded()

*Description*:
Returns the bounded version of the current range, if one exists. If the
range is already bounded, it returns the range unchanged. If the range is
unbounded, the result is an empty Optional. This could be used as a monad
for handling errors when a range is expected to be bounded but an unbounded
one has been received.

*Alternatives:*

   - The API could explicitly expose a BoundedRange marker (or non-marker)
   interface to verify at compile time that a received range is bounded.
   The interface could provide some adapter methods for converting ranges of
   unknown boundedness to bounded ones, and have specific behaviour for error
   cases.

------------------------------
Range<T>[] splitAt(T point)

*Description*:
Splits the current range into two sub-ranges at the specified point. If the
point lies outside the range, it returns an array of length 1 containing the
initial range. If the range contains() the point, then an array of length 2
is returned, with the two ranges split at the given point.

------------------------------
List<Range<T>> splitInto(int n)

*Description*:
Splits the current range into n equal sub-ranges. If the range cannot be
evenly divided, the last range may be slightly larger to accommodate the
remaining span. Throws UnsupportedOperationException if the range is
unbounded on at least one side.

------------------------------
Stream<T> pointsFromStartToEnd(??? step)

*Description*:
Returns a stream of points that are evenly spaced from the start to the end
of the range, using the specified step size. Throws
UnsupportedOperationException if isBoundedAtStart() returns false.

*Note:* while this method could have various use cases, it is not clear how
the step could be provided. One of the options is to pass a Function<T, T>
that is invoked on each value until the value is > end(), instead of a
constant step.
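
The Function<T, T> option can be sketched with Stream.iterate. PointsSketch is a hypothetical name, null encodes an unbounded side, and the end bound is treated as inclusive; returning an infinite stream for a range that is unbounded at the end is one possible choice, not something the proposal specifies.

```java
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical sketch; null = unbounded side, inclusive end, positive ranges.
record PointsSketch<T extends Comparable<? super T>>(T start, T end) {
    Stream<T> pointsFromStartToEnd(Function<T, T> step) {
        if (start == null) {
            throw new UnsupportedOperationException("unbounded at start");
        }
        if (end == null) {
            // Unbounded at the end: an infinite stream the caller must limit.
            return Stream.iterate(start, step::apply);
        }
        // Keep stepping while the current value has not passed end().
        return Stream.iterate(start, p -> p.compareTo(end) <= 0, step::apply);
    }
}
```

For a date range, the step could be something like d -> d.plusDays(7). Note that a step function that does not advance the value would never terminate on a bounded range, which is one of the risks of this design.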


More information about the core-libs-dev mailing list