AW: Range API

Markus Karg markus at headcrashing.eu
Mon Sep 23 18:05:59 UTC 2024


Why limit Range to timespans? I once wrote a generic Range<E>, see https://github.com/headcrashing/treasure-chest/blob/master/RangeClass/src/main/java/eu/headcrashing/java/util/Range.java. I could imagine this being useful.

 

From: core-libs-dev [mailto:core-libs-dev-retn at openjdk.org] On behalf of Olexandr Rotan
Sent: Sunday, 22 September 2024 22:53
To: Remi Forax
Cc: core-libs-dev
Subject: Re: Range API

 

Hello! 
Thanks for your comments, they are valuable to me. I did intend to make Range immutable: as you may have noticed, among the potential methods I listed extendTo and shrinkTo, which are a kind of modified with-ers. But it is indeed better to convert Range to a class and expose some methods to create bounded and unbounded ones.

The range pattern would be great to have, but unless there is some way to get a range object, there will not be a way to do range arithmetic. It would be better, in my opinion, to create a member pattern inside the Range class that can match its type params. This may eliminate the need for new syntax for range literals and make the feature more versatile.

Regarding the array return types, I specifically mentioned that this is the first thing up to change. Your point is completely valid, I will look into this issue as soon as possible.

I don't really understand the point about coupling different parts with an interface (future class I guess). Do you say that it is not desirable to have specialized versions of range such as timespan, and would prefer just one generic Range?

I appreciate all the feedback you give. I will address your comments as soon as possible. I am new to API design, so I hope people won't go too hard on me. I hope that, iteratively, I can work out an API that would be a good match for the JDK quality level.

Best regards

 

On Sun, Sep 22, 2024, 23:17 Remi Forax <forax at univ-mlv.fr> wrote:

Hello,

If we introduce a Range object, it will have to be an immutable class, because otherwise any API that takes an interface Range as a parameter will have to copy it into an immutable subclass first.

 

In terms of design, coupling different parts of the JDK with an interface is not appealing to me, especially given that a Range is a very small object.

 

We (the amber EG) have talked about adding a way to do pattern matching using a Range with a special syntax like 1 .. 3. If we introduce such syntax, the runtime representation will be an immutable class and not an interface (think java.lang.String).

 

Your API is unsafe: you cannot have a method that creates a Range<T>[] in a safe way. The combination of erasure and array covariance is nasty (at least until we have a way to create read-only arrays).
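To make the array point concrete, here is a minimal sketch of how a generic array can be polluted without any compile-time error or ArrayStoreException. Box<T> is a hypothetical stand-in for Range<T>, not the type from the pull request, and records require Java 16 or later.

```java
// Hypothetical stand-in for Range<T>; not the type from the pull request.
record Box<T>(T value) {}

class GenericArrayDemo {

    // Creating a Box<String>[] needs an unchecked cast, because after erasure
    // the runtime component type of the array is just Box.
    @SuppressWarnings("unchecked")
    static Box<String>[] newStringBoxes() {
        return (Box<String>[]) new Box[1];
    }

    // Returns true if reading the polluted array fails with ClassCastException.
    static boolean pollute() {
        Box<String>[] strings = newStringBoxes();
        Object[] objects = strings;      // allowed: arrays are covariant
        objects[0] = new Box<>(42);      // no ArrayStoreException: runtime type is Box[]
        try {
            String s = strings[0].value(); // compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollute());
    }
}
```

The failure shows up far from the faulty store, which is why returning an unmodifiable collection is usually the safer choice for a method like union.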

 

regards,

Rémi

 


  _____  


From: "Olexandr Rotan" <rotanolexandr842 at gmail.com>
To: "core-libs-dev" <core-libs-dev at openjdk.org>
Sent: Sunday, September 22, 2024 9:01:47 PM
Subject: Range API

Hello everyone! I am writing here today to invite everyone to participate in the discussion regarding the Range API proposal I have made for the JDK. Here is the pull request link: https://github.com/openjdk/jdk/pull/21122, and PR text:





This pull request describes the methods of the Range<T> interface. The Range<T> interface represents a bounded or unbounded range. (From now on, range, span and interval are used interchangeably, but docs only use "range")


Goals:


·        Main goal. Standardization of the Range/Interval API: The primary objective of this effort is to provide a standardized interface for working with ranges or spans of time (or any ordered types). Many existing libraries offer their own custom implementations of ranges, and they differ in significant ways, making it harder to use and combine across different codebases. Standardization will ensure consistency, interoperability, and a more predictable interface across various contexts.

·        Versatile range operations: provide a comprehensive API for manipulating and querying ranges, especially those representing time periods or numerical intervals. The API simplifies common tasks like checking containment, overlaps, or adjacency between ranges.

·        Support for unbounded ranges: Unlike many existing libraries, which assume intervals are always bounded, this API aims to fully support unbounded intervals. Users will be able to define ranges with open starts or ends, making it suitable for temporal data that spans indefinitely in one direction, such as future projections or historical data with unknown starting points.

·        Performance efficiency: The API aims to provide a performance-optimized implementation that takes advantage of all possible simplifications and short-circuits.

·        Consistency with existing libraries: To aid adoption, the API should be familiar to developers who have used popular libraries like NodaTime, Joda-Time, ThreeTen-Extra, and Boost Date_Time, but with enhancements for unbounded intervals, negative ranges (?), and optional return types instead of null values.


Non-Goals:


·        Handling complex data structures beyond simple ranges: This API is not intended to manage or represent complex data structures beyond ranges. For example, ranges that involve intricate internal states, like non-contiguous ranges (?) or ranges with multiple gaps, are out of scope.

·        Overly simplifying range types: While ease of use is a goal, it is not an aim to remove support for advanced cases like unbounded or negative ranges, even if this results in slightly more complex implementations. The API should not be skewed towards being purely a simple data structure for bounded ranges.

·        Application-specific logic: The API is meant to be domain-agnostic and general-purpose. It is not intended to embed application-specific logic, such as calendar-based date manipulations or domain-specific business rules for interval comparison.

·        Replacing existing libraries: The goal is not to replace established libraries like Joda-Time or ThreeTen-Extra, but rather to augment these ideas with support for unbounded ranges and additional arithmetic operations. It is, however, a goal to provide an interface that existing libraries could easily be retrofitted to.


Motivation


The primary motivation behind standardizing the Range API is the lack of an established, universal interface for handling ranges or spans across various domains. Developers are often forced to work with different, incompatible range implementations across libraries or to re-implement common functionality themselves. This leads to redundant code, increased dependencies, and greater chances for errors.

In many software systems—whether in scheduling, auditing, access control, or financial services—ranges are used to represent periods of time, numerical intervals, or validity spans. Without a standardized API, developers must contend with diverse implementations that often differ in naming conventions, behavior, and supported features. These variations create unnecessary complexity, as developers must:

1.        Introduce additional dependencies: Many libraries provide similar functionality for ranges, but since they are not interchangeable, developers must often add extra dependencies to cover edge cases or specific use cases that are not available in a single library. This bloats the codebase and creates maintenance overhead.

2.        Re-implement common logic: In cases where no single library meets the required needs, developers are forced to write their own range-handling logic. This reinvention of basic operations such as intersection, union, or containment leads to redundancy, increased likelihood of bugs, and inconsistency in how ranges are handled across different parts of the code.

3.        Fragmentation across domains: Different libraries often define their own range concepts (e.g., for date-times, numbers, or general comparisons), which are rarely compatible with one another. This lack of compatibility makes integration between systems difficult, requiring custom adapters or conversions.

By defining a standard Range API, the goal is to:

*	Reduce the dependency footprint: A common, well-designed API for ranges would eliminate the need to import multiple libraries just to handle different types of ranges, reducing dependencies in projects and enhancing maintainability.
*	Simplify code and increase reusability: With a standardized interface, developers can write range-related code once and reuse it across projects and libraries, confident that the same semantics and operations will apply consistently.
*	Minimize developer errors: By providing a predictable and well-documented interface, the likelihood of misunderstandings or incorrect use of range operations will decrease. Developers can trust that operations like intersections, unions, and comparisons will behave consistently, regardless of the context.

In essence, the lack of standardization in range operations creates unnecessary complexity, fragmentation, and redundant effort. A standardized Range API would provide clarity, reduce the need for additional dependencies, and enable more efficient, reusable, and error-free code across different projects and domains.


Key API points


Support of unbounded intervals


The API supports both one- and two-sided unbounded ranges. The provided sample (draft) implementation for ChronoDateTime has 4 separate implementations, one for each type of range.


Alternatives


*	Many libraries, like Luxon, C++ Boost, NodaTime and many others (arguably most of them), do not explicitly support unbounded intervals. This reduces the complexity of the implementation, but takes away many possible optimizations for edge cases. The alternative they propose is to use Instant.MIN and Instant.MAX or similar to create unbounded-like intervals.


Support for negative intervals


The API supports both positive and negative ranges. This is questionable and discussion is encouraged.


Advantages


*	Allows more flexible usage of the API, which would be helpful for use cases like diagram visualization.


Disadvantages


*	Dramatically increases the amount of boilerplate code inside the implementations.
*	Makes the behaviour of potential methods like boolean endsBefore(T t) unintuitive. Does it mean that end() is before the provided parameter, or that the later of the two bounds is (i.e. start() for a negative range and end() for a positive one)?
*	Limited usability scope. Most use cases would not benefit from the possibility of creating negative ranges, but would still suffer the performance decrease.

In general, either there should be support for negative ranges, or ranges might be end-exclusive, but not both at the same time, as having them together dramatically increases complexity.
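The endsBefore ambiguity can be shown with a small sketch. IntSpan and both method names below are hypothetical, chosen only to contrast the two readings; they are not part of the proposed interface.

```java
// Hypothetical bounded int range that allows start > end (a "negative" range).
record IntSpan(int start, int end) {

    boolean isNegative() {
        return start > end;
    }

    // Reading 1: compare the declared end() with the point.
    boolean endBoundBefore(int t) {
        return end < t;
    }

    // Reading 2: compare the later of the two bounds with the point.
    boolean laterBoundBefore(int t) {
        return Math.max(start, end) < t;
    }
}

class EndsBeforeDemo {
    public static void main(String[] args) {
        IntSpan negative = new IntSpan(10, 2);
        System.out.println(negative.endBoundBefore(5));   // true: end() is 2
        System.out.println(negative.laterBoundBefore(5)); // false: later bound is 10
    }
}
```

For positive ranges the two readings always agree, so the disagreement only appears once negative ranges are allowed.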


Range is not Serializable


Currently ranges are not Serializable. This is due to difficulties with non-serializable interfaces, like ChronoDateTime in the sample implementation.


Alternatives


*	Restrict the range type variable to types that implement Serializable. I see this option as undesirable because of how much it narrows the use of the interface.


Current interface methods list is minimal


For now, the proposed API contains a minimal set of methods that are used in range arithmetic. The list of methods is expected to change as the discussion moves on.


Generic Range class vs Range interface + specific implementations


Currently, the approach is to define an interface and a list of implementations.


Advantages


*	Ability to introduce methods specialized for the type of range. For example, Timespan could have a Duration toDuration() method, and a potential IntegerRange could have something like LongRange toLongRange() due to limitations of comparability between classes. This would be impossible with a structural Range class without declaring additional static utility methods.
*	Enhanced validation of annotation targets, since classes, unlike generics, aren't erased.


Disadvantages


*	Increased number of classes to maintain.
*	Additional considerations would be required before extending the Range interface if the hierarchy is non-sealed, to ensure backward compatibility.


API Description


NB: Since date ranges are expected to be one of the most popular use cases for ranges, if not the most popular, date-time libraries were the main reference for the interface design.


  _____  


Section: Bounds


General notes


*	In Boost Date_Time (time_period.begin()), the start and end are always defined, meaning there is no concept of unbounded intervals. Similarly, some libraries like Chrono in Rust assume bounded intervals by default. In fact, only a few libraries expose truly unbounded ranges. However, while these corner cases increase the complexity of the implementation, they also vastly improve performance by cutting the number of operations in each method at least in half (for a range unbounded on both sides, almost all operations return a constant value).


  _____  


T start()


Description:
Returns the start of the range. If the range is unbounded at the start, this method throws an UnsupportedOperationException. This can be preemptively checked using isBoundedAtStart().

Alternatives:

*	The method could return Optional<T> instead of throwing an exception. I see these two approaches as roughly equal in terms of pros and cons, so suggestions are much appreciated.
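A minimal sketch of the two variants side by side. StartBoundSketch and startIfBounded() are hypothetical names, and representing "unbounded" as a null field is an assumption made only for this illustration.

```java
import java.util.Optional;

// Hypothetical range fragment; a null field means "unbounded at start".
final class StartBoundSketch<T> {
    private final T start; // may be null

    StartBoundSketch(T start) {
        this.start = start;
    }

    boolean isBoundedAtStart() {
        return start != null;
    }

    // Variant described above: throw when there is no start.
    T start() {
        if (start == null) {
            throw new UnsupportedOperationException("range is unbounded at start");
        }
        return start;
    }

    // Alternative: make the possible absence visible in the return type.
    Optional<T> startIfBounded() {
        return Optional.ofNullable(start);
    }
}
```

With the throwing variant, callers are expected to check isBoundedAtStart() first; with the Optional variant, the compiler forces them to handle the unbounded case at the call site.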


  _____  


T end()


Description:
Returns the end of the range. If the range is unbounded at the end, this method throws an UnsupportedOperationException. Use isBoundedAtEnd() to check if the range is bounded.

Alternatives:

*	Similarly to start(), the method could return Optional<T> instead of throwing an exception. I see these two approaches as roughly equal in terms of pros and cons, so suggestions are much appreciated.


  _____  


boolean isBoundedAtStart()


Description:
Returns true if the range is bounded at the start. If unbounded, it returns false, meaning calling start() will throw an UnsupportedOperationException.

Alternatives:

*	Joda-Time, NodaTime, Luxon, and Moment.js do not explicitly support unbounded intervals by default but can use null or special values to represent unbounded starts.
*	Boost Date_Time and Chrono don’t support unbounded ranges directly, so this method is unnecessary.


  _____  


boolean isBoundedAtEnd()


Description:
Returns true if the range is bounded at the end. A false value means the range is unbounded at the end, and calling end() will throw an UnsupportedOperationException.

Alternatives:

*	Similar to isBoundedAtStart(), most libraries don’t have built-in unbounded intervals, but the concept can be simulated using null, minimal/maximal possible value etc. Pros and cons were described in API notes.


  _____  


Section: boolean operations


boolean contains(T instant)


Description:
Returns true if the given instant falls within the start and end bounds of the range, otherwise returns false.

Similar Methods in other libraries:

*	NodaTime (Interval.Contains)
*	Joda-Time (Interval.contains)
*	Luxon (Interval.contains)
*	Boost Date_Time (time_period.contains())
*	And many others...

Differences with existing APIs:

*	Moment.js doesn’t provide a direct contains method but the moment-range plugin adds this functionality with range.contains().

Note: this method is present in most interval implementations. Therefore, I consider it basic and unremovable from the API.


  _____  


boolean overlaps(Range<? extends T> other)


Description:
Checks if the current range overlaps with another range. Returns true if the two ranges overlap, otherwise returns false.

Similar Methods in other libraries:

*	NodaTime (Interval.Overlaps)
*	Joda-Time (Interval.overlaps)
*	Luxon (Interval.overlaps)
*	Boost Date_Time (time_period.intersects())
*	And many others...

Differences with existing APIs:

*	Moment.js: The moment-range plugin provides a similar overlaps() method to check overlap.
*	Chrono relies on custom interval intersection logic.

Note: this method is present in most interval implementations. Therefore, I consider it basic and unremovable from the API.
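A minimal sketch of contains and overlaps for the simplest case. It assumes a hypothetical ClosedRange record that is bounded on both sides, has inclusive bounds, and disallows negative ranges; it also takes ClosedRange<T> rather than the proposed Range<? extends T> to keep the generics simple.

```java
// Hypothetical bounded range with inclusive bounds and start <= end.
record ClosedRange<T extends Comparable<? super T>>(T start, T end) {

    ClosedRange {
        if (start.compareTo(end) > 0) {
            throw new IllegalArgumentException("start must not be after end");
        }
    }

    boolean contains(T point) {
        return start.compareTo(point) <= 0 && end.compareTo(point) >= 0;
    }

    boolean overlaps(ClosedRange<T> other) {
        // Two closed ranges overlap iff each one starts no later than the other ends.
        return start.compareTo(other.end()) <= 0 && end.compareTo(other.start()) >= 0;
    }
}
```

With inclusive bounds, two ranges that share only an endpoint count as overlapping; an end-exclusive design would treat them as abutting instead.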


  _____  


General notes on next two methods


Most of the libraries offer an API like isBefore(T point) or do not provide methods like this at all. Since the current implementation throws an exception if the interval is not bounded, a trivial isBefore check could become 4-6 lines long. The question basically comes down to whether the Range class should be more data-structure-like or object-like. I would argue that at least isBefore(T moment) is required, especially since ranges can currently be negative. The existence of boolean isBefore(Range<? extends T> other) and the similar isAfter is up for discussion.


boolean isBefore(Range<? extends T> other)


Description:
Returns true if the current range is strictly before another range (i.e., ends before the other range starts).

Differences with other libraries:

*	NodaTime: You’d manually compare End of one interval with the Start of another.
*	Joda-Time: Manual comparison with Interval.getEnd() and Interval.getStart().
*	Boost Date_Time and Chrono would use custom logic to compare time_period or ranges of time, since they don’t have a direct equivalent of isBefore().

Alternatives

*	Most of the libraries offer an API like isBefore(T point) or do not provide methods like this at all. Since the current implementation throws an exception if the interval is not bounded, a trivial isBefore check could become 4-6 lines long. The question basically comes down to whether the Range class should be more data-structure-like or object-like. I would argue that at least isBefore(T moment) is required, especially since ranges can currently be negative.


  _____  


boolean isAfter(Range<? extends T> other)


Description:
Returns true if the current range is strictly after another range (i.e., starts after the other range ends).

Similar Methods:

*	Similar to isBefore(), manual comparisons are used in NodaTime, Joda-Time, Luxon and others.


  _____  


boolean isBefore(T point)


Description:
Determines if the span ends before the given point. This is useful when you need to check whether a time span occurs entirely before a specific point.

Alternatives:

*	The method could be removed from the API entirely if Range is meant to be skewed towards being a data structure.


  _____  


boolean isAfter(T point)


Description:
Determines if the span starts after the given point. This is useful when you need to check whether a time span occurs entirely after a specific point.

Alternatives:

*	Similarly to boolean isBefore(T point), the method could be removed from the API entirely if Range is meant to be skewed towards being a data structure.


  _____  


boolean isNegative()


Description:
Returns true if the start of the range is after the end, indicating a "negative" range.

Alternatives:

*	If considered too niche, negative timespans could be removed from the model.

Note: this one is the most questionable for me. Do we really need negative ranges? They are almost exclusively needed for numeric ranges and diagrams, while introducing a huge complexity overhead for the majority that doesn't need this feature. Negativity might be confusing for users. I would love to hear thoughts on this matter.


  _____  


Section: Range arithmetics


Optional<Range<T>> intersection(Range<? extends T> other)


Description:
Returns the intersection of the current range with another range. If the ranges do not overlap, the result is an empty Optional. If they overlap, the intersection is returned.

Similar Methods:

*	NodaTime (Interval.Intersection())
*	Moment.js (via moment-range, range.intersect())
*	Joda-Time (Interval.overlap())
*	And many others...

Differences with existing APIs:

*	Boost Date_Time returns an empty time_period if no overlap exists, instead of an Optional. Some libraries return null (e.g., NodaTime).
*	Other libraries return null if the intervals aren't overlapping. This is undesirable, so an Optional is returned instead.

Note: this method is present in most interval implementations. Therefore, I consider it basic and unremovable from the API.
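A minimal sketch of an Optional-returning intersection, assuming a hypothetical Interval record that is bounded on both sides with inclusive bounds and start <= end. Unbounded and negative ranges are deliberately left out.

```java
import java.util.Optional;

// Hypothetical bounded range with inclusive bounds; assumes start <= end.
record Interval<T extends Comparable<? super T>>(T start, T end) {

    Optional<Interval<T>> intersection(Interval<T> other) {
        // The intersection starts at the later start and ends at the earlier end.
        T lo = start.compareTo(other.start()) >= 0 ? start : other.start();
        T hi = end.compareTo(other.end()) <= 0 ? end : other.end();
        if (lo.compareTo(hi) > 0) {
            return Optional.empty(); // the ranges do not overlap
        }
        return Optional.of(new Interval<>(lo, hi));
    }
}
```

Callers can then chain on the result, for example with map or ifPresent, instead of checking for null.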


  _____  


Range<T>[] union(Range<? extends T> other)


Description:
Returns the union of two ranges. If the ranges overlap, the result is a single combined range. If they do not overlap, the result is an array of two separate ranges.

Differences with existing APIs:

*	NodaTime and Joda-Time support similar logic using custom union handling.
*	Boost Date_Time has no built-in union() function but you can write custom logic to combine or separate intervals.

Note: The behaviour of this method is subject to change. Currently, it returns an array for maximal performance, but it can (and most likely should) be wrapped in some monadic class. As an alternative, there may be support for non-contiguous ranges (ones with gaps inside them), in which case this method should return that kind of range.
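One possible shape for a non-array return type, sketched with an unmodifiable List. Segment is a hypothetical bounded record with inclusive bounds and start <= end; returning List.of(...) is just one option among those mentioned above, not a decision made in the PR.

```java
import java.util.List;

// Hypothetical bounded range with inclusive bounds; assumes start <= end.
record Segment<T extends Comparable<? super T>>(T start, T end) {

    boolean overlaps(Segment<T> other) {
        return start.compareTo(other.end()) <= 0 && end.compareTo(other.start()) >= 0;
    }

    // Returns one merged range if the two overlap, otherwise both ranges unchanged.
    List<Segment<T>> union(Segment<T> other) {
        if (!overlaps(other)) {
            return List.of(this, other);
        }
        T lo = start.compareTo(other.start()) <= 0 ? start : other.start(); // earlier start
        T hi = end.compareTo(other.end()) >= 0 ? end : other.end();         // later end
        return List.of(new Segment<>(lo, hi));
    }
}
```

List.of returns an unmodifiable list, so callers cannot store elements of the wrong type into the result the way they can with a generic array.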


  _____  


Optional<Range<T>> gap(Range<? extends T> other)


Description:
Returns the gap between two ranges, if they do not overlap. If they overlap, the result is an empty Optional.

Differences with existing APIs:

*	NodaTime and Joda-Time support custom logic to calculate the gap using isBefore(), isAfter(), and manual calculations of the gap.
*	Other libraries return null if the intervals are overlapping. This is undesirable, so an Optional is returned instead.
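A minimal sketch of gap under the same simplifying assumptions: BoundedSpan is a hypothetical record that is bounded on both sides with start <= end. The gap is reported using the neighbouring bounds as its endpoints; whether those endpoints should be inclusive or exclusive is an open question that this sketch does not settle.

```java
import java.util.Optional;

// Hypothetical bounded range; assumes start <= end.
record BoundedSpan<T extends Comparable<? super T>>(T start, T end) {

    Optional<BoundedSpan<T>> gap(BoundedSpan<T> other) {
        if (end.compareTo(other.start()) < 0) {
            // this range lies entirely before the other one
            return Optional.of(new BoundedSpan<>(end, other.start()));
        }
        if (other.end().compareTo(start) < 0) {
            // this range lies entirely after the other one
            return Optional.of(new BoundedSpan<>(other.end(), start));
        }
        return Optional.empty(); // the ranges touch or overlap, so there is no gap
    }
}
```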


  _____  


Section: potential methods


boolean isEmpty()


Description:
Determines if the range is "empty".

An empty range is its own, separate type of range (basically the opposite of an unbounded range). There are many questions regarding this type of range. Is it bounded at start or end? If so, what should start() or end() return? Having them throw an exception would violate the current contract between the isBoundedAtX() and x() methods.

Advantages

*	Returning empty range instead of Optional might be more user-friendly

Disadvantages

*	One more concept in the API model
*	Corner case in the isBoundedAtX() and x() contract.


Potential Methods for API Enhancement


In this section, we explore methods that could be added to the API, comparing them with similar functionality in popular time-related libraries. These methods enhance the versatility and clarity of the Range<T> implementation, especially in the context of temporal, numeric, and other domain-specific ranges. Some of these methods are inspired by well-established libraries, while others are novel suggestions.


  _____  


boolean encloses(Range<? extends T> other)


Description:
Checks whether the current range completely encloses another range, i.e., the other range starts after or at the start of the current range and ends before or at the end of the current range.

·        Similar Methods in Other Libraries:

*	NodaTime (Interval.ContainedBy)
*	Joda-Time (Interval.contains)
*	Luxon (Interval.contains)
*	Boost Date_Time (time_period.contains())

·        Differences with Existing APIs:

*	Some libraries handle encloses() and contains() in the same method. For clarity, this API can separate the two, where contains() is used for checking individual points and encloses() is for range-level comparison.


  _____  


boolean abuts(Range<? extends T> other)


Description:
Returns true if the current range abuts (i.e., touches but does not overlap) with another range. This method is useful when determining whether two ranges are adjacent but do not overlap.

·        Similar Methods in Other Libraries:

*	NodaTime (Interval.Abuts)

·        Alternatives:

*	Instead of this method, users could manually compare the end of one range and the start of another, but including abuts() in the API simplifies the logic and reduces error-prone comparisons.


  _____  


Range<T> extendTo(T point)


Description:
Returns a new range that extends the current range to include the given point. If the point is already within the range, it returns the current range. Otherwise, it extends either the start or end, depending on the point's position relative to the range.

·        Similar Methods in Other Libraries:

*	NodaTime and Joda-Time do not have explicit methods for this, but users can manipulate intervals manually.
*	Moment.js: The moment-ran
