<div dir="ltr">Text copied from PR draft since no one really goes through 2-month old drafts ;)<br><br>For anyone wondering what's up with this PR, I have decided to take a little break from coding itself and focus on the more fundamental part of ranges. Most of the time I was thinking about the range lifecycle, how the range that was acquired from the chain of operations is fundamentally different from the plain range created using Range.of and stuff like that. Also I have been thinking about how exactly should custom comparators be consoled with range operations, which is obviously the point of most friction regarding comparators for ranges.<br><br>1. Lifecycle<br><br>Generally, I see 4 use cases for ranges. I will put them in "popularity" order, speaking from my experience<br>1. Receive boolean value (does contain, does intersect etc.)<br>2. iterate over (It may become more popular as ranges are present in jdk, but as Brian Goetz once mentioned in mail, no one is really hurt by the necessity to write for loop instead of foreach over range, so I don't consider stuff like for(var i : Range.of(0, 10) a valuable use case).<br>3. Provide as input (also known as slicing in python and things alike in other languages)<br>4. Receiving range as result of method (not really a terminal operation, but case is huge enough to acknowledge it separately)<br><br>I have been through multiple iterations of use-case investigation. At some point, I have even thought about dropping range operation completely. What stopped me, though, is one of the tasks I have received at work recently, which completely changed my mind. The one particular use case that I find huge enough to be worth support in API is scheduling with only one task per moment, i.e. scheduling lessons for teachers. for example, accounting for business hours and stuff like that. I believe this is one of the most common things business logic programmers deal with when talking about datetime ranges.<br><br>From this point, I have been looking at this topic from a somewhat different perspective. I believe that most of the time (dominating majority of times) results of range operations would be used to acquire boolean value. Passing it as input would be a mess to handle as range can be interrupting/continuous. Iterating over ranges like that may lead to confusing behaviour, for example, consider:<br><br>[1;4] U [5;10] with step 3. Should it be like 1,4,7,5,8 or 1,4,5,7,8 or maybe 1,4,5,8 ? This is just too confusing to consider.<br><br>On the other hand, continuous, as-is ranges (ones that are created directly with factory method), would be a much better candidate for iteration, for many cases, from safe iteration over substring indexes to just plain for loop replacement (people for some reason want this a little bit too much).<br><br>So, summarizing everything above: range operation is a valid request, although support for everything available to "plain" ranges would be inappropriate for more complex ranges, specifically unions. Unions are mostly used for boolean operations, while plain ranges can have a broader number of use cases. For iteration over unions, the best option I see is to provide a method that returns a List of their components so people can by-hand decide which behaviour exactly do they want.<br><br>2. Range operations and comparators rivalry<br><br>I think everyone could in 5 seconds come up with at least one case when these two conflict. I was thinking a lot about this. Prototyping some sorts of operation contexts (something like java.math.MathContext), creating classes that were responsible for range operations that required providing comparator for each action (this one I brushed off for multiple reasons: 3 parameters for each method sounds horrific, many implementation intricacies, and generally not intuitive way to operate on rages considering 95% of use cases will be bound to datetime and numerics and things backed by them), and other stuff. Surprisingly, where I found similar problems is in the collection framework, specifically in sets. The problems here were really much more alike to range then I have expected (basically the same except for the fact that set has only one 2 possible relations to element: contains or not, while ranges have 5 possible relations to other ranges (one can be before, after, contained, or intersect at start or end with other) which obviously complicates things). Collections framework just state behaviour in operations between sets that use different equality metrics as well, not undefined, but confusing for anyone who doesn't know implementation details. I am planning to prototype something with the same kind of behaviour for ranges, but I still haven't decided how 2 ranges will communicate with each other (analogy for how sets communicate using contains).<br><br>That's basically it for now, i will continue with investigation, maybe work will also give some food for though as with schedule, and when I feel like I have some common ground for all of the things I would want to see in the API, I will proceed with some sort of more mature API description</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 26, 2024 at 11:00 PM Olexandr Rotan <<a href="mailto:rotanolexandr842@gmail.com">rotanolexandr842@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-family:monospace;font-size:large">but I also don't sense people beating down the doors for that (even if the language had range literals, like `0..<100`). </span></blockquote><div><br></div><div>True, that what I was thinking also: "is iteration over numeric range is so important to challenge general versatility of API?", but  </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-family:monospace;font-size:large">for (int j=index; j<index+target.length(); j++)</span></blockquote><div><br></div><div>is the use case that I haven't even thought of. After some research, I found a family of methods across jdk that could benefit from ranges. Their names match X::indexOf(X) or X::indexOfSubX(X) or similar patterns. I did some research and found it surprising that the only young language, which all now have first-class support for ranges, that takes advantage of range return types is Swift and its string::range(of: substring) (maybe there is similar methods for sublists in Swift, I think it is not that significant). Others either return an optional int or int itself.</div><div><br></div><div>Much more languages, though, have adapted ranges as input parameters for slicing and other range-specifying operations, not lastly due to issues like</div><div><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-family:monospace;font-size:large">Similarly, I see errors in API usage because sometimes we specify range by (start, end) and sometimes by (start, length)</span></blockquote><div> </div>Though, there are some differences with the vast majority of languages and this API currently, mostly with the fact that almost all languages do not support any range algebra out-of-the-box (closest I have seen were itertools in Rust), and, subsequently, there is nowhere to emerge for unios from, unlike current implementation. Adapting return types for being three-state algebraic data types (i.e. result of passing empty range, result of passing uninterrupted and result of passing union) is not that hard, but just another work to do in case similar APIs would be adopted in jdk.</div><div><br></div><div>My research about iterability of ranges in other languages showed that there is, unfortunately, no magic pill invented. Basically, there are two approaches to making range iterable:</div><div><br>1) Implement iterator<br>2) Implement increment/decrement<br><br>The concrete way might vary from traits in Rust (and something like this in Haskell I suppose) to operator overloading in c++, but concepts are the same. Most languages also provide out-of-the-box support for integer ranges. <br><br>And that is what I concluded on, at least for now. I think that API should provide support for any custom iteration strategy using something like Collector/Gatherer wrappers over a functional interface that will transform increment/decrement function into iterator. Besides that, I think that it makes sense to create separate Range.OfInts(int, int) or just Range.of(int,int) overload, that will return thin extension of Range<Integer> (noone convinces me to create new specialized class), that implements iterable using simple int increments, which would both benefit the performance and address most of the use cases for "obviously iterable" ranges. I also think there is no need for open iterable ranges, even if start-closed. I think it is better to have some method to return Stream than implement Iterable, since this might lead to unexpected problems.<br><br>Best regards</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 26, 2024 at 4:30 PM Brian Goetz <<a href="mailto:brian.goetz@oracle.com" target="_blank">brian.goetz@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>

  <div>

    <font size="4" face="monospace">Sorry for the not-good news, but I'm

      not too surprised.  Computational domains like "32 bit integers"

      seem like they should have a lot in common with algebraic

      structures like groups and rings, but when you start poking at

      them, the compromises we make to fit them into hardware registers

      start to bite.  (And let's not get started on floating point...) 

      Lots of research into numeric towers in various languages, or

      capturing fundamental properties in type classes like Haskell's

      `Eq` and `Ord`, offers plenty of compromise to go with its

      promise.  <br>

      <br>

      I think a big part of what you are running into is that you've

      started with a _concept_ (a deceptively simple one, at that),

      rather than _requirements_.  And it is the open-endedness of this

      concept (discrete vs continuous, bounded vs half-open, including

      endpoints or not, etc) that resists abstraction.  Plus, without

      clear requirements, you will be subject to an endless barrage of

      "what about my pet use case" (e.g., "what about the numbers zero

      to ten, advancing by two").  Meanwhile, domain-specific libraries

      such as java.time will invent their own domain-specific answers,

      like Interval.  <br>

      <br>

      Rather than starting from the algebraic properties, perhaps start

      from the other end: what are the use cases where the lack of a

      range abstraction is problematic.  I get that <br>

      <br>

          for (int i=0; i<100; i++) { ... }<br>

      <br>

      is uglier and less abstract than <br>

      <br>

          for (int i : Range.of(0, 100)) { ... }<br>

      <br>

      but I also don't sense people beating down the doors for that

      (even if the language had range literals, like `0..<100`).  <br>

      <br>

      Where I do see people having trouble is that many range

      computations are error prone.  For example, `String::indexOf`

      returns the starting index of a match; if you want to actually

      iterate over the characters of such a match, you have to do

      something like<br>

      <br>

          for (int j=index; j<index+target.length(); j++)<br>

      <br>

      and you are at risk for fencepost errors when recreating the

      range.  Whereas an indexOf method (under a more suitable name)

      that returned a range, would be more amenable to downstream

      processing.  Similarly, I see errors in API usage because

      sometimes we specify range by (start, end) and sometimes by

      (start, length), and since both are ints, we get no type checking

      when you pass the wrong kinds of ints to such a method.  <br>

      <br>

      But, the mere existence of a Range type would do little to help

      String, Arrays, and other range-happy APIs, because we would have

      to update them to include new overloads that dispense and consume

      ranges.  So that's a big project.  <br>

      <br>

      Still, I think investigating use cases involving libraries that

      work intensively with ranges like this would likely yield useful

      information for what a Range type would want to provide.<br>

      <br>

      HTH,<br>

      -Brian<br>

      <br>

      <br>

      <br>

      <br>

      <br>

    </font><br>

    <div>On 9/26/2024 9:07 AM, Olexandr Rotan

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">Researching the of deriving some iterable

        representations from ranges, and I am not here with the good

        news.<br>

        <br>

        Unlike range algebra and boolean operations, which generalize

        extremely well, iterability of ranges... Well, it's safe to say

        it doesn't generalize at all. Analyzing key features people

        expect iterable ranges to have, I ended up concluding there are

        basically two groups / two use cases for them. First is plain

        and simple, arguably the most popular one: iterating over a

        range of integer numbers, i.e. `for (i : Range.of(1, 10))`.

        Another use case is for more complex iterations over ranges of

        reference types, most commonly dates/time.<br>

        <br>

        There are two groups of values by their nature: discrete and

        continuous. Most of the types belong to the second group, as

        there is no direct increment AND decrement for them (we will

        omit hardware limitations for simplicity), such as floating

        point values. What is the increment of 1,3? 1.31 or

        1.30000000001, or maybe something even more unreadable? On the

        other hand, the increment of LocalDate in context of range

        iteration that represents today is rather obvious - it is

        tomorrow.

        <div><br>

        </div>

        <div>There is a pretty limited number of discrete types in jdk.

          Dates, whole numbers and basically that's it. The discrete

          types that are not present in jdk can be really various. For

          example, users can define a comparable type "F1Team" and

          compare them based on their position in the last race. There,

          increment would most likely be the next team in rating. There

          are many domain-specific cases like this.<br>

          <br>

          This is where the problem comes from. If the user would always

          have to pass a comparator to create a range, it would be

          consistent to make the user define increment/decrement as

          well. But we don't want users to pass a comparator if the type

          is already comparable. Similarly, we don't want users to

          define increment/decrement if there is already one in the

          language! I think defining increments for dates (say

          LocalDate.plusDays(1)) would be acceptable,

          even defining increments for floats in context of ranges might

          be acceptable, but making people define increments for

          integers is, in my opinion, completely not. Besides

          performance impact, this is a terrible user experience.</div>

        <div><br>

        </div>

        <div>There are a few solutions to this:<br>

          1) Define ton of overrides for factory methods and specialized

          types for this (uhh, sounds awful)</div>

        <div>2) Introduce new interface, say Discrete<T>, that

          defines T increment() (and possible T decrement()) methods.

          From now on, there are 2 branches:<br>

          2.1) Leave things as is, allow users to define incrementation

          logic for their types, but don't touch integers and other

          built-ins.I see this option as extremely inconsistent and not

          solving the main issue, which is iterability of integers.</div>

        <div>2.2) Retrofit (scary) existing types to implement this

          interface. This should not have any compatibility nor security

          implications, but still sneaking into java.lang every time we

          need some new API to be more user-friendly is obviously not a

          way to go. This basically comes down to a question about how

          deep we want to integrate ranges into language, and is range 

          generalization even worth the invasion into the core of

          language (imo yes).</div>

        <div>3) Leave things as they are, just let users derive

          iterables using something like

          range.asIterableWithStep(IncremetStartegy increment). I think

          this would make an API too narrow as no one will use it for

          routine tasks the same way people do in Rust, Kotlin and other

          languages.<br>

          <br>

          I would love to hear community opinion on this matter. Which

          option is the most preferable, maybe some compromise between a

          few of them, or maybe there is a better way to go that I

          didn't mention here?<br>

          <br>

          Best regards</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Tue, Sep 24, 2024 at

          5:11 PM Alan Snyder <<a href="mailto:javalists@cbfiddle.com" target="_blank">javalists@cbfiddle.com</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div><span style="color:rgb(0,0,0)">I have another example: I

              have a datatype that represents a region of an audio

              track, for example, one tune in a medley of tunes. I allow

              the region to</span>

            <div style="color:rgb(0,0,0)">specify both a start and end

              time, but the end time is optional (and mostly not used).

              When the end time is not specified, the region ends at the

              start of the next region, or at</div>

            <div style="color:rgb(0,0,0)">the end of the track if there

              is no next region. The latter case is useful because the

              exact track length may not be known. The optionality of

              the end time</div>

            <div style="color:rgb(0,0,0)">is not represented in the type

              system.</div>

            <div style="color:rgb(0,0,0)"><br>

            </div>

            <div style="color:rgb(0,0,0)">Having said that, I’m not sure

              that a general abstract interface would be useful for this

              example.<br id="m_-820581810952948146m_4677786044423171050m_813938930879788063m_-1709372893632724312m_3927220230334710857lineBreakAtBeginningOfMessage">

              <div><br>

                <blockquote type="cite">

                  <div>On Sep 24, 2024, at 2:13 AM, Olexandr Rotan <<a href="mailto:rotanolexandr842@gmail.com" target="_blank">rotanolexandr842@gmail.com</a>>

                    wrote:</div>

                  <br>

                  <div>

                    <div dir="ltr">As part of the redesigning  process ,

                      I am researching whether or not there are use

                      cases that require asserting that the range is

                      exactly half-bounded. This is important because I

                      plan to switch to BoundedAtEnd/BoundedAtStart

                      sealed interfaces instead of flags and runtime

                      checks: Here is what I gathered for now.<br>

                      <br>

                      <ul>

                        <li><strong>Date/Time Handling (Historical or

                            Forecast Data)</strong>: When dealing with

                          events that started at a specific time but

                          have no known end (e.g., open-ended employment

                          contracts or ongoing subscriptions)</li>

                        <li><strong>Stream Processing (Real-time Event

                            Streams)</strong>: In real-time systems, you

                          might process data that has a start time but

                          no defined end, such as monitoring a live

                          video feed or logging system. The range is

                          bounded at the start and unbounded at the end

                          as more data will continuously arrive.</li>

                        <li><strong>Data Pagination (Fetch Until

                            Condition)</strong>: When implementing

                          pagination, sometimes you might want to fetch

                          items starting from a specific index up to an

                          unbounded limit (e.g., fetching all items

                          after a certain point until memory runs out or

                          a condition is met).</li>

                        <li><strong>Auditing and Monitoring</strong>: In

                          systems where audit trails or logging data

                          should capture all events after a certain

                          point (bounded start) with no foreseeable end

                          (unbounded end), such as monitoring changes to

                          records in a database starting from a fixed

                          timestamp.</li>

                        <li><strong>Scientific or Statistical Ranges</strong>:

                          When modeling physical systems or statistical

                          ranges, you might want to capture measurements

                          that begin at a known threshold but

                          theoretically have no upper or lower bound.

                          For example, recording temperature data

                          starting at absolute zero and increasing

                          without any known upper limit.</li>

                        <li><strong>Inventory or Resource Allocation</strong>:

                          Resource allocation policies, such as those

                          for virtual machines, may be based on known

                          minimum allocation thresholds but have

                          flexible or unbounded resource caps, depending

                          on availability.<br>

                          <br>

                          I am writing to ask whether anyone who worked

                          with such systems could confirm/deny that

                          those are real use cases. If so, would it be

                          satisfying enough to assert one-way

                          unboundness with instanceof checks, i.e. range

                          instanceof UnboundedEndRange &&

                          !(range instanceof UnboundedStartRange). Would

                          appreciate any feedback.</li>

                      </ul>

                    </div>

                  </div>

                </blockquote>

              </div>

            </div>

          </div>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div>

</blockquote></div>