Java implementation of Alpha-numeric comparator
Ivan Gerasimov
ivan.gerasimov at oracle.com
Tue Dec 16 18:44:35 UTC 2014
On 16.12.2014 17:58, roger riggs wrote:
> Hi Ivan,
>
> In which package/class do you propose to add the API to get the
> comparator?
>
I was thinking of java.text package, though I don't see a specific class
in which the static method could be naturally included.
This kind of Comparator doesn't seem to be common enough to include in
such a base interface as CharSequence.
> In java.time, comparators are returned from static methods in an
> interface.
> This allows lambda to be used for the implementation.
> For example, ChronoZonedDateTime.timeLineOrder() [1]
>
> For example a static method could be added to CharSequence:
> public static Comparator<CharSequence> alphaNumericComparator() ...
>
> In use it would be
> CharSequence.alphaNumericComparator().compare("012", "234");
>
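For illustration, here is a minimal sketch of the shape suggested above: a
static method in an interface that returns a lambda-based comparator. The
interface name CharSequenceComparators and the method lexicographic() are
hypothetical, and the body is a plain char-by-char comparison used only as a
stand-in, not the alpha-numeric logic from the webrev. The intersection cast
is one way to make the lambda serializable, which is the concern raised
elsewhere in this thread.

```java
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical holder interface; not part of the JDK or of the webrev.
interface CharSequenceComparators {

    // Static factory returning a comparator implemented as a lambda.
    // The cast to (Comparator & Serializable) makes the lambda serializable.
    static Comparator<CharSequence> lexicographic() {
        return (Comparator<CharSequence> & Serializable) (a, b) -> {
            int n = Math.min(a.length(), b.length());
            for (int i = 0; i < n; i++) {
                int d = Character.compare(a.charAt(i), b.charAt(i));
                if (d != 0) {
                    return d;
                }
            }
            // All shared positions are equal: the shorter sequence sorts first.
            return Integer.compare(a.length(), b.length());
        };
    }
}
```

In use this would look like
CharSequenceComparators.lexicographic().compare("012", "234").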
> 2) Should there be any provision for number strings internal to the
> string with leading zeros?
> Should "abc-0123-def" be equal to "abc-123-def"?
>
The strings will be ordered as:
abc-122-def
abc-0122-def
abc-123-def
abc-0123-def
abc-00123-def
abc-0000123-def
abc-124-def
I.e. the strings with the same numeric value are grouped together,
and within each group the strings with more leading zeros sort later.
To my eyes the strings with more leading zeros look bigger, that's why I
did it this way :)
By the way, Microsoft's StrCmpLogicalW() does it in the opposite
direction, i.e. strings with more leading zeros come earlier.
If people find it useful, we can make it configurable.
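To make the ordering above concrete, here is a minimal sketch of a comparator
that produces it. This is not the code from the webrev; the class name
AlphaNumericSketch and its structure are assumptions made for illustration.
It treats only ASCII digits as numeric, compares digit runs by length and then
digit by digit (so arbitrarily large numbers never need to be parsed), and
uses the leading-zero count only as a tie-breaker.

```java
import java.util.Comparator;

// Hypothetical sketch; names and structure are not taken from the webrev.
final class AlphaNumericSketch {
    private AlphaNumericSketch() {}

    static final Comparator<CharSequence> COMPARATOR = AlphaNumericSketch::compare;

    static int compare(CharSequence a, CharSequence b) {
        int i = 0, j = 0;
        int zeroTieBreak = 0; // first difference in leading-zero counts, if any
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isDigit(ca) && isDigit(cb)) {
                int startA = i, startB = j;
                // Skip leading zeros so only significant digits are compared.
                while (i < a.length() && a.charAt(i) == '0') i++;
                while (j < b.length() && b.charAt(j) == '0') j++;
                int zerosA = i - startA, zerosB = j - startB;
                int sigA = i, sigB = j;
                while (i < a.length() && isDigit(a.charAt(i))) i++;
                while (j < b.length() && isDigit(b.charAt(j))) j++;
                int lenA = i - sigA, lenB = j - sigB;
                // More significant digits means a larger number.
                if (lenA != lenB) return lenA < lenB ? -1 : 1;
                for (int k = 0; k < lenA; k++) {
                    int d = Character.compare(a.charAt(sigA + k), b.charAt(sigB + k));
                    if (d != 0) return d;
                }
                // Same numeric value: remember which side had fewer zeros.
                if (zeroTieBreak == 0) zeroTieBreak = Integer.compare(zerosA, zerosB);
            } else {
                int d = Character.compare(ca, cb);
                if (d != 0) return d;
                i++;
                j++;
            }
        }
        // A sequence that ran out first sorts earlier; otherwise fall back
        // to the leading-zero tie-breaker (more zeros sorts later).
        int rest = Integer.compare(a.length() - i, b.length() - j);
        return rest != 0 ? rest : zeroTieBreak;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

Sorting the seven strings listed above with
Arrays.sort(strings, AlphaNumericSketch.COMPARATOR) yields them in that order.
Flipping the sign of the tie-breaker would give the StrCmpLogicalW()-style
order, where more leading zeros come earlier.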
Sincerely yours,
Ivan
> Roger
>
>
>
> [1]
> https://docs.oracle.com/javase/8/docs/api/java/time/chrono/ChronoZonedDateTime.html#timeLineOrder--
>
> On 12/16/2014 3:57 AM, Ivan Gerasimov wrote:
>> Got it, thanks!
>>
>> Please find the updated webrev at the same location:
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-AlphaNumeric/1/webrev/
>>
>> Sincerely yours,
>> Ivan
>>
>> On 16.12.2014 11:23, Remi Forax wrote:
>>>
>>> On 12/16/2014 09:13 AM, Ivan Gerasimov wrote:
>>>> Thanks Remi for the comments!
>>>>
>>>> As you and Roger suggested I only left a CharSequence variant of
>>>> the comparator.
>>>>
>>>> I also removed the custom Comparator<Character> altogether for now,
>>>> for the sake of simplicity.
>>>> I guess for the purpose of a sample Character.compare(ch1, ch2)
>>>> should be good enough.
>>>>
>>>> Here's the updated webrev:
>>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-AlphaNumeric/1/webrev/
>>>>
>>>> I'm not certain why the comparator should be serializable.
>>>> Could you please elaborate on this?
>>>
>>> Yes, all public comparators in the JDK are serializable because
>>> otherwise people will not be able to serialize collections like
>>> TreeSet or ConcurrentSkipListSet that implement Serializable and
>>> take a comparator as an optional argument.
>>>
>>>
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>
>>> cheers,
>>> Rémi
>>>
>>>>
>>>> On 16.12.2014 2:39, Remi Forax wrote:
>>>>> Hi Ivan, hi Roger,
>>>>>
>>>>> Roger, the API already exists: it's the interface Comparator.
>>>>>
>>>>> I agree with Roger that a comparator that uses a CharSequence is
>>>>> better than one that uses a char array.
>>>>>
>>>>> The thing that worries me is the Comparator<Character> taken as a
>>>>> parameter; it means that
>>>>> each time the method compare() is called on this comparator, the
>>>>> two arguments are boxed.
>>>>>
>>>>> Minor comment, to be included, I think that these comparators
>>>>> should be serializable
>>>>> and in my opinion the best way to do that is to use a lambda
>>>>> instead of a class.
>>>>>
>>>>> Rémi
>>>>>
>>>>> On 12/15/2014 11:31 PM, roger riggs wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>> It does seem like a useful function, though I would have started
>>>>>> with the API,
>>>>>> not the implementation.
>>>>>>
>>>>>> Can it apply to CharSequence not only String and maybe skip the
>>>>>> separate char[] version, a char[] array can be wrapped to become
>>>>>> a CharSequence via CharBuffer.
>>>>>> Or via a new static method to define a CharSequence from a char
>>>>>> array.
>>>>>>
>>>>>> $.02, Roger
>>>>>>
>>>>>> On 12/15/2014 5:53 AM, Ivan Gerasimov wrote:
>>>>>>> Hello everyone!
>>>>>>>
>>>>>>> In certain situations the preferred way of sorting strings is a
>>>>>>> combination of char-comparing sorting with numeric sorting,
>>>>>>> where applicable.
>>>>>>> Lists of strings sorted this way often look more natural to the
>>>>>>> human eye:
>>>>>>> { "alpha",
>>>>>>> "java1",
>>>>>>> "java2",
>>>>>>> "java10",
>>>>>>> "zero" }
>>>>>>>
>>>>>>> Here is a sample implementation of the comparator,
>>>>>>> which supports this way of sorting.
>>>>>>> I placed it under src/sample directory.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-AlphaNumeric/0/webrev/
>>>>>>>
>>>>>>>
>>>>>>> MSDN provides the function StrCmpLogicalW(), which can be used
>>>>>>> for similar sort order.
>>>>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/bb759947%28v=vs.85%29.aspx
>>>>>>>
>>>>>>>
>>>>>>> The differences are:
>>>>>>> - case-sensitivity (StrCmpLogicalW is case-insensitive);
>>>>>>> - treating leading zeroes;
>>>>>>> - more accurate handling of strings with big numbers, which
>>>>>>> cannot be converted to int/long.
>>>>>>>
>>>>>>> I guess this comparator may become particularly useful when
>>>>>>> we'll have 'java10' and update releases/build numbers > 99 in
>>>>>>> the lists :)
>>>>>>>
>>>>>>> I'd like to ask the community: how useful would this comparator
>>>>>>> be to you?
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
More information about the core-libs-dev
mailing list