[10] RFR 8134512 : provide Alpha-Numeric (logical) Comparator

Ivan Gerasimov ivan.gerasimov at oracle.com
Wed Aug 2 06:58:57 UTC 2017


On 7/28/17 12:16 PM, Jonathan Bluett-Duncan wrote:

> Hi Ivan,
>
> It looks like the MyComparator code example which you gave in your 
> last email lost its formatting along the way, so I'm finding it 
> difficult to read. Would you mind resubmitting it?
>

Oh, sorry about that.
I've just uploaded another modification of the comparator here:
http://cr.openjdk.java.net/~igerasim/8134512/test/Test.java

With kind regards,
Ivan

> Cheers,
> Jonathan
>
> On 28 July 2017 at 17:25, Ivan Gerasimov <ivan.gerasimov at oracle.com>
> wrote:
>
>     Hi Peter!
>
>     Thanks a lot for looking into this!
>
>     On 7/28/17 7:32 AM, Peter Levart wrote:
>
>         Hi Ivan,
>
>         In light of what Stuart Marks wrote, what do you think
>         about a parameterized comparator (it would be sub-optimal,
>         I know) where one would supply two Comparators, used to
>         compare the Ax and Nx sub-sequences respectively, as
>         described below...
>
>     Yes.  Inspired by what Stuart suggested I made a draft of such a
>     comparator (see below).
>     It works, but as you've said it's not that efficient (mostly due
>     to expensive substrings) and a bit harder to use in a simple case.
>     Now I need to think about how to combine the two approaches.
>
>         For Nx sub-sequences, one would need a comparator comparing
>         the reversed sequence lexicographically,
>
>     I'm not sure I understand why they need to be reversed.
>
>         but for Ax sub-sequences, one could choose from a plethora of
>         case-(in)sensitive comparator(s) and collators already
>         available on the platform.
>
>     Yes. In the example below I used compareToIgnoreCase to compare
>     alpha subsequences.
>
>     -------
>
>     import java.util.Comparator;
>     import java.util.function.Supplier;
>
>     class MyComparator implements Comparator<String> {
>         Comparator<String> alphaCmp;
>         Comparator<String> numCmp;
>
>         MyComparator(Comparator<String> alphaCmp,
>                      Comparator<String> numCmp) {
>             this.alphaCmp = alphaCmp;
>             this.numCmp = numCmp;
>         }
>
>         boolean skipLeadingZeroes(String s, int len) {
>             for (int i = 0; i < len; ++i) {
>                 if (Character.digit(s.charAt(i), 10) != 0)
>                     return false;
>             }
>             return true;
>         }
>
>         @Override
>         public int compare(String o1, String o2) {
>             Supplier<String> s1 = new NumberSlicer(o1);
>             Supplier<String> s2 = new NumberSlicer(o2);
>             while (true) {
>                 // alpha part
>                 String ss1 = s1.get();
>                 String ss2 = s2.get();
>                 int cmp = alphaCmp.compare(ss1, ss2);
>                 if (cmp != 0) return cmp;
>                 if (ss1.length() == 0) return 0;
>                 // numeric part
>                 ss1 = s1.get();
>                 ss2 = s2.get();
>                 int len1 = ss1.length();
>                 int len2 = ss2.length();
>                 int dlen = len1 - len2;
>                 if (dlen > 0) {
>                     if (!skipLeadingZeroes(ss1, dlen))
>                         return 1;
>                     ss1 = ss1.substring(dlen, len1);
>                 } else if (dlen < 0) {
>                     if (!skipLeadingZeroes(ss2, -dlen))
>                         return -1;
>                     ss2 = ss2.substring(-dlen, len2);
>                 }
>                 cmp = numCmp.compare(ss1, ss2);
>                 if (cmp != 0) return cmp;
>                 if (dlen != 0) return dlen;
>             }
>         }
>
>         static class NumberSlicer implements Supplier<String> {
>             private String sequence;
>             private boolean expectNumber = false;
>             private int index = 0;
>
>             NumberSlicer(String s) {
>                 sequence = s;
>             }
>
>             @Override
>             public String get() {
>                 int start = index, end = start,
>                     len = sequence.length() - start;
>                 for (; len > 0; ++end, --len) {
>                     if (Character.isDigit(sequence.charAt(end)) ^ expectNumber)
>                         break;
>                 }
>                 expectNumber = !expectNumber;
>                 return sequence.substring(start, index = end);
>             }
>         }
>     }
>
>     ------------
>
>     Here is how it is invoked with a case-insensitive comparator:
>
>     Arrays.sort(str,
>         new MyComparator(
>             Comparator.comparing(String::toString, String::compareToIgnoreCase),
>             Comparator.comparing(String::toString, String::compareTo)));
>
>     ------------
>
>     Simple test results for case-insensitive sorting:
>
>     java 1
>     java 1 java
>     java 1 JAVA
>     Java 2
>     JAVA 5
>     jaVA 6.1
>     java 10
>     java 10 v 13
>     Java 10 v 013
>     Java 10 v 000013
>     java 10 v 113
>     Java 2017
>     Java 2017
>     Java 20017
>     Java 200017
>     Java 2000017
>     Java 20000017
>     Java 20000017
>     Java 200000017
>
>     With kind regards,
>     Ivan
>
>
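
The quoted draft pays for a substring on every alpha and numeric run. One possible way around that cost is to compare the digit runs in place, by index. What follows is only a rough sketch under my own assumptions, not the contents of Test.java: the names LogicalOrder, endOfDigits and compareDigitRuns are made up for illustration, and non-digit characters are compared one at a time, case-insensitively, instead of going through a pluggable alpha comparator.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; class and method names are not from the webrev.
final class LogicalOrder {

    /** Index of the first non-digit at or after 'from' (or s.length()). */
    static int endOfDigits(String s, int from) {
        int i = from;
        while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        return i;
    }

    /** Compares the digit runs s1[b1,e1) and s2[b2,e2) by value, without allocating. */
    static int compareDigitRuns(String s1, int b1, int e1,
                                String s2, int b2, int e2) {
        int z1 = b1, z2 = b2;
        // Skip leading zeroes in both runs.
        while (z1 < e1 && Character.digit(s1.charAt(z1), 10) == 0) z1++;
        while (z2 < e2 && Character.digit(s2.charAt(z2), 10) == 0) z2++;
        // More significant digits means a larger value.
        int sig1 = e1 - z1, sig2 = e2 - z2;
        if (sig1 != sig2) return Integer.compare(sig1, sig2);
        // Same number of significant digits: compare digit by digit.
        for (int i = 0; i < sig1; i++) {
            int d = Character.digit(s1.charAt(z1 + i), 10)
                  - Character.digit(s2.charAt(z2 + i), 10);
            if (d != 0) return d;
        }
        // Equal values: the run with more leading zeroes sorts later.
        return Integer.compare(e1 - b1, e2 - b2);
    }

    /** Case-insensitive alpha-numeric ordering that never calls substring. */
    static final Comparator<String> LOGICAL = (a, b) -> {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char c1 = a.charAt(i), c2 = b.charAt(j);
            if (Character.isDigit(c1) && Character.isDigit(c2)) {
                int e1 = endOfDigits(a, i), e2 = endOfDigits(b, j);
                int cmp = compareDigitRuns(a, i, e1, b, j, e2);
                if (cmp != 0) return cmp;
                i = e1;
                j = e2;
            } else {
                int cmp = Character.compare(Character.toLowerCase(c1),
                                            Character.toLowerCase(c2));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // Whichever string still has characters left sorts later.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    public static void main(String[] args) {
        String[] str = {"java 10", "Java 2", "java 1",
                        "Java 10 v 013", "java 10 v 13"};
        Arrays.sort(str, LOGICAL);
        System.out.println(Arrays.toString(str));
    }
}
```

The trade-off is that this hard-codes the alpha comparison, so a Collator cannot be plugged in as in the parameterized draft, and a digit compared against a non-digit is ordered by plain character value, which may differ from what the draft does in such cases.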

-- 
With kind regards,
Ivan Gerasimov


