JDK 8 code review request for initial unsigned integer arithmetic library support

Joe Darcy

14 Jan 2012

5:26 a.m.

Hello,

Polishing up some work I've had *almost* done for a long time, please review an initial take on providing library support for unsigned integer arithmetic:

4504839 Java libraries should provide support for unsigned integer arithmetic
4215269 Some Integer.toHexString(int) results cannot be decoded back to an int
6322074 Converting integers to string as if unsigned

http://cr.openjdk.java.net/~darcy/4504839.1/

For the first cut, I've favored keeping the code straightforward over trickier but potentially faster algorithms. Tests need to be written for the unsigned divide and remainder methods, but otherwise the regression tests are fairly extensive.

To avoid the overhead of having to deal with boxed objects, the unsigned functionality is implemented as static methods on Integer and Long, etc. as opposed to introducing new types like UnsignedInteger and UnsignedLong.

(This work is not meant to preclude other integer arithmetic enhancements from going into JDK 8, such as add/subtract/multiply/divide methods that throw exceptions on overflow.)

Thanks,

-Joe


Mike Duigou

14 Jan

5:51 a.m.

Really cool stuff Joe.

One initial note:

parseUnsigned*() returns a signed value that may not be able to hold the entire result. I would rather see a larger size be returned to be sure that subsequent operations don't have to be special cased for > MAX_VALUE. At minimum some documentation describing the interpretation of negative return values is needed.

Mike

Joe Darcy

6:58 p.m.

Hi Mike,

On 01/13/2012 09:51 PM, Mike Duigou wrote:

> Really cool stuff Joe.
>
> One initial note:
>
> parseUnsigned*() returns a signed value that may not be able to hold the entire result. I would rather see a larger size be returned to be sure that subsequent operations don't have to be special cased for > MAX_VALUE. At minimum some documentation describing the interpretation of negative return values is needed.

The intention of the fooUnsigned methods is that they interpret the 32 or 64 bit value as unsigned. So for example, for parseUnsigned, rather than recognizing strings representing values between -2^31 and (2^31)-1, the method recognizes values between 0 and (2^32)-1, mapping results between 2^31 and (2^32)-1 to what are usually thought of as the negative values. I'll take another pass over the new javadoc to try to make this clearer.

Thanks,

-Joe
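The mapping Joe describes can be seen in a few lines. This is a minimal sketch that assumes the method names as they appear in the released JDK 8 API (Integer.parseUnsignedInt, Integer.toUnsignedLong, Integer.toUnsignedString); the webrev under review may have differed, and the class name UnsignedIntMappingDemo is made up for illustration. Widening with toUnsignedLong is one way for a caller to get the "larger size" Mike asks about.

```java
class UnsignedIntMappingDemo {
    /** Parses an unsigned decimal string into the 32-bit pattern that represents it. */
    static int parse(String s) {
        return Integer.parseUnsignedInt(s);
    }

    /** Widens a 32-bit pattern to a long holding its unsigned magnitude. */
    static long widen(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    public static void main(String[] args) {
        int allOnes = parse("4294967295");   // 2^32 - 1
        int highBit = parse("2147483648");   // 2^31

        System.out.println(allOnes);                            // -1
        System.out.println(highBit == Integer.MIN_VALUE);       // true
        System.out.println(widen(allOnes));                     // 4294967295
        System.out.println(Integer.toUnsignedString(highBit));  // 2147483648
    }
}
```

Values from 2^31 up to (2^32)-1 come back as negative ints, and the toUnsigned* conversions recover the unsigned reading when one is needed.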


Eamonn McManus

15 Jan

5:53 p.m.

It's great to see this! The API looks reasonable to me.

> For the first cut, I've favored keeping the code straightforward over trickier but potentially faster algorithms.

The code looks clean and correct to me. But I think we could afford one or two cheap improvements to Long without diving into the full-blown Hacker's Delight algorithms:

In toUnsignedBigInteger(i) we could check whether i is nonnegative and use plain BigInteger.valueOf(i) in that case. Also, although the difference is sure to be unmeasurable, I think (int) (i >>> 32) would be better than (int) ((i >> 32) & 0xffffffff).

In parseUnsignedLong, we can avoid using BigInteger by parsing all but the last digit as a positive number and then adding in that digit:

    long first = parseLong(s.substring(0, len - 1), radix);
    int second = Character.digit(s.charAt(len - 1), radix);
    if (second < 0) {
        throw new NumberFormatException("Bad digit at end of " + s);
    }
    long result = first * radix + second;
    if (compareUnsigned(result, first) < 0) {
        throw new NumberFormatException(String.format("String value %s exceeds " +
                                                      "range of unsigned long.", s));
    }

By my measurements this speeds up the parsing of random decimal unsigned longs by about 2.5 times. Changing the existing code to move the limit constant to a field or to test for overflow using bi.bitLength() instead still leaves it about twice as slow.

In divideUnsigned, after eliminating negative divisors we could check whether the dividend is also nonnegative and use plain division in that case.

In remainderUnsigned, we could check whether both arguments are nonnegative and use plain % in that case, and we could also check whether the dividend is unsigned-less than the divisor, and return the dividend directly in that case.

Éamonn
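Éamonn's parseUnsignedLong idea above, fleshed out as a self-contained sketch. This is not the code from the webrev: the class and method names are hypothetical, and the overflow guard is deliberately stricter than the compareUnsigned(result, first) < 0 test in the message, because a wrapped product can still compare unsigned-greater than first (for radix 10, a prefix equal to Long.MAX_VALUE is one such case). It assumes the Long.divideUnsigned and Long.remainderUnsigned methods of the released JDK 8 API.

```java
class UnsignedParseSketch {
    /**
     * Parses s as an unsigned 64-bit value without BigInteger: all digits but
     * the last go through the signed parser, then the last digit is added in.
     */
    static long parseUnsignedLongSketch(String s, int radix) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("null or empty input");
        }
        if (s.charAt(0) == '-') {
            throw new NumberFormatException("Illegal leading minus sign: " + s);
        }
        int len = s.length();
        int second = Character.digit(s.charAt(len - 1), radix);
        if (second < 0) {
            throw new NumberFormatException("Bad digit at end of " + s);
        }
        if (len == 1) {
            return second;                       // a single digit always fits
        }
        // Nonnegative, because a leading '-' was rejected above.
        long first = Long.parseLong(s.substring(0, len - 1), radix);

        // Largest prefix and last digit for which first * radix + second
        // still fits in 64 unsigned bits; -1L is 2^64 - 1 read as unsigned.
        long maxPrefix = Long.divideUnsigned(-1L, radix);
        long maxLast = Long.remainderUnsigned(-1L, radix);
        if (first > maxPrefix || (first == maxPrefix && second > maxLast)) {
            throw new NumberFormatException(
                String.format("String value %s exceeds range of unsigned long.", s));
        }
        return first * radix + second;           // may wrap into the negative longs
    }

    public static void main(String[] args) {
        System.out.println(parseUnsignedLongSketch("18446744073709551615", 10)); // -1
        System.out.println(parseUnsignedLongSketch("ffffffffffffffff", 16));     // -1
    }
}
```

Computing the two limits per call keeps the sketch short; a tuned version would presumably cache them per radix.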

Joe Darcy

18 Jan

2:54 a.m.

Hi Eamonn,

On 01/15/2012 09:53 AM, Eamonn McManus wrote:

> It's great to see this!

I agree :-)

I've posted a revised webrev at

http://cr.openjdk.java.net/~darcy/4504839.2

More detailed responses inline.

> The API looks reasonable to me.
>
>> For the first cut, I've favored keeping the code straightforward over trickier but potentially faster algorithms.
>
> The code looks clean and correct to me. But I think we could afford one or two cheap improvements to Long without diving into the full-blown Hacker's Delight algorithms:
>
> In toUnsignedBigInteger(i) we could check whether i is nonnegative and use plain BigInteger.valueOf(i) in that case. Also, although the difference is sure to be unmeasurable, I think (int) (i >>> 32) would be better than (int) ((i >> 32) & 0xffffffff).

Good points; changed.
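A small check of why the two expressions for the upper half are interchangeable. This is only an illustration with made-up names (UpperHalfDemo and its methods): the literal 0xffffffff is an int with value -1, so it widens to -1L and the mask leaves the long unchanged, and the narrowing cast to int keeps the same 32 bits whether the shift was arithmetic or logical.

```java
class UpperHalfDemo {
    /** Upper 32 bits via arithmetic shift and mask, as in the first webrev. */
    static int upperArithmetic(long i) {
        return (int) ((i >> 32) & 0xffffffff);
    }

    /** Upper 32 bits via logical shift, as suggested in the review. */
    static int upperLogical(long i) {
        return (int) (i >>> 32);
    }

    /** True if both forms agree on every sample. */
    static boolean agreeOn(long... samples) {
        for (long s : samples) {
            if (upperArithmetic(s) != upperLogical(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOn(0L, 1L, -1L, Long.MIN_VALUE, Long.MAX_VALUE,
                                   0x123456789abcdef0L));               // true
        System.out.println(Integer.toHexString(
                upperLogical(0x123456789abcdef0L)));                    // 12345678
    }
}
```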

> In parseUnsignedLong, we can avoid using BigInteger by parsing all but the last digit as a positive number and then adding in that digit:
>
>     long first = parseLong(s.substring(0, len - 1), radix);
>     int second = Character.digit(s.charAt(len - 1), radix);
>     if (second < 0) {
>         throw new NumberFormatException("Bad digit at end of " + s);
>     }
>     long result = first * radix + second;
>     if (compareUnsigned(result, first) < 0) {
>         throw new NumberFormatException(String.format("String value %s exceeds " +
>                                                       "range of unsigned long.", s));
>     }
>
> By my measurements this speeds up the parsing of random decimal unsigned longs by about 2.5 times. Changing the existing code to move the limit constant to a field or to test for overflow using bi.bitLength() instead still leaves it about twice as slow.

Changed.

Also from some off-list comments from Mike, I've modified the first sentence of the parseUnsignedLong methods to explicitly mention the "long" type; this is consistent with the phrasing of the signed parseLong methods in java.lang.Long.

> In divideUnsigned, after eliminating negative divisors we could check whether the dividend is also nonnegative and use plain division in that case.

Changed.

> In remainderUnsigned, we could check whether both arguments are nonnegative and use plain % in that case, and we could also check whether the dividend is unsigned-less than the divisor, and return the dividend directly in that case.

Changed.

I've also added test cases for the unsigned divide and remainder methods.

Thanks again,

-Joe
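The divideUnsigned and remainderUnsigned shortcuts discussed above can be sketched as follows. The class and method names are hypothetical and this is not the webrev code; it only assumes Long.compareUnsigned from the released JDK 8 API and falls back to BigInteger for the cases the fast paths do not cover. In the remainder case the shortcut applies when the dividend is unsigned-less than the divisor, since the dividend is then already the remainder.

```java
import java.math.BigInteger;

class UnsignedDivSketch {
    /** Reads the 64 bits of i as an unsigned value. */
    static BigInteger toUnsignedBigInteger(long i) {
        return i >= 0L ? BigInteger.valueOf(i)
                       : BigInteger.valueOf(i & Long.MAX_VALUE).setBit(63);
    }

    static long divideUnsignedSketch(long dividend, long divisor) {
        if (divisor < 0L) {
            // Divisor is at least 2^63 as unsigned, so the quotient is 0 or 1.
            return Long.compareUnsigned(dividend, divisor) < 0 ? 0L : 1L;
        }
        if (dividend >= 0L) {
            // Both operands are nonnegative: signed and unsigned division agree.
            return dividend / divisor;
        }
        return toUnsignedBigInteger(dividend)
                .divide(toUnsignedBigInteger(divisor)).longValue();
    }

    static long remainderUnsignedSketch(long dividend, long divisor) {
        if (dividend >= 0L && divisor > 0L) {
            return dividend % divisor;           // plain % is already correct
        }
        if (Long.compareUnsigned(dividend, divisor) < 0) {
            return dividend;                     // dividend is smaller: it is the remainder
        }
        return toUnsignedBigInteger(dividend)
                .remainder(toUnsignedBigInteger(divisor)).longValue();
    }

    public static void main(String[] args) {
        System.out.println(divideUnsignedSketch(-1L, 2L));     // 9223372036854775807
        System.out.println(remainderUnsignedSketch(-1L, 10L)); // 5
    }
}
```

A zero divisor still ends in an ArithmeticException on every path, either from the primitive operator or from BigInteger.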


Eamonn McManus

5:08 a.m.

Hi Joe,

That looks great to me (emcmanus). One thing I noticed is that the behaviour is not explicitly specified when parseUnsignedLong is given a null String reference. But I see that is also true of the existing parseLong and valueOf(String) and decode(String), so perhaps there should be a separate bug to update the spec there. The phrase "If the string cannot be parsed as a long" does not cover this case as obviously as it might.

Cheers,
Éamonn

Joe Darcy

5:41 a.m.

Hi Eamonn,

The body javadoc text of the two-argument parseUnsignedLong method does state

     * An exception of type {@code NumberFormatException} is
     * thrown if any of the following situations occurs:
     * <ul>
     * <li>The first argument is {@code null} or is a string of
     * length zero.
    ...

However, it is true that the method does not have an explicit @throws clause detailing this condition and that, somewhat unconventionally, an NPE is *not* thrown for a nonsensical null input. The behavior of the one-argument version of parseUnsignedLong is defined in terms of the two-argument version, so strictly from a specification perspective, I think the existing text is okay as-is even if suboptimal.

Thanks for the reviews,

-Joe
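The null-handling behaviour discussed above is easy to observe. A minimal sketch, assuming the parse methods as they exist in the released JDK 8 API; the helper thrownBy and the class name are made up for illustration.

```java
class NullParseDemo {
    /** Returns the simple name of the exception thrown by r, or "none". */
    static String thrownBy(Runnable r) {
        try {
            r.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // A null argument is reported as a format problem, not as an NPE.
        System.out.println(thrownBy(() -> Long.parseLong((String) null)));
        System.out.println(thrownBy(() -> Long.parseUnsignedLong((String) null, 10)));
        // The zero-length string falls under the same javadoc bullet.
        System.out.println(thrownBy(() -> Long.parseUnsignedLong("", 10)));
    }
}
```

All three calls are expected to report NumberFormatException, matching the javadoc bullet Joe quotes.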

Ulf Zibis

19 Jan

3:52 a.m.

On 18.01.2012 03:54, Joe Darcy wrote:

> I've posted a revised webrev at
>
> http://cr.openjdk.java.net/~darcy/4504839.2

Instead of <code>'\u0030'</code> you can use {@code '\u005Cu0030'}

Byte:
=====
 459     public static int toUnsignedInt(byte x) {
 460         return ((int) x) & 0xff;
 461     }
This should be good enough (similar at Short, Integer):
 459     public static int toUnsignedInt(byte x) {
 460         return x & 0xff;
 461     }
(This notation is regularly used in sun.nio.cs coders.)

missing:
    public static short toUnsignedShort(byte x)
superfluous:
    public static int toUnsignedInt(byte x)
    public static long toUnsignedLong(byte x)
(similar at Short)
one can use:
    int i = toUnsignedShort(x)
    long l = toUnsignedShort(x)
(similar at Short)

Integer:
========
 623      * <li>The value represented by the string is larger than the
 624      * largest unsigned {@code int}, 2^32-1.
If you format {@code int}, then you are speaking about the Java type int, which is always signed, never unsigned. IMO you should better write "unsigned 32-bit integer". (similar at Long)

 598      * Parses the string argument as an unsigned integer in the radix
 599      * specified by the second argument.
IMHO, there should be a note about what happens on values above 2^31 - 1.

 672      * Parses the string argument as an unsigned decimal integer. The
 673      * characters in the string must all be decimal digits, except
Better, like lines 598ff, or contrariwise (similar at Long):
 672      * Parses the string argument as an unsigned decimal integer.
 673      *
 674      * The characters in the string must all be decimal digits, except

Long:
=====
What about:
    private static final BigInteger BEYOND_UNSIGNED_LONG = BigInteger.valueOf(1).shiftLeft(64);

    private static BigInteger toUnsignedBigInteger(long i) {
        BigInteger result = BigInteger.valueOf(i);
        if (i < 0L)
            result = result.add(BEYOND_UNSIGNED_LONG);
        return result;
    }

Instead of
    private static BigInteger toUnsignedBigInteger(long i)
in class BigInteger we could more generally have:
    public static BigInteger unsignedValueOf(long i)

 610      * Parses the string argument as an unsigned {@code long} in the
 611      * radix specified by the second argument.
IMHO, there should be a note about what happens on values above 2^63 - 1.

-Ulf
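Ulf's point about the cast in toUnsignedInt(byte) can be checked exhaustively, since there are only 256 byte values. This is an illustrative sketch with made-up names; it assumes Byte.toUnsignedInt as it exists in the released JDK 8 API. The explicit (int) cast is redundant because the byte operand of & is promoted to int before the operation in either form.

```java
class ByteMaskDemo {
    /** The form in the webrev, with the explicit widening cast. */
    static int withCast(byte x) {
        return ((int) x) & 0xff;
    }

    /** The shorter form from the review, relying on implicit promotion. */
    static int withoutCast(byte x) {
        return x & 0xff;
    }

    /** True if both forms agree with Byte.toUnsignedInt for every byte value. */
    static boolean allAgree() {
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            byte b = (byte) v;
            if (withCast(b) != withoutCast(b) || withCast(b) != Byte.toUnsignedInt(b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allAgree());              // true
        System.out.println(withoutCast((byte) -1));  // 255
    }
}
```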

Eamonn McManus

6:43 a.m.

Ulf Zibis writes:

> What about:
>
>     private static final BigInteger BEYOND_UNSIGNED_LONG = BigInteger.valueOf(1).shiftLeft(64);
>
>     private static BigInteger toUnsignedBigInteger(long i) {
>         BigInteger result = BigInteger.valueOf(i);
>         if (i < 0L)
>             result = result.add(BEYOND_UNSIGNED_LONG);
>         return result;
>     }

That's a nice idea! But the problem is that it would mean that BigInteger.class would be loaded as soon as Long.class is, which I think is undesirable. However it does make me think that we could change this:

    if (i >= 0L)
        return BigInteger.valueOf(i);
    else {
        int upper = (int) (i >>> 32);
        int lower = (int) i;

        // return (upper << 32) + lower
        return (BigInteger.valueOf(Integer.toUnsignedLong(upper))).shiftLeft(32).
            add(BigInteger.valueOf(Integer.toUnsignedLong(lower)));
    }

to this:

    if (i >= 0L) {
        return BigInteger.valueOf(i);
    } else {
        return BigInteger.valueOf(i & Long.MAX_VALUE).setBit(63);
    }

Éamonn
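The three toUnsignedBigInteger variants from this exchange, placed side by side so they can be compared. This is a sketch in a standalone class with hypothetical method names (viaAdd, viaHalves, viaSetBit), not the code in java.lang.Long. The static BigInteger constant is harmless here; the class-loading concern Éamonn raises applies to putting such a field in Long itself.

```java
import java.math.BigInteger;

class ToUnsignedBigIntegerSketch {
    private static final BigInteger TWO_TO_64 = BigInteger.ONE.shiftLeft(64);

    /** Ulf's variant: add 2^64 to negative inputs. */
    static BigInteger viaAdd(long i) {
        BigInteger result = BigInteger.valueOf(i);
        return i < 0L ? result.add(TWO_TO_64) : result;
    }

    /** The webrev variant: combine the two unsigned 32-bit halves. */
    static BigInteger viaHalves(long i) {
        if (i >= 0L) {
            return BigInteger.valueOf(i);
        }
        int upper = (int) (i >>> 32);
        int lower = (int) i;
        return BigInteger.valueOf(Integer.toUnsignedLong(upper)).shiftLeft(32)
                .add(BigInteger.valueOf(Integer.toUnsignedLong(lower)));
    }

    /** Éamonn's variant: clear the sign bit, convert, then set bit 63. */
    static BigInteger viaSetBit(long i) {
        if (i >= 0L) {
            return BigInteger.valueOf(i);
        }
        return BigInteger.valueOf(i & Long.MAX_VALUE).setBit(63);
    }

    public static void main(String[] args) {
        System.out.println(viaAdd(-1L));              // 18446744073709551615
        System.out.println(viaHalves(-1L));           // 18446744073709551615
        System.out.println(viaSetBit(Long.MIN_VALUE)); // 9223372036854775808
    }
}
```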

Ulf Zibis

4:05 p.m.

On 19.01.2012 07:43, Eamonn McManus wrote:

> Ulf Zibis writes:
>
>> What about:
>>
>>     private static final BigInteger BEYOND_UNSIGNED_LONG = BigInteger.valueOf(1).shiftLeft(64);
>>
>>     private static BigInteger toUnsignedBigInteger(long i) {
>>         BigInteger result = BigInteger.valueOf(i);
>>         if (i < 0L)
>>             result = result.add(BEYOND_UNSIGNED_LONG);
>>         return result;
>>     }
>
> That's a nice idea! But the problem is that it would mean that BigInteger.class would be loaded as soon as Long.class is, which I think is undesirable.

Thanks for the critique. I didn't see that. The problem could easily be avoided if the method toUnsignedBigInteger(long i) were moved to class BigInteger as unsignedValueOf(long i), as I additionally noted in my last post.

> However it does make me think that we could change...to this:
>
>     if (i >= 0L) {
>         return BigInteger.valueOf(i);
>     } else {
>         return BigInteger.valueOf(i & Long.MAX_VALUE).setBit(63);
>     }

Another nice idea! But again, moving the entire method to BigInteger would additionally avoid having to clown around with BigInteger's available public APIs. Having the method in BigInteger would allow elegant direct access to the private value fields.

-Ulf

Joseph Darcy

21 Jan

12:35 a.m.

On 1/19/2012 8:05 AM, Ulf Zibis wrote:

> On 19.01.2012 07:43, Eamonn McManus wrote:
>
>> Ulf Zibis writes:
>>
>>> What about:
>>>
>>>     private static final BigInteger BEYOND_UNSIGNED_LONG = BigInteger.valueOf(1).shiftLeft(64);
>>>
>>>     private static BigInteger toUnsignedBigInteger(long i) {
>>>         BigInteger result = BigInteger.valueOf(i);
>>>         if (i < 0L)
>>>             result = result.add(BEYOND_UNSIGNED_LONG);
>>>         return result;
>>>     }
>>
>> That's a nice idea! But the problem is that it would mean that BigInteger.class would be loaded as soon as Long.class is, which I think is undesirable.
>
> Thanks for the critique. I didn't see that. The problem could easily be avoided if the method toUnsignedBigInteger(long i) were moved to class BigInteger as unsignedValueOf(long i), as I additionally noted in my last post.
>
>> However it does make me think that we could change...to this:
>>
>>     if (i >= 0L) {
>>         return BigInteger.valueOf(i);
>>     } else {
>>         return BigInteger.valueOf(i & Long.MAX_VALUE).setBit(63);
>>     }
>
> Another nice idea! But again, moving the entire method to BigInteger would additionally avoid having to clown around with BigInteger's available public APIs. Having the method in BigInteger would allow elegant direct access to the private value fields.

If the operation in question starts becoming a bottleneck, these alternate implementations can be explored. In the meantime, I plan to stick with the straightforward code and not set up the infrastructure needed to get at BigInteger internals.

Thanks,

-Joe

Ulf Zibis

1:53 a.m.

On 21.01.2012 01:35, Joseph Darcy wrote:

> On 1/19/2012 8:05 AM, Ulf Zibis wrote:
>
>> But again, moving the entire method to BigInteger would additionally avoid having to clown around with BigInteger's available public APIs. Having the method in BigInteger would allow elegant direct access to the private value fields.
>
> If the operation in question starts becoming a bottleneck, these alternate implementations can be explored.

But the alternatives for potentially faster algorithms would be limited if you stick BigInteger toUnsignedBigInteger(long i) to class Long.

-Ulf

Eamonn McManus

2:32 a.m.

On 20 January 2012 17:53, Ulf Zibis <Ulf.Zibis@gmx.de> wrote:

> On 21.01.2012 01:35, Joseph Darcy wrote:
>
>> On 1/19/2012 8:05 AM, Ulf Zibis wrote:
>>
>>> But again, moving the entire method to BigInteger would additionally avoid having to clown around with BigInteger's available public APIs. Having the method in BigInteger would allow elegant direct access to the private value fields.
>>
>> If the operation in question starts becoming a bottleneck, these alternate implementations can be explored.
>
> But the alternatives for potentially faster algorithms would be limited if you stick BigInteger toUnsignedBigInteger(long i) to class Long.

There's no reason Long and BigInteger can't conspire to achieve this without changing their APIs, if it proves interesting. It's not completely straightforward since they are in different packages, but perfectly possible using a sun.* intermediary.

Éamonn

Joseph Darcy

12:31 a.m.

On 1/18/2012 7:52 PM, Ulf Zibis wrote:

> On 18.01.2012 03:54, Joe Darcy wrote:
>
>> I've posted a revised webrev at
>>
>> http://cr.openjdk.java.net/~darcy/4504839.2
>
> Instead of <code>'\u0030'</code> you can use {@code '\u005Cu0030'}

That is a fine cleanup, but I'll do a bulk conversion of all the instances of that pattern later on under another bug.

> Byte:
> =====
>  459     public static int toUnsignedInt(byte x) {
>  460         return ((int) x) & 0xff;
>  461     }
> This should be good enough (similar at Short, Integer):
>  459     public static int toUnsignedInt(byte x) {
>  460         return x & 0xff;
>  461     }
> (This notation is regularly used in sun.nio.cs coders.)

I think the current code more explicitly indicates the intention of the method's operation to more casual readers less familiar with the details of primitive widening conversions, etc.

> missing:
>     public static short toUnsignedShort(byte x)

That omission was intentional. The language and VM only really support arithmetic on int and long values, not short or byte, so I am only providing methods to widen to int and long.
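Joe's remark about arithmetic on byte and short can be illustrated with a short sketch. The class and method names are made up; Byte.toUnsignedInt is assumed as in the released JDK 8 API. In Java, byte operands of + are promoted to int before the addition, so the result of arithmetic on bytes is an int either way, and narrowing back to short takes an explicit cast.

```java
class WideningDemo {
    /** Adds two bytes read as unsigned values; the sum is computed in int. */
    static int addUnsigned(byte a, byte b) {
        return Byte.toUnsignedInt(a) + Byte.toUnsignedInt(b);
    }

    /** Narrowing the widened value back to short requires an explicit cast. */
    static short toUnsignedShort(byte b) {
        return (short) Byte.toUnsignedInt(b);
    }

    public static void main(String[] args) {
        byte a = (byte) 200;   // bit pattern 0xC8, i.e. -56 when read as signed
        byte b = (byte) 100;

        System.out.println(a + b);              // 44, signed bytes promoted to int
        System.out.println(addUnsigned(a, b));  // 300
        System.out.println(toUnsignedShort(a)); // 200
    }
}
```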

> superfluous:
>     public static int toUnsignedInt(byte x)
>     public static long toUnsignedLong(byte x)
> (similar at Short)
> one can use:
>     int i = toUnsignedShort(x)
>     long l = toUnsignedShort(x)
> (similar at Short)
>
> Integer:
> ========
>  623      * <li>The value represented by the string is larger than the
>  624      * largest unsigned {@code int}, 2^32-1.
> If you format {@code int}, then you are speaking about the Java type int, which is always signed, never unsigned. IMO you should better write "unsigned 32-bit integer". (similar at Long)

Due to similar feedback, I've made various clarifications to the javadoc, but the basic situation is that the "fooUnsigned" methods interpret the bits as unsigned values, as already done in toHexString and related methods.

Thanks,

-Joe
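The connection to toHexString mentioned above, and to bug 4215269 from the original request, can be shown with a round trip. A minimal sketch with made-up names, assuming Integer.parseUnsignedInt(String, int) as in the released JDK 8 API: toHexString already prints the bits as an unsigned value, the signed parser rejects the result for negative inputs, and the unsigned parser accepts it.

```java
class HexRoundTripDemo {
    /** Prints value as unsigned hex and parses it back with the unsigned parser. */
    static int roundTrip(int value) {
        return Integer.parseUnsignedInt(Integer.toHexString(value), 16);
    }

    /** True if the signed parser rejects the hex form of value. */
    static boolean signedParseFails(int value) {
        try {
            Integer.parseInt(Integer.toHexString(value), 16);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(-1));  // ffffffff
        System.out.println(signedParseFails(-1));     // true
        System.out.println(roundTrip(-1));            // -1
    }
}
```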

Ulf Zibis

1:39 a.m.

Thanks for your feedback. Am 21.01.2012 01:31, schrieb Joseph Darcy:

...

On 1/18/2012 7:52 PM, Ulf Zibis wrote:

...
On 18.01.2012 03:54, Joe Darcy wrote:

...
I've posted a revised webrev at

http://cr.openjdk.java.net/~darcy/4504839.2

Instead of <code>'\u0030'</code> you can use {@code '\u005Cu0030'}

That is a fine cleanup, but I'll do a bulk conversion of all the instances of that pattern later on under another bug.

I only meant the new lines, where you have a mixture of {@code...} and <code>...</code>. Then IMHO it is better to use <code>...</code> exclusively.

...

...
Byte:
=====
    public static int toUnsignedInt(byte x) {
        return ((int) x) & 0xff;
    }
This should be good enough (similar at Short, Integer):
    public static int toUnsignedInt(byte x) {
        return x & 0xff;
    }
(This notation is regularly used in sun.nio.cs coders.)

I think the current code more explicitly indicates the intention of the method's operation to more casual readers less familiar with the details of primitive widening conversions, etc.

Aha! But I think people who deal with unsigned bits know that, and otherwise they puzzle over what the heck the miracle behind the cast is, like me ;-)

...

...
missing: public static short toUnsignedShort(byte x)

That omission was intentional. The language and VM only really support arithmetic on int and long values, not short or byte, so I am only providing methods to widen to int and long.

I think this is the VM's business. Otherwise, people who intentionally use the short type would have to cast: short s = (short) toUnsignedInt(x);
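A minimal sketch of the trade-off being argued here, for illustration only. toUnsignedShort is a hypothetical helper that is not in the proposed API; Byte.toUnsignedInt is assumed under the name it has in the released JDK 8, and the class name is made up.

```java
// Hypothetical sketch: widening a byte to short with and without a dedicated helper.
final class UnsignedShortSketch {
    // Hypothetical helper of the kind requested in this thread; not a JDK method.
    static short toUnsignedShort(byte x) {
        return (short) (x & 0xff);
    }

    // What callers write without the helper: widen to int, then narrow with a cast.
    static short viaIntAndCast(byte x) {
        return (short) Byte.toUnsignedInt(x);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xf0; // -16 as a signed byte
        System.out.println(viaIntAndCast(b));   // 240
        System.out.println(toUnsignedShort(b)); // 240

        // Arithmetic on short operands is carried out in int, so a cast is
        // needed anyway as soon as the result is stored back into a short.
        short s = 100;
        // s = s + s;       // does not compile: s + s has type int
        s = (short) (s + s);
        System.out.println(s); // 200
    }
}
```

The cast is always safe here because an unsigned byte value (0 to 255) fits in a short; the disagreement is only about who should write it.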

...

...
superfluous:
    public static int toUnsignedInt(byte x)
    public static long toUnsignedLong(byte x)
(similar at Short)
one can use:
    int i = toUnsignedShort(x)
    long l = toUnsignedShort(x)
(similar at Short)

Integer:
========
    * <li>The value represented by the string is larger than the
    * largest unsigned {@code int}, 2^32-1.
If you format {@code int}, then you are speaking about the Java type int, which is always signed, never unsigned. IMO you should instead write "unsigned 32-bit integer". (similar at Long)

Due to similar feedback, I've made various clarifications to the javadoc, but the basic situation is that the "fooUnsigned" methods interpret the bits as unsigned values, as already done in toHexString and related methods.

There is a difference: toHexString means "transform _to_ a hex string", but toUnsignedInt means "transform _as/from_ unsigned". -Ulf

Ulf Zibis

20 Jan

3:12 p.m.

A little different approach...

I worry about the wording of e.g. toUnsignedInt(x). At first look, it claims to return an unsigned integer, which doesn't really exist in Java for now.
1. Better: unsignedIntValueOf(x)
2. We could have a naming problem if unsigned integers were introduced to Java in the future. Then e.g. toUnsignedInt(x) could have a very different meaning.

Instead of e.g.
    int Byte.unsignedIntValueOf(byte x)
aka
    int Byte.toUnsignedInt(byte x)
I would vote for
    int Integer.unsignedValueOf(byte x)
At a minimum, we only need:
    short Short.unsignedValueOf(byte x)
    int Integer.unsignedValueOf(short x)
    long Long.unsignedValueOf(int x)
    BigInteger BigInteger.unsignedValueOf(long x)
-Ulf

On 14.01.2012 06:26, Joe Darcy wrote:

...

Hello,

Polishing up some work I've had *almost* done for a long time, please review an initial take on providing library support for unsigned integer arithmetic:

4504839 Java libraries should provide support for unsigned integer arithmetic 4215269 Some Integer.toHexString(int) results cannot be decoded back to an int 6322074 Converting integers to string as if unsigned

http://cr.openjdk.java.net/~darcy/4504839.1/

Ulf Zibis

8:27 p.m.

On 20.01.2012 16:12, Ulf Zibis wrote:

...

A little different approach...

Instead of e.g. int Byte.unsignedIntValueOf(byte x) aka int Byte.toUnsignedInt(byte x), I would vote for int Integer.unsignedValueOf(byte x)

Alternative: int Integer.valueAsUnsigned(byte x) -Ulf
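For readers trying to picture the minimal set of conversions proposed above, here is a rough sketch. None of these methods exist in the JDK under these names; they are hypothetical stand-ins collected in one made-up class, since new static methods cannot be added to Short, Integer, Long, or BigInteger from outside the JDK.

```java
import java.math.BigInteger;

// Hypothetical sketch of the "one step wider" unsignedValueOf family.
final class UnsignedValueOfSketch {
    // Stand-in for the proposed short Short.unsignedValueOf(byte x).
    static short shortUnsignedValueOf(byte x) {
        return (short) (x & 0xff);
    }

    // Stand-in for the proposed int Integer.unsignedValueOf(short x).
    static int intUnsignedValueOf(short x) {
        return x & 0xffff;
    }

    // Stand-in for the proposed long Long.unsignedValueOf(int x).
    static long longUnsignedValueOf(int x) {
        return x & 0xffffffffL;
    }

    // Stand-in for the proposed BigInteger BigInteger.unsignedValueOf(long x).
    // A negative long has its top bit set, so its unsigned value is x + 2^64.
    static BigInteger bigUnsignedValueOf(long x) {
        BigInteger v = BigInteger.valueOf(x);
        return x >= 0 ? v : v.add(BigInteger.ONE.shiftLeft(64));
    }

    public static void main(String[] args) {
        System.out.println(shortUnsignedValueOf((byte) -1)); // 255
        System.out.println(intUnsignedValueOf((short) -1));  // 65535
        System.out.println(longUnsignedValueOf(-1));         // 4294967295
        System.out.println(bigUnsignedValueOf(-1L));         // 18446744073709551615
    }
}
```

With only these four, widening a byte to long takes a chain of calls, which is the convenience that the toUnsignedInt and toUnsignedLong overloads on each type are meant to provide.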

